Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-03-19 Thread Ihor Radchenko
Max Nikulin  writes:

> It is up to you to choose at which level you prefer to optimize the
> code. And it is only my opinion (I do not insist) that the benefits from
> changes in low-level code might be much more significant. I like the
> idea of markers, but their current implementation is a source of pain.
>
>> (note that Nicolas did not use
>> markers to store boundaries of org elements).
>
> E.g. export-related code certainly does need markers. You experienced
> enough problems with attempts to properly invalidate the cache when the
> lower level is not supposed to provide appropriate facilities.

I understand your argument. However, I feel discouraged from contributing
to Emacs core because most Org users will not benefit from such a
contribution for a long time - not until the next several major versions
of Emacs are released. So, I currently prefer to contribute
backwards-compatible high-level code and leave the Emacs core for the future.

Best,
Ihor




Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-03-03 Thread Max Nikulin

On 02/03/2022 22:12, Ihor Radchenko wrote:

Max Nikulin writes:

I tend to agree after reading the code again.
I tried to play around with that marker loop. It seems that the loop
should not be mindlessly disabled, but it can be sufficient to check
only a small number of markers at the front of the marker list. The cached
temporary markers are always added at the front of the list.


I did not try to say that the loop over markers may just be thrown away.
By the way, for a sequential scan (with no backward searches) a single
marker might work reasonably well.


Some kind of index for fast mapping between bytes and positions should
be maintained at the buffer level. I hope that, when properly designed,
such a structure can minimize the amount of recalculation on each edit.
I mean a hierarchical structure of buffer fragments, where markers keep
relative offsets from the beginning of the fragment they belong to. The
hierarchy of fragments is enough to provide an initial estimate of the
position for a byte index. Only markers within the fragment that changed
need an immediate update.
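
To illustrate the idea (purely a toy sketch in Elisp, not how the Emacs C
core stores buffers; the my/ function name is made up): each fragment
records its own size in bytes and in characters, an estimate for a byte
position only walks fragment summaries, and an edit only needs to update
the fragment it touches.

  ;; Toy model: the buffer as a sequence of fragments, each knowing its
  ;; byte length and char length.  Estimating the char position for a
  ;; byte position walks fragment summaries instead of single characters.
  (defun my/fragment-charpos (fragments bytepos)
    "Estimate the char position of BYTEPOS from FRAGMENTS, a sequence of (BYTE-LEN . CHAR-LEN) pairs."
    (let ((byte 0) (char 0))
      (catch 'done
        (seq-doseq (frag fragments)
          (if (<= (+ byte (car frag)) bytepos)
              (setq byte (+ byte (car frag))
                    char (+ char (cdr frag)))
            ;; Only this fragment needs a precise per-character scan.
            (throw 'done (list :char-lower-bound char
                               :char-upper-bound (+ char (cdr frag))))))
        (list :char-lower-bound char :char-upper-bound char))))

  ;; Example: three fragments, the middle one containing multibyte text.
  (my/fragment-charpos [(10 . 10) (20 . 12) (10 . 10)] 25)
  ;; => (:char-lower-bound 10 :char-upper-bound 22)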



I am currently using a custom version of org-ql utilising the new
element cache. It is substantially faster compared to the current
org-refile-get-targets. The org-ql version runs in <2 seconds at worst
when calculating all refile targets from scratch, while
org-refile-get-targets takes over 10 sec. The org-ql version shows no
noticeable latency when there is an extra text query to narrow down the
refile targets. So, it is certainly possible to improve the performance
just using the high-level org-element cache API + regexp search, without
markers.


It is up to you to choose at which level you prefer to optimize the
code. And it is only my opinion (I do not insist) that the benefits from
changes in low-level code might be much more significant. I like the
idea of markers, but their current implementation is a source of pain.



(note that Nicolas did not use
markers to store boundaries of org elements).


E.g. export-related code certainly does need markers. You experienced
enough problems with attempts to properly invalidate the cache when the
lower level is not supposed to provide appropriate facilities.






Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-03-02 Thread Ihor Radchenko
Max Nikulin  writes:

> On 27/02/2022 13:43, Ihor Radchenko wrote:
>> 
>> Now, I did an extended profiling of what is happening using perf:
>> 
>>   6.20%   [.] buf_bytepos_to_charpos
>
> Maybe I am interpreting such results wrongly, but it does not look like
> a bottleneck. Anyway, thank you very much for such efforts; however, it
> is unlikely that I will join the profiling in the near future.

The perf data I provided is a bit tricky. I recorded statistics over the
whole Emacs session and used a fairly small number of iterations in your
benchmark code.

Now, I repeated the testing, attaching perf to Emacs only during the
benchmark execution:

With refile cache and markers:
22.82%  emacs-29.0.50.1  emacs-29.0.50.1  [.] buf_bytepos_to_charpos
16.68%  emacs-29.0.50.1  emacs-29.0.50.1  [.] rpl_re_search_2
 8.02%  emacs-29.0.50.1  emacs-29.0.50.1  [.] re_match_2_internal
 6.93%  emacs-29.0.50.1  emacs-29.0.50.1  [.] Fmemq
 4.05%  emacs-29.0.50.1  emacs-29.0.50.1  [.] allocate_vectorlike
 1.88%  emacs-29.0.50.1  emacs-29.0.50.1  [.] mark_object

Without refile cache:
17.25%  emacs-29.0.50.1  emacs-29.0.50.1  [.] rpl_re_search_2
15.84%  emacs-29.0.50.1  emacs-29.0.50.1  [.] buf_bytepos_to_charpos
 8.89%  emacs-29.0.50.1  emacs-29.0.50.1  [.] re_match_2_internal
 8.00%  emacs-29.0.50.1  emacs-29.0.50.1  [.] Fmemq
 4.35%  emacs-29.0.50.1  emacs-29.0.50.1  [.] allocate_vectorlike
 2.01%  emacs-29.0.50.1  emacs-29.0.50.1  [.] mark_object

The percentages should be adjusted for the larger execution time of the
first dataset, but otherwise it is clear that buf_bytepos_to_charpos
dominates the time delta.

>> I am not sure if I understand the code correctly, but that loop
>> clearly scales with the number of markers.
>
> I may be terribly wrong, but it looks like an optimization attempt that
> may actually ruin performance. My guess is the following. Due to
> multibyte characters, a position in the buffer counted in characters may
> significantly differ from the index into the byte sequence. Since markers
> store both bytepos and charpos, they are used (when available) to
> narrow down the initial estimation interval [0, buffer size) to the
> nearest existing markers. The code below even creates temporary markers
> to make the next call of the function faster.

I tend to agree after reading the code again.
I tried to play around with that marker loop. It seems that the loop
should not be mindlessly disabled, but it can be sufficient to check
only a small number of markers at the front of the marker list. The cached
temporary markers are always added at the front of the list.

Limiting the number of checked markers to 10, I got the following
result:

With threshold and refile cache:
| 9.5.2                  |       time (s) | GCs |    GC time (s) |
| nm-tst                 |   28.060029337 |   4 | 1.842760862996 |
| org-refile-get-targets | 3.244561543997 |   0 |            0.0 |
| nm-tst                 | 33.64825913704 |   4 | 1.230431054003 |
| org-refile-cache-clear |    0.034879062 |   0 |            0.0 |
| nm-tst                 |   23.974124596 |   5 | 1.429148814996 |

Markers add roughly +5.6 sec.

Original Emacs code and refile cache:
| 9.5.2                  |       time (s) | GCs |    GC time (s) |
| nm-tst                 |   29.494383528 |   4 | 3.036850853002 |
| org-refile-get-targets |    3.635947646 |   1 | 0.454247973002 |
| nm-tst                 |   36.537926593 |   4 | 1.129757634998 |
| org-refile-cache-clear |   0.0096653649 |   0 |            0.0 |
| nm-tst                 |   23.283457105 |   4 | 1.053649649997 |

Markers add roughly +7 sec.
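
For reference, these rows look like benchmark-run style output (elapsed
seconds, number of garbage collections, GC seconds). A rough sketch of how
such a table could be produced; nm-tst stands in for Max's benchmark code,
which is not shown in this thread:

  ;; Hypothetical harness; `nm-tst' is a placeholder for the benchmark
  ;; used earlier in the thread.  `benchmark-run' returns
  ;; (ELAPSED-SECONDS GC-COUNT GC-SECONDS).
  (dolist (form '((nm-tst)
                  (org-refile-get-targets)
                  (nm-tst)
                  (org-refile-cache-clear)
                  (nm-tst)))
    (pcase-let ((`(,time ,gcs ,gc-time) (benchmark-run 1 (eval form))))
      (message "| %-24S | %s | %s | %s |" (car form) time gcs gc-time)))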

The improvement is there, though markers still somehow come into play. I
speculate that limiting the number of checked markers might also force
adding extra temporary markers to the list, but I haven't looked into
that possibility yet. It might be better to discuss this with emacs-devel
before trying too hard.

>> Finally, FYI. I plan to work on an alternative mechanism to access Org
>> headings - a generic Org query library. It will not use markers and
>> will implement ideas from org-ql. org-refile will eventually use that
>> generic library instead of the current mechanism.
>
> I suppose that markers might be implemented in an efficient way, and
> much better performance may be achieved when low-level data structures
> are accessible. I have doubts about attempts to create something that
> resembles markers but is based purely on a high-level API.

I am currently using a custom version of org-ql utilising the new
element cache. It is substantially faster compared to the current
org-refile-get-targets. The org-ql version runs in <2 seconds at worst
when calculating all refile targets from scratch, while
org-refile-get-targets takes over 10 sec. The org-ql version shows no
noticeable latency when there is an extra text query to narrow down the
refile targets. So, it is certainly possible to improve the performance
just using the high-level org-element cache API + regexp search, without
markers.
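
Not the org-ql code itself, but a rough illustration of the approach
(regexp search plus the cache-backed org-element-at-point, plain buffer
positions instead of markers); the my/ function name is made up:

  ;; Sketch: collect refile-target-like data without markers.
  ;; `org-element-at-point' consults the element cache when it is
  ;; enabled, so repeated scans stay cheap.
  (defun my/collect-refile-candidates ()
    "Return a list of (TITLE LEVEL POSITION) for headlines in the buffer."
    (org-with-wide-buffer
     (goto-char (point-min))
     (let (candidates)
       (while (re-search-forward org-outline-regexp-bol nil t)
         (let ((el (org-element-at-point)))
           (when (eq (org-element-type el) 'headline)
             (push (list (org-element-property :raw-value el)
                         (org-element-property :level el)
                         (org-element-property :begin el))
                   candidates))))
       (nreverse candidates))))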

Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-03-02 Thread Max Nikulin

On 27/02/2022 13:43, Ihor Radchenko wrote:


Now, I did an extended profiling of what is happening using perf:

  6.20%   [.] buf_bytepos_to_charpos


Maybe I am interpreting such results wrongly, but it does not look like
a bottleneck. Anyway, thank you very much for such efforts; however, it
is unlikely that I will join the profiling in the near future.



buf_bytepos_to_charpos contains the following loop:

  for (tail = BUF_MARKERS (b); tail; tail = tail->next)
    {
      CONSIDER (tail->bytepos, tail->charpos);

      /* If we are down to a range of 50 chars,
         don't bother checking any other markers;
         scan the intervening chars directly now.  */
      if (best_above - bytepos < distance
          || bytepos - best_below < distance)
        break;
      else
        distance += BYTECHAR_DISTANCE_INCREMENT;
    }

I am not sure if I understand the code correctly, but that loop
clearly scales with the number of markers.


I may be terribly wrong, but it looks like an optimization attempt that
may actually ruin performance. My guess is the following. Due to
multibyte characters, a position in the buffer counted in characters may
significantly differ from the index into the byte sequence. Since markers
store both bytepos and charpos, they are used (when available) to
narrow down the initial estimation interval [0, buffer size) to the
nearest existing markers. The code below even creates temporary markers
to make the next call of the function faster.
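
A minimal illustration of that byte/char divergence from Lisp (assuming a
multibyte buffer, which temporary buffers are by default):

  ;; In a multibyte buffer char positions and byte positions diverge,
  ;; which is exactly what buf_bytepos_to_charpos has to reconcile.
  (with-temp-buffer
    (insert "naïve résumé")                          ; several 2-byte characters
    (list :chars (1- (point-max))                    ; length in characters
          :bytes (1- (position-bytes (point-max))))) ; length in bytes
  ;; => (:chars 12 :bytes 15)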


It seems buffers do not have any additional structure that tracks the
size, in bytes and in characters, of spans (I would not expect the
representation of the whole buffer in memory to be a single contiguous
byte array). When there are no markers at all, the function has to
iterate over each character and count its length.


The problem is that when the buffer has a lot of markers far away from
the position passed as the argument, the iteration over markers just
consumes CPU with no significant improvement over the original estimate
of the boundaries.


If markers were organized in a tree, then search would be much faster (at
least for buffers with a lot of markers).
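
For illustration only (a toy in Elisp, not the actual C marker list; the
my/ function name is made up): with the known (bytepos . charpos) pairs
kept sorted, the nearest anchor at or below a target byte position can be
found in O(log N) steps instead of an O(N) walk:

  ;; Toy: binary search over sorted (BYTEPOS . CHARPOS) anchors.
  (defun my/nearest-anchor-below (anchors bytepos)
    "Return the last pair in ANCHORS at or below BYTEPOS; ANCHORS is a vector of (BYTEPOS . CHARPOS) pairs sorted by byte position."
    (let ((lo 0) (hi (length anchors)) best)
      (while (< lo hi)
        (let* ((mid (/ (+ lo hi) 2))
               (a (aref anchors mid)))
          (if (<= (car a) bytepos)
              (setq best a lo (1+ mid))
            (setq hi mid))))
      best))

  (my/nearest-anchor-below [(1 . 1) (100 . 80) (500 . 420) (900 . 700)] 512)
  ;; => (500 . 420)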


In some cases such a function could take a hint: a previously known
bytepos+charpos pair.


I hope I have missed something, but what I would expect from the code of
buf_bytepos_to_charpos is that it is necessary to iterate over all
markers to update their positions after each typed character.



Finally, FYI. I plan to work on an alternative mechanism to access Org
headings - a generic Org query library. It will not use markers and
will implement ideas from org-ql. org-refile will eventually use that
generic library instead of the current mechanism.


I suppose that markers might be implemented in an efficient way, and
much better performance may be achieved when low-level data structures
are accessible. I have doubts about attempts to create something that
resembles markers but is based purely on a high-level API.





Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-26 Thread Ihor Radchenko
Max Nikulin  writes:

>> Max Nikulin writes:
>>> Actually I suspect that markers may have a similar problem during regexp
>>> searches. I am curious if it is possible to invoke a kind of "vacuum"
>>> (in SQL parlance). Folding all headings and resetting the refile cache
>>> does not restore performance to the initial state at session startup.
>>> Maybe it is an effect of incremental searches.
>> 
>> I doubt that markers have anything to do with regexp search itself
>> (directly). They should only come into play when editing text in the
>> buffer, where their performance is also O(N_markers).
>
> I believe you confirmed my conclusion earlier:
>
> Ihor Radchenko. Re: [BUG] org-goto slows down org-set-property.
> Sun, 11 Jul 2021 19:49:08 +0800.
> https://list.orgmode.org/orgmode/87lf6dul3f.fsf@localhost/

I confirmed that invoking org-refile-get-targets slows down your nm-tst
loop over the headlines.

However, the issue is not with outline-next-heading there. Profiling
shows that the slowdown mostly happens in org-get-property-block.

I had looked into the regexp search C source and did not find anything
that could depend on the number of markers in the buffer.
After further analysis (prompted by your email), I found that I may be
wrong and regexp search might actually be affected.

Now, I did an extended profiling of what is happening using perf:

;; perf cpu with refile cache (using your previous code on my largest Org buffer)
19.68%   [.] mark_object
 6.20%   [.] buf_bytepos_to_charpos
 5.66%   [.] re_match_2_internal
 5.33%   [.] exec_byte_code
 5.07%   [.] rpl_re_search_2
 3.09%   [.] Fmemq
 2.56%   [.] allocate_vectorlike
 1.86%   [.] sweep_vectors
 1.47%   [.] mark_objects
 1.45%   [.] pdumper_marked_p_impl

;; perf cpu without refile cache (with getting refile targets removed from the code)
18.79%   [.] mark_object
 8.23%   [.] re_match_2_internal
 5.88%   [.] rpl_re_search_2
 4.06%   [.] buf_bytepos_to_charpos
 3.06%   [.] Fmemq
 2.45%   [.] allocate_vectorlike
 1.63%   [.] exec_byte_code
 1.50%   [.] pdumper_marked_p_impl

The bottleneck appears to be buf_bytepos_to_charpos, called by the
BYTE_TO_CHAR macro, which, in turn, is used by set_search_regs.

buf_bytepos_to_charpos contains the following loop:

  for (tail = BUF_MARKERS (b); tail; tail = tail->next)
    {
      CONSIDER (tail->bytepos, tail->charpos);

      /* If we are down to a range of 50 chars,
         don't bother checking any other markers;
         scan the intervening chars directly now.  */
      if (best_above - bytepos < distance
          || bytepos - best_below < distance)
        break;
      else
        distance += BYTECHAR_DISTANCE_INCREMENT;
    }

I am not sure if I understand the code correctly, but that loop
clearly scales with the number of markers.
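
A rough way to see this from Lisp (a hypothetical benchmark; the numbers
will vary with buffer size and marker placement): fill a multibyte buffer,
time a regexp scan, then create many markers and time the same scan again.

  ;; Hypothetical benchmark: regexp search reaches BYTE_TO_CHAR via
  ;; set_search_regs, so flooding the buffer with markers should make
  ;; the second timing noticeably worse in a multibyte buffer.
  (with-temp-buffer
    (dotimes (_ 50000)
      (insert "* heading with some text αβγ\n"))
    (let* ((scan (lambda ()
                   (goto-char (point-min))
                   (while (re-search-forward "^\\* heading" nil t))))
           (before (benchmark-run 5 (funcall scan)))
           (markers nil))
      ;; Create one marker per line, scattered over the whole buffer,
      ;; and keep references so they are not garbage collected.
      (goto-char (point-min))
      (while (not (eobp))
        (push (point-marker) markers)
        (forward-line 1))
      (list :without-markers before
            :with-markers (benchmark-run 5 (funcall scan))
            :marker-count (length markers))))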

Finally, FYI. I plan to work on an alternative mechanism to access Org
headings - a generic Org query library. It will not use markers and
will implement ideas from org-ql. org-refile will eventually use that
generic library instead of the current mechanism.

Best,
Ihor




Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-26 Thread Jean Louis
Open up an XMPP group for Org mode; that Jabber chat is lightweight and
accessible through Emacs jabber.el and a plethora of other applications.

Don't forget to include Org links to the XMPP groups.



On February 22, 2022 5:33:13 AM UTC, Ihor Radchenko  wrote:
>Samuel Wales  writes:
>
>> i have been dealing with latency also, often in undo-tree.  this
>might
>> be a dumb suggestion, but is it related to org file size?  my files
>> have not really grown /that/ much but maybe you could bisect one.  as
>> opposed to config.
>
>I am wondering if many people in the list experience latency issues.
>Maybe we can organise an online meeting (jitsi or BBB) and collect the
>common causes/ do online interactive debugging?
>
>Best,
>Ihor


Jean



Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-26 Thread Max Nikulin

On 26/02/2022 14:45, Ihor Radchenko wrote:


I think we have a misunderstanding here. That page does not contain many
technical details. It is rather a history.


Thank you for the clarification. Originally I had certainly hoped to find
some explanation of why it was not implemented in a more efficient way,
and at first I only read the beginning of the text. It was still
interesting to read the story of how, due to the delay of an Emacs
release, people had to fork it into Lucid Emacs. I did not know that
XEmacs was a successor of Lucid.



Max Nikulin writes:

Actually I suspect that markers may have a similar problem during regexp
searches. I am curious if it is possible to invoke a kind of "vacuum"
(in SQL parlance). Folding all headings and resetting the refile cache
does not restore performance to the initial state at session startup.
Maybe it is an effect of incremental searches.


I doubt that markers have anything to do with regexp search itself
(directly). They should only come into play when editing text in the
buffer, where their performance is also O(N_markers).


I believe you confirmed my conclusion earlier:

Ihor Radchenko. Re: [BUG] org-goto slows down org-set-property.
Sun, 11 Jul 2021 19:49:08 +0800.
https://list.orgmode.org/orgmode/87lf6dul3f.fsf@localhost/




Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-25 Thread Ihor Radchenko
Max Nikulin  writes:

> Thank you, Ihor. I am still not motivated enough to read the whole page,
> but searching for "interval" (earlier I tried "overlay") resulted in the
> following message:
>
> Message-ID:   <9206230917.aa16...@mole.gnu.ai.mit.edu>
> Date: Tue, 23 Jun 92 05:17:33 -0400
> From: r...@gnu.ai.mit.edu (Richard Stallman)
>
> describing the tree-balancing problem in GNU Emacs and the linear search
> in Lucid.
>
> Unfortunately there are no "id" or "name" anchors in the file suitable
> for specifying a precise location. Even the link href is broken.

I think we have a misunderstanding here. That page does not contain many
technical details. It is rather a history. AFAIU, initially Emacs wanted
to implement a balanced tree structure to store overlays, but the effort
stalled for a long time. Then a company rolled out a simple list storage,
causing a lot of controversy related to the FSF and a major Emacs fork.
In the end, the initial effort using a balanced tree on the GNU Emacs
side did not go anywhere, and GNU Emacs eventually copied the simple list
approach that is backfiring now, when Org buffers actually do contain
large numbers of overlays.

> Actually I suspect that markers may have a similar problem during regexp
> searches. I am curious if it is possible to invoke a kind of "vacuum"
> (in SQL parlance). Folding all headings and resetting the refile cache
> does not restore performance to the initial state at session startup.
> Maybe it is an effect of incremental searches.

I doubt that markers have anything to do with regexp search itself
(directly). They should only come into play when editing text in the
buffer, where their performance is also O(N_markers).

Best,
Ihor




Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-25 Thread Max Nikulin

On 23/02/2022 23:35, Ihor Radchenko wrote:

Max Nikulin writes:


+;; the same purpose.  Overlays are implemented with O(n) complexity in
+;; Emacs (as of 2021-03-11).  It means that any attempt to move
+;; through hidden text in a file with many invisible overlays will
+;; require time scaling with the number of folded regions (the problem
+;; the Overlays node of the manual warns about).  For the curious, the
+;; historical reasons why overlays are not efficient can be found in
+;; https://www.jwz.org/doc/lemacs.html.


The linked document consists of a lot of messages. Could you please
provide a more specific location within the rather long page?


There is no specific location. That thread is an old drama that unfolded
when intervals were first implemented by a third-party company (they were
called intervals at that time). AFAIU, the fact that intervals are stored
in a list and suffer from O(N) complexity originates from that time. Just
history, as I noted in the comment.


Thank you, Ihor. I am still not motivated enough to read the whole page,
but searching for "interval" (earlier I tried "overlay") resulted in the
following message:


Message-ID: <9206230917.aa16...@mole.gnu.ai.mit.edu>
Date:   Tue, 23 Jun 92 05:17:33 -0400
From:   r...@gnu.ai.mit.edu (Richard Stallman)

describing the tree-balancing problem in GNU Emacs and the linear search
in Lucid.

Unfortunately there are no "id" or "name" anchors in the file suitable
for specifying a precise location. Even the link href is broken.


Actually I suspect that markers may have a similar problem during regexp
searches. I am curious if it is possible to invoke a kind of "vacuum"
(in SQL parlance). Folding all headings and resetting the refile cache
does not restore performance to the initial state at session startup.
Maybe it is an effect of incremental searches.


Sorry, I have not tried the patches that use text properties instead of overlays.




Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-23 Thread Ihor Radchenko
Max Nikulin  writes:

>> +;; the same purpose.  Overlays are implemented with O(n) complexity in
>> +;; Emacs (as of 2021-03-11).  It means that any attempt to move
>> +;; through hidden text in a file with many invisible overlays will
>> +;; require time scaling with the number of folded regions (the problem
>> +;; the Overlays node of the manual warns about).  For the curious, the
>> +;; historical reasons why overlays are not efficient can be found in
>> +;; https://www.jwz.org/doc/lemacs.html.
>
> The linked document consists of a lot of messages. Could you please
> provide a more specific location within the rather long page?

There is no specific location. That thread is an old drama that unfolded
when intervals were first implemented by a third-party company (they were
called intervals at that time). AFAIU, the fact that intervals are stored
in a list and suffer from O(N) complexity originates from that time. Just
history, as I noted in the comment.

FYI, a more optimal overlay data structure implementation has been
attempted in the feature/noverlay branch (for example, see
https://git.savannah.gnu.org/cgit/emacs.git/commit/?h=feature/noverlay&id=8d7bdfa3fca076b34aaf86548d3243bee11872ad).
But there has been no activity on that branch for years.

Best,
Ihor




Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-23 Thread Max Nikulin

On 22/02/2022 12:33, Ihor Radchenko wrote:


I am wondering if many people in the list experience latency issues.


Ihor, this is unlikely to be the feedback that you would like to get
concerning the following patch:


Ihor Radchenko. [PATCH 01/35] Add org-fold-core: new folding engine.
Sat, 29 Jan 2022 19:37:53 +0800. 
https://list.orgmode.org/74cd7fc06a4540b1d63d1e7f9f2542f83e1eaaae.1643454545.git.yanta...@gmail.com


but my question may be more appropriate in this thread. I noticed the 
following:



+;; the same purpose.  Overlays are implemented with O(n) complexity in
+;; Emacs (as of 2021-03-11).  It means that any attempt to move
+;; through hidden text in a file with many invisible overlays will
+;; require time scaling with the number of folded regions (the problem
+;; the Overlays node of the manual warns about).  For the curious, the
+;; historical reasons why overlays are not efficient can be found in
+;; https://www.jwz.org/doc/lemacs.html.


The linked document consists of a lot of messages. Could you please
provide a more specific location within the rather long page?





Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-23 Thread Ihor Radchenko
Matt Price  writes:

>> Note that org-context is an obsolete function. Do you directly call it
>> in your config? Or do you use a third-party package calling org-context?
>>
>
> Hmm.  I don't see it anywhere in my ~.emacs.d/elpa~ directory or in my
> config file. I also went through ORG-NEWS and while it mentions that
> org-context-p has been removed, I can't find a deprecation notice about
> org-context.  I'm not quite sure what's going on. Will investigate further!

That notice itself is WIP :facepalm: Basically, org-context is not
reliable because it relies on fontification. See
https://orgmode.org/list/877depxyo9.fsf@localhost

Best,
Ihor



Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-23 Thread Matt Price
On Wed, Feb 23, 2022 at 12:22 AM Ihor Radchenko  wrote:

> Matt Price  writes:
>
> >>20128  80% - redisplay_internal (C function)
> >> 7142  28%  - assq
> >>  908   3%   - org-context
>
> Note that org-context is an obsolete function. Do you directly call it
> in your config? Or do you use a third-party package calling org-context?
>

Hmm.  I don't see it anywhere in my ~.emacs.d/elpa~ directory or in my
config file. I also went through ORG-NEWS and while it mentions that
org-context-p has been removed, I can't find a deprecation notice about
org-context.  I'm not quite sure what's going on. Will investigate further!

>
> Best,
> Ihor
>


Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-22 Thread Ihor Radchenko
Matt Price  writes:

> Yes, it definitely seems to be related to file size, which makes me think
> that some kind of buffer parsing is the cause of the problem.

Parsing would show up in the profiler report in such a scenario. That is
not the case here, though. The problem might be invisible text (it would
cause redisplay to become slow), but 15k lines is relatively small - it
should not cause redisplay issues in my experience. Just to be sure, I
would try checking performance in a completely unfolded buffer.
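
For the unfolded-buffer check, something along these lines (the exact
function name depends on the Org version in use):

  ;; Unfold everything before re-testing typing latency, to rule out
  ;; invisible-text/redisplay effects.
  (if (fboundp 'org-fold-show-all)
      (org-fold-show-all)   ; org-fold feature branch / newer Org
    (org-show-all))         ; older Org versions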

Best,
Ihor




Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-22 Thread Ihor Radchenko
Matt Price  writes:

>>20128  80% - redisplay_internal (C function)
>> 7142  28%  - assq
>>  908   3%   - org-context

Note that org-context is an obsolete function. Do you directly call it
in your config? Or do you use a third-party package calling org-context?

Best,
Ihor



Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-22 Thread Matt Price
sorry everyone, I accidentally sent this to Kaushal this morning,  and then
took quite a while to get back to a computer after he let me know my
mistake!

On Tue, Feb 22, 2022 at 10:12 AM Matt Price  wrote:

>
> On Tue, Feb 22, 2022 at 12:45 AM Kaushal Modi 
> wrote:
>
>>
>>
>> On Tue, Feb 22, 2022, 12:34 AM Ihor Radchenko  wrote:
>>
>>>
>>> I am wondering if many people in the list experience latency issues.
>>> Maybe we can organise an online meeting (jitsi or BBB) and collect the
>>> common causes/ do online interactive debugging?
>>>
>>
>> +1
>>
>> I have seen few people see this issue on the ox-hugo issue tracker:
>> https://github.com/kaushalmodi/ox-hugo/discussions/551#discussioncomment-2104352
>>
>
>
> I think it's a great idea, Ihor!
>
> Meanwhile, I have a profile report. I had a little trouble getting the
> slowness to return (of course) but, subjectively, it seemed to get worse
> (subjectively slower, and the laptop fan started up b/c of high cpu usage)
> when I created and entered a src block. Apologies for the long paste:
>
>45707  70% - redisplay_internal (C function)
> 8468  13%  - substitute-command-keys
> 6111   9%   - #
>  943   1%- kill-buffer
>  708   1% - replace-buffer-in-windows
>  614   0%  - unrecord-window-buffer
>  515   0%   - assq-delete-all
>  142   0%  assoc-delete-all
>3   0% delete-char
> 8060  12%  - assq
> 2598   4%   - org-context
>   15   0%  org-inside-LaTeX-fragment-p
>   12   0%- org-in-src-block-p
>   12   0% - org-element-at-point
>9   0%  - org-element--cache-verify-element
>9   0% org-element--parse-to
>3   0%org-element--parse-to
>8   0%- org-at-timestamp-p
>8   0%   org-in-regexp
>  642   0%  + tab-bar-make-keymap
>  309   0%  + and
>  270   0%  + org-in-subtree-not-table-p
>  196   0%  + not
>  163   0%  + jit-lock-function
>  115   0%  + org-entry-get
>   96   0%keymap-canonicalize
>   56   0%org-at-table-p
>   52   0%  + #
>   48   0%  + #
>   43   0%table--row-column-insertion-point-p
>   29   0%org-inside-LaTeX-fragment-p
>   27   0%  + menu-bar-positive-p
>   26   0%  + eval
>   24   0%file-readable-p
>   21   0%  + funcall
>   16   0%  + imenu-update-menubar
>   14   0%  + vc-menu-map-filter
>   13   0%  + table--probe-cell
>   12   0%  + or
>   11   0%  + let
>   11   0%  + org-at-timestamp-p
>   10   0%  + flycheck-overlays-at
>7   0%undo-tree-update-menu-bar
>6   0%  + require
>6   0%  + emojify-update-visible-emojis-background-after-window-scroll
>6   0%kill-this-buffer-enabled-p
>4   0%mode-line-default-help-echo
>3   0%  + null
> 9192  14% - ...
> 9172  14%Automatic GC
>   20   0%  - kill-visual-line
>   20   0%   - kill-region
>   20   0%- filter-buffer-substring
>   20   0% - org-fold-core--buffer-substring-filter
>   20   0%  - buffer-substring--filter
>   20   0%   - #
>   20   0%- apply
>   20   0% - # F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_18>
>   20   0%  - #
>   20   0%   - apply
>   20   0%- #
>   20   0% - #
>   20   0%  - #
>   20   0%   - apply
>   20   0%- #
>   20   0% + delete-and-extract-region
> 7847  12% - command-execute
> 5749   8%  - funcall-interactively
> 2963   4%   + org-self-insert-command
> 2186   3%   + org-cycle
>  148   0%   + corfu-insert
>  146   0%   + execute-extended-command
>  121   0%   + org-return
>   32   0%   + #
>   26   0%   + #
>   24   0%   + mwim-beginning
>   19   0%   + org-delete-backward-char
>   19   0%   + org-kill-line
>9   0%   + #
>6   0%   + file-notify-handle-event
> 2095   3%  + byte-code
> 1359   2% + timer-event-handler
>  375   0% + org-appear--post-cmd
>  160   0% + corfu--post-command
>   61   0% + org-fragtog--post-cmd
>   14   0% + emojify-update-visible-emojis-background-after-command
>   11   0%   guide-key/close-guide-buffer
>7   0% + flycheck-perform-deferred-syntax-check
>7   0% + flycheck-maybe-display-error-at-point-soon
>6   0%   undo-auto--add-boundary
>6   0% + corfu--auto-post-command
>4   0%   flycheck-error-list-update-source
>  

Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-22 Thread Matt Price
Yes, it definitely seems to be related to file size, which makes me think
that some kind of buffer parsing is the cause of the problem. I'll reply
in more detail to Ihor, down below!

On Mon, Feb 21, 2022 at 5:22 PM Samuel Wales  wrote:

> i have been dealing with latency also, often in undo-tree.  this might
> be a dumb suggestion, but is it related to org file size?  my files
> have not really grown /that/ much but maybe you could bisect one.  as
> opposed to config.
>
> i am not saying that your org files are too big.  just that maybe it
> could lead to insights.
>
>
> On 2/21/22, Matt Price  wrote:
> > I'm trying to figure out what causes high latency while typing in large
> > org-mode files.  The issue is very clearly a result of my large config
> > file, but I'm not sure how to track it down with any precision.
> >
> > My main literate config file is ~/.emacs.d/emacs-init.org, currently
> 15000
> > lines, 260 src blocks.
> > If I create a ~minimal.el~ config like this:
> >
> > (let* ((all-paths
> >   '("/home/matt/src/org-mode/emacs/site-lisp/org")))
> > (dolist (p all-paths)
> >   (add-to-list 'load-path p)))
> >
> >   (require 'org)
> >   (find-file "~/.emacs.d/emacs-init.org")
> >
> > then I do not notice any latency while typing.  If I run the profiler
> while
> > using the minimal config, the profile looks about like this at a high
> > level:
> >
> > 1397  71% - command-execute
> >  740  37%  - funcall-interactively
> >  718  36%   - org-self-insert-command
> >  686  34%+ org-element--cache-after-change
> >   10   0%+ org-fold-core--fix-folded-region
> >3   0%+ blink-paren-post-self-insert-function
> >2   0%+ jit-lock-after-change
> >1   0%
> > org-fold-check-before-invisible-edit--text-properties
> >9   0%   + previous-line
> >6   0%   + minibuffer-complete
> >3   0%   + org-return
> >3   0%   + execute-extended-command
> >  657  33%  - byte-code
> >  657  33%   - read-extended-command
> >   64   3%- completing-read-default
> >   14   0% + redisplay_internal (C function)
> >1   0% + timer-event-handler
> >  371  18% - redisplay_internal (C function)
> >  251  12%  + jit-lock-function
> >   90   4%  + assq
> >7   0%  + substitute-command-keys
> >3   0%  + eval
> >  125   6% + timer-event-handler
> >   69   3% + ...
> >
> > --
> > However, if I instead use my fairly extensive main config, latency is
> high
> > enough that there's a noticeable delay while typing ordinary words. I see
> > this  regardless of whether I build from main or from Ihor's org-fold
> > feature branch on github. The profiler overview here is pretty different
> --
> > redisplay_internal takes a much higher percentage of the CPU requirement:
> >
> >  3170  56% - redisplay_internal (C function)
> >  693  12%  - substitute-command-keys
> >  417   7%   + #
> >   59   1%  + assq
> >   49   0%  + org-in-subtree-not-table-p
> >   36   0%  + tab-bar-make-keymap
> >   35   0%and
> >   24   0%  + not
> >   16   0%org-at-table-p
> >   13   0%  + jit-lock-function
> >8   0%keymap-canonicalize
> >7   0%  + #
> >4   0%  + funcall
> >4   0%display-graphic-p
> >3   0%  + #
> >3   0%file-readable-p
> >3   0%  + table--probe-cell
> >3   0%table--row-column-insertion-point-p
> > 1486  26% - command-execute
> > 1200  21%  - byte-code
> > 1200  21%   - read-extended-command
> > 1200  21%- completing-read-default
> > 1200  21% - apply
> > 1200  21%  - vertico--advice
> >  475   8%   + #
> >
> > --
> > I've almost never used the profiler and am not quite sure how I should
> > proceed to debug this.  I realize I can comment out parts of the config
> one
> > at a time, but that is not so easy for me to do in my current setup, and
> I
> > suppose there are likely to be multiple contributing causes, which I may
> > not really notice except in the aggregate.
> >
> > If anyone has suggestions, I would love to hear them!
> >
> > Thanks,
> >
> > Matt
> >
>
>
> --
> The Kafka Pandemic
>
> A blog about science, health, human rights, and misopathy:
> https://thekafkapandemic.blogspot.com
>


Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-22 Thread Rudolf Adamkovič
Ihor Radchenko  writes:

> Samuel Wales  writes:
>
>> i have been dealing with latency also, often in undo-tree.  this might
>> be a dumb suggestion, but is it related to org file size?  my files
>> have not really grown /that/ much but maybe you could bisect one.  as
>> opposed to config.
>
> I am wondering if many people in the list experience latency issues.

FYI: I experience high latency when typing near in-text citations, such
as [cite:@ganz+2013].  It got so bad that I converted all my files to
hard-wrapped lines.  After I did that, Org mode became usable again,
but it still lags visibly when typing near a citation.

Rudy
-- 
"'Contrariwise,' continued Tweedledee, 'if it was so, it might be; and
if it were so, it would be; but as it isn't, it ain't.  That's logic.'"
-- Lewis Carroll, Through the Looking Glass, 1871/1872

Rudolf Adamkovič  [he/him]
Studenohorská 25
84103 Bratislava
Slovakia



Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-21 Thread Kaushal Modi
On Tue, Feb 22, 2022, 12:34 AM Ihor Radchenko  wrote:

>
> I am wondering if many people in the list experience latency issues.
> Maybe we can organise an online meeting (jitsi or BBB) and collect the
> common causes/ do online interactive debugging?
>

+1

I have seen a few people report this issue on the ox-hugo issue tracker:
https://github.com/kaushalmodi/ox-hugo/discussions/551#discussioncomment-2104352


Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-21 Thread Ihor Radchenko
Samuel Wales  writes:

> i have been dealing with latency also, often in undo-tree.  this might
> be a dumb suggestion, but is it related to org file size?  my files
> have not really grown /that/ much but maybe you could bisect one.  as
> opposed to config.

I am wondering if many people in the list experience latency issues.
Maybe we can organise an online meeting (jitsi or BBB) and collect the
common causes/ do online interactive debugging?

Best,
Ihor



Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-21 Thread Ihor Radchenko
Matt Price  writes:

> However, if I instead use my fairly extensive main config, latency is high
> enough that there's a noticeable delay while typing ordinary words. I see
> this  regardless of whether I build from main or from Ihor's org-fold
> feature branch on github. The profiler overview here is pretty different --
> redisplay_internal takes a much higher percentage of the CPU requirement:
>
>  3170  56% - redisplay_internal (C function)
> 
> 1200  21%- completing-read-default
> 1200  21% - apply
> 1200  21%  - vertico--advice
>  475   8%   + #

Judging from the profiler report, you did not collect a sufficient number
of CPU samples. I recommend keeping the profiler running for at least
10-30 seconds when trying to profile typing latency. Also, note that
running M-x profiler-report a second time will _not_ reproduce the
previous report, but will instead show the CPU profile collected between
the last invocation of profiler-report and the second one. I recommend
doing the following (an equivalent non-interactive sketch follows the
steps):
1. M-x profiler-stop
2. M-x profiler-start
3. Do typing in the problematic Org file for 10-30 seconds
4. M-x profiler-report (once!)
5. Share the report here
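
The same steps, non-interactively (a minimal sketch using the built-in
profiler commands):

  ;; Equivalent of the interactive steps above.
  (profiler-stop)            ; 1. make sure no earlier session is running
  (profiler-start 'cpu)      ; 2. start collecting CPU samples
  ;; 3. type in the problematic Org buffer for 10-30 seconds
  (profiler-report)          ; 4. produce the report (call it only once)
  ;; 5. share the resulting tree here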

> I've almost never used the profiler and am not quite sure how I should
> proceed to debug this.  I realize I can comment out parts of the config one
> at a time, but that is not so easy for me to do in my current setup, and I
> suppose there are likely to be multiple contributing causes, which I may
> not really notice except in the aggregate.

The above steps should be the first thing to try, and they will likely
reveal the bottleneck. If not, you can go back to generic bisection. I
do not recommend manually commenting/uncommenting parts of your large
config. Instead, you can try
https://github.com/Malabarba/elisp-bug-hunter - but only if CPU profiling
does not reveal anything useful.

Best,
Ihor



Re: profiling latency in large org-mode buffers (under both main & org-fold feature)

2022-02-21 Thread Samuel Wales
i have been dealing with latency also, often in undo-tree.  this might
be a dumb suggestion, but is it related to org file size?  my files
have not really grown /that/ much but maybe you could bisect one.  as
opposed to config.

i am not saying that your org files are too big.  just that maybe it
could lead to insights.


On 2/21/22, Matt Price  wrote:
> I'm trying to figure out what causes high latency while typing in large
> org-mode files.  The issue is very clearly a result of my large config
> file, but I'm not sure how to track it down with any precision.
>
> My main literate config file is ~/.emacs.d/emacs-init.org, currently 15000
> lines, 260 src blocks.
> If I create a ~minimal.el~ config like this:
>
> (let* ((all-paths
>   '("/home/matt/src/org-mode/emacs/site-lisp/org")))
> (dolist (p all-paths)
>   (add-to-list 'load-path p)))
>
>   (require 'org)
>   (find-file "~/.emacs.d/emacs-init.org")
>
> then I do not notice any latency while typing.  If I run the profiler while
> using the minimal config, the profile looks about like this at a high
> level:
>
> 1397  71% - command-execute
>  740  37%  - funcall-interactively
>  718  36%   - org-self-insert-command
>  686  34%+ org-element--cache-after-change
>   10   0%+ org-fold-core--fix-folded-region
>3   0%+ blink-paren-post-self-insert-function
>2   0%+ jit-lock-after-change
>1   0%
> org-fold-check-before-invisible-edit--text-properties
>9   0%   + previous-line
>6   0%   + minibuffer-complete
>3   0%   + org-return
>3   0%   + execute-extended-command
>  657  33%  - byte-code
>  657  33%   - read-extended-command
>   64   3%- completing-read-default
>   14   0% + redisplay_internal (C function)
>1   0% + timer-event-handler
>  371  18% - redisplay_internal (C function)
>  251  12%  + jit-lock-function
>   90   4%  + assq
>7   0%  + substitute-command-keys
>3   0%  + eval
>  125   6% + timer-event-handler
>   69   3% + ...
>
> --
> However, if I instead use my fairly extensive main config, latency is high
> enough that there's a noticeable delay while typing ordinary words. I see
> this  regardless of whether I build from main or from Ihor's org-fold
> feature branch on github. The profiler overview here is pretty different --
> redisplay_internal takes a much higher percentage of the CPU requirement:
>
>  3170  56% - redisplay_internal (C function)
>  693  12%  - substitute-command-keys
>  417   7%   + #
>   59   1%  + assq
>   49   0%  + org-in-subtree-not-table-p
>   36   0%  + tab-bar-make-keymap
>   35   0%and
>   24   0%  + not
>   16   0%org-at-table-p
>   13   0%  + jit-lock-function
>8   0%keymap-canonicalize
>7   0%  + #
>4   0%  + funcall
>4   0%display-graphic-p
>3   0%  + #
>3   0%file-readable-p
>3   0%  + table--probe-cell
>3   0%table--row-column-insertion-point-p
> 1486  26% - command-execute
> 1200  21%  - byte-code
> 1200  21%   - read-extended-command
> 1200  21%- completing-read-default
> 1200  21% - apply
> 1200  21%  - vertico--advice
>  475   8%   + #
>
> --
> I've almost never used the profiler and am not quite sure how I should
> proceed to debug this.  I realize I can comment out parts of the config one
> at a time, but that is not so easy for me to do in my current setup, and I
> suppose there are likely to be multiple contributing causes, which I may
> not really notice except in the aggregate.
>
> If anyone has suggestions, I would love to hear them!
>
> Thanks,
>
> Matt
>


-- 
The Kafka Pandemic

A blog about science, health, human rights, and misopathy:
https://thekafkapandemic.blogspot.com