Re: Use SIMD to accelerate comment lexing

Walter Bright via Digitalmars-d Thu, 04 Jun 2015 17:35:42 -0700

On 6/4/2015 2:44 PM, deadalnix wrote:

On Thursday, 4 June 2015 at 18:39:02 UTC, Walter Bright wrote:

On 6/3/2015 7:05 PM, deadalnix wrote:

On Wednesday, 3 June 2015 at 22:50:52 UTC, Walter Bright wrote:

On 6/2/2015 5:45 PM, deadalnix wrote:

You go through the characters and look for a '/'. When you hit one, you check if
the character before it is a '*', and if so, you have the end of the comment.
There are obviously various edge cases to take into account, but that is the
general idea.
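
As a rough illustration of the scan described above, here is a minimal C++ sketch. The function name `find_block_comment_end` and its interface are hypothetical and are not taken from clang or dmd. It relies on `std::memchr`, which C libraries commonly implement with vector instructions, to find each '/', and it handles only C-style `/* */` comments, not D's nesting `/+ +/` form.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical helper for illustration only.
// `start` is the offset just past the opening "/*".
// Returns the offset one past the closing "*/", or npos if the
// comment is unterminated.
std::size_t find_block_comment_end(std::string_view src, std::size_t start) {
    assert(start <= src.size());
    std::size_t pos = start;
    while (pos < src.size()) {
        // Search for the next '/' in the remaining input.
        const void* hit = std::memchr(src.data() + pos, '/', src.size() - pos);
        if (hit == nullptr) break;
        std::size_t slash =
            static_cast<std::size_t>(static_cast<const char*>(hit) - src.data());
        // Edge case: the '*' must lie inside the comment body,
        // so the '*' of the opening "/*" in "/*/" does not count.
        if (slash > start && src[slash - 1] == '*') return slash + 1;
        pos = slash + 1;
    }
    return std::string_view::npos;
}
```

For example, calling it on `"/* a */ x"` with `start` set to 2 yields 7, the offset just past the closing `*/`.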

Line numbers have to be kept track of as well.


They retrieve line numbers lazily when needed, with various mechanisms to speed
up the lookup.


Hmm. There's no way to get the line number without counting LFs, and that
means searching for them.


Yes, the first time you query a line number, clang builds metadata about
newlines by going through the file's content and finding the positions of the
newlines. The process uses vector operations as well.

Apparently, they think it is better to do it that way for various reasons:
  - Position tracking is more compact (and a position is embedded in every
expression, declaration, and more), which reduces the memory footprint by quite
a lot.
  - It makes the lexer simpler and faster.
  - You don't need to track newlines if you don't use them. If you don't emit
debug info in C++, and have no errors, most line numbers are not used (not sure
about D, because various language facilities like bounds checking use line
numbers, but that is a win in C++).
  - Debug emission has a predictable access pattern, and the algorithm to find
a line number from an offset in the file is special-cased to handle it.
  - Finding newlines can be vectorized over the whole file. It cannot be
vectorized when done in parallel with lexing.
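
The lazy scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `LineTable` and its members are hypothetical names, not clang's actual data structures, and the special-casing for debug emission is left out. Tokens and AST nodes would store just a 32-bit byte offset; the table of line starts is built on the first query with one `std::memchr` pass over the whole file, and each lookup is a binary search.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>
#include <vector>

// Hypothetical type for illustration; not clang's SourceManager.
class LineTable {
public:
    explicit LineTable(std::string_view src) : src_(src) {}

    // Returns the 1-based line number of the byte at `offset`.
    // A '\n' byte is counted as part of the line it terminates.
    std::uint32_t line_of(std::uint32_t offset) {
        assert(offset <= src_.size());
        if (!built_) build();  // pay for the newline scan only when asked
        // The line number is the count of line starts <= offset.
        auto it = std::upper_bound(starts_.begin(), starts_.end(), offset);
        return static_cast<std::uint32_t>(it - starts_.begin());
    }

    bool built() const { return built_; }

private:
    // One pass over the whole file, recording where each line begins.
    void build() {
        starts_.push_back(0);
        const char* base = src_.data();
        const char* p = base;
        const char* end = base + src_.size();
        while (p < end) {
            const void* hit =
                std::memchr(p, '\n', static_cast<std::size_t>(end - p));
            if (hit == nullptr) break;
            p = static_cast<const char*>(hit) + 1;
            starts_.push_back(static_cast<std::uint32_t>(p - base));
        }
        built_ = true;
    }

    std::string_view src_;
    std::vector<std::uint32_t> starts_;  // byte offset of each line start
    bool built_ = false;
};
```

If no diagnostic or debug info ever asks for a line, `build()` never runs, which is the case the third bullet describes.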

Once again, I'm not sure this is a win in D, because we need line numbers more
than in C++, but it seems to be a win in C++.

It's an interesting approach. I generally shoot for making the debug builds the
fastest, because that's when people are in the edit-compile-debug loop. And the
debug output needs line numbers :-)
