On 02/28/13 15:20, Jacob Carlborg wrote:
> On 2013-02-28 15:08, Artur Skawina wrote:
>
>> Having said that, I've used this approach in a D lexer, and it does not
>> really matter in practice - avoiding the length (or '\0' sentinel) check
>> makes a <~1ms difference when lexing "datetime.d" sized objects
>> (1.5Mbytes+, 460k+ tokens). Which is practically irrelevant both in an
>> IDE context and a compiler context - other processing will be orders of
>> magnitude more expensive. An IDE doesn't need to re-lex the whole file
>> after every key press, and 1ms won't make any difference for a compiler
>> run.
>
> It's not about lexing a single file like std.datetime. We're talking about
> being able to fast-lex, I don't know, 100 or 1000 files like std.datetime.
Define "fast". Lexing std.datetime takes at most ~10-20ms (possibly a
single-digit ms number, but I'd need to write some code to check the actual
figure). Smaller objects take proportionally less time. Meaning you'll be
I/O bound - even /one/ (disk) cache miss will have more impact than this
kind of optimization.

Lexing a hundred small files or one file 100x as big is basically the same
operation; the difference will be in I/O + setup/teardown costs, which sit
/outside/ the lexer, so they aren't affected by how it accesses its input.

artur
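(To make the "I/O vs. lexing" comparison concrete, here is a minimal,
hypothetical benchmark harness - in Python rather than D, with a toy
whitespace tokenizer standing in for a real lexer, and a synthetic ~1.5MB
file mimicking the size of std.datetime. It only illustrates how one would
measure the two costs separately; the absolute numbers will differ from a
real D lexer's.)

```python
# Rough sanity-check harness: time reading a ~1.5 MB file vs. "lexing" it.
# toy_lex() is a stand-in tokenizer, NOT a real D lexer; the file contents
# and size are synthetic, chosen to roughly match the std.datetime example.
import os
import tempfile
import time

def toy_lex(buf: bytes) -> int:
    """Count whitespace-separated tokens - a crude stand-in for lexing."""
    return len(buf.split())

# Write a ~1.5 MB synthetic "source file" (12 bytes per line x 125k lines).
data = b"int x = 42;\n" * 125_000
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(data)
    path = f.name

# Time the I/O (note: on a warm OS cache this understates cold-read cost).
t0 = time.perf_counter()
with open(path, "rb") as f:
    buf = f.read()
t_io = time.perf_counter() - t0

# Time the "lexing" pass over the in-memory buffer.
t0 = time.perf_counter()
ntok = toy_lex(buf)
t_lex = time.perf_counter() - t0

os.unlink(path)
print(f"read: {t_io * 1e3:.2f} ms, lex: {t_lex * 1e3:.2f} ms, tokens: {ntok}")
```

Running it cold (flushed caches) vs. warm shows the point: the read time
swings by far more than the lex time, so shaving fractions of a millisecond
off the inner token loop disappears into the I/O noise.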
