> Id steer well clear of client side Regexs for something "continuously
> checked" like this.
I did, precisely because I knew it would only end badly.
> Why not just remember the last few letters typed?
If only that sort of information were available.
The API for this sort of thing is the EditorUpdateEvent, which can
tell you (a) that a change happened and (b) the location in the
document where it happened.
This then requires a loop over the document to stitch together the
text up to that point, before it can even be considered for checking.
Since the EditorUpdateEvent is rate-limited, it is possible for quite
large deltas to be applied between calls, resulting in large quantities
of text needing to be checked.
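As a rough illustration of that stitching step (a sketch only, written in
Python rather than the editor's own code; the list-of-strings document
model and the name text_up_to are assumptions for illustration, not the
real editor API):

```python
def text_up_to(nodes, offset):
    """Concatenate node text until `offset` characters have been collected.

    `nodes` is a hypothetical stand-in for the document: a list of the
    text content of each node, in document order.
    """
    parts = []
    remaining = offset
    for node in nodes:
        if remaining <= 0:
            break
        # Take the whole node, or only the part before the change location.
        parts.append(node[:remaining])
        remaining -= len(node)
    return "".join(parts)
```

The cost grows with the amount of text before the change, and it is paid
again on every event.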
> If its "http:" then
This is fine until we want to check for more than just http, e.g.
https or mailto or (any other scheme...). At that stage we need to
(in the worst case) sample as many letters as the longest scheme
contains, and then loop over the list of known schemes for each bit we
need to check, so it ends up testing things like ('h', 'ht', 'htt',
'http'). We also need to ensure we get 'https' without treating it as
'http'. Both problems go away if we instead work backwards along the
text from the point where the URI separator ('://') is detected.
To work backwards, we need to maintain a copy of enough text in the
document prior to the separator to ensure we always detect correctly
(even across node boundaries, but managing to ignore/reset state on
new lines).
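A minimal sketch of the backwards scan, assuming the text before the
separator has already been collected into one string. The scheme list and
the names scheme_before and find_links are made up for illustration; this
is not the code from the review request:

```python
import string

# Hypothetical list of schemes we want to recognise.
KNOWN_SCHEMES = {"http", "https", "ftp"}
# Characters allowed in a URI scheme (letters, digits, '+', '-', '.').
SCHEME_CHARS = set(string.ascii_letters + string.digits + "+-.")


def scheme_before(text, sep_index):
    """Walk backwards from the separator and return the known scheme, if any."""
    start = sep_index
    # Stops at spaces and newlines, so state never leaks across lines.
    while start > 0 and text[start - 1] in SCHEME_CHARS:
        start -= 1
    candidate = text[start:sep_index].lower()
    return candidate if candidate in KNOWN_SCHEMES else None


def find_links(text):
    """Return (start_index, scheme) for every '://' preceded by a known scheme."""
    results = []
    pos = text.find("://")
    while pos != -1:
        scheme = scheme_before(text, pos)
        if scheme is not None:
            results.append((pos - len(scheme), scheme))
        pos = text.find("://", pos + 3)
    return results
```

Because the whole run of scheme characters is read before the lookup,
'https' is never mistaken for 'http', and there is a single set lookup per
separator rather than a prefix test per known scheme.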
> set a boolean "probablylink" too true. Then the next time space is
> pressed, check inbetween those letters too see if it looks like a
> link.
Since we are using EditorUpdateEvent, checking for a space will
require a loop to re-retrieve the document text up to the latest
change, then a look through all the text up to the new location but
beyond the location we set 'probablylink' at, searching for a space
(so another loop is needed).
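To make that second loop concrete, here is a sketch under the assumption
that the document text has already been re-stitched into a single string;
link_candidate, flag_pos and change_pos are hypothetical names, not part of
the editor API:

```python
def link_candidate(text, flag_pos, change_pos):
    """Return the text from the 'probablylink' position up to the first
    whitespace before change_pos, or None if no whitespace was typed yet."""
    for i in range(flag_pos, change_pos):
        if text[i].isspace():
            return text[flag_pos:i]
    return None
```

This scan runs on top of the loop that rebuilt the text, once per flagged
position, on every event.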
> That should go at a decent speed during entry surely?
Oh, and this needs to be done within a few milliseconds, otherwise
the user will get annoyed with the lag.
By this stage, the number of loops we need to do every time we receive
an EditorUpdateEvent is such that it becomes noticeable after we have
a few different links all within a few hundred characters.
NB: The algorithm you described is pretty much what I had coded it to
try doing. So if you want, pull down my code from that review request
and feel free to try to improve its efficiency.
Network latencies could easily be ~1s, so yes, the link would be
annotated a bit more slowly, but the user would be able to continue
typing the blip at normal speed, rather than being slowed down by the
browser being unable to run the JS checks quickly enough each time a
keypress causes EditorUpdateEvent to fire (by then the lag due to the
browser slow-down is great enough that an event is fired on every
keypress).