I would suggest avoiding exceptions and continuations and instead having a
separate parameter[*] that holds the current unexpected-character handler.
You'll still have to figure out what kind of thing it returns (void, a
replacement character, or a new input port?), but you have to do that
anyway.
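
For illustration, here is a minimal sketch of that idea, assuming the handler
returns a replacement character. The names current-unexpected-char-handler
and tokenize, the (list 'error c) token shape, and the choice of #\f as the
"unexpected" character are all hypothetical; only #\f is borrowed from the
pseudo-code quoted below.

```racket
#lang racket

;; Hypothetical parameter holding the current unexpected-character handler.
;; Assumption: a handler takes the bad character and returns a replacement.
(define current-unexpected-char-handler
  (make-parameter (lambda (c) #\a)))

;; tokenize : input-port? -> (listof (or/c char? (list/c 'error char?)))
;; Treats #\f as the unexpected character, purely for illustration.
(define (tokenize in)
  (let loop ([acc '()])
    (define c (read-char in))
    (cond
      [(eof-object? c) (reverse acc)]
      [(char=? c #\f)
       ;; Ask the installed handler for a replacement, then emit an
       ;; error token followed by the replacement and keep going.
       (define fixed ((current-unexpected-char-handler) c))
       (loop (list* fixed (list 'error c) acc))]
      [else (loop (cons c acc))])))

;; Install a different handler for the dynamic extent of one call.
(define result
  (parameterize ([current-unexpected-char-handler (lambda (c) #\b)])
    (tokenize (open-input-string "xfy"))))
```

Because the handler is an ordinary procedure call, the tokenizer simply
continues after it returns; nothing has to be captured or resumed.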

The handler can raise an ordinary, non-continuable exception if it can't
repair the problem. Or it could do weird continuation stuff like capture
the part of the continuation since the handler was installed and apply it
to multiple characters to see if one works (but that would not work well
with stateful objects like input ports).
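
As a rough sketch of the first option, a handler might repair the characters
it knows about and raise an ordinary exception otherwise. The names
strict-handler and repair, and the parameter itself, are hypothetical, and
which characters count as repairable is an assumption made for the example.

```racket
#lang racket

;; Hypothetical parameter holding the current unexpected-character handler.
(define current-unexpected-char-handler
  (make-parameter (lambda (c) #\a)))

;; A handler that repairs #\f and gives up on anything else by raising
;; an ordinary exception (raise, not raise-continuable).
(define (strict-handler c)
  (if (char=? c #\f)
      #\a
      (raise (exn:fail (format "cannot repair unexpected character: ~s" c)
                       (current-continuation-marks)))))

;; repair : char? -> char?
;; Stands in for the point in a tokenizer where a bad character is found.
(define (repair c)
  ((current-unexpected-char-handler) c))

;; The repairable case: the handler returns and the caller carries on.
(define repaired
  (parameterize ([current-unexpected-char-handler strict-handler])
    (repair #\f)))

;; The unrepairable case: the exception reaches an enclosing with-handlers.
(define failure-message
  (with-handlers ([exn:fail? exn-message])
    (parameterize ([current-unexpected-char-handler strict-handler])
      (repair #\q))))
```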

Ryan

[*] or argument, if that turns out to fit your code better


On Tue, Dec 29, 2020 at 11:27 AM je...@lisp.sh <je...@lisp.sh> wrote:

> I'm working on a tokenizer that involves some level of "self-healing":
> if the input port contains an unexpected character, an error token is to be
> emitted, but computation isn't over -- we are to continue by substituting
> the bad character with a good one, emitting that, too (thereby recovering),
> and keep rolling.
>
> When the tokenizer encounters an unexpected character, what I'd like to do
> is similar to raising/throwing an exception, but what's going on isn't (in
> my understanding) quite the same as raising an exception, because I don't
> want to just have some ambient exception handler deal with the error, since
> my understanding is that exceptions don't have enough information to simply
> resume the computation. I guess I'd like to install a prompt, but when
> control goes back to the prompt, I'd like it to be possible to resume the
> computation.
>
> An old discussion here from 2015, with John Carmack ("continuing after a
> user break"), comes close to what I have in mind. It's about setting up a
> handler for breaks. I guess what I have in mind are breaks, but what's not
> clear to me is how to raise them in my code. It seems like `break-thread`
> is the only way to do that? The discussion of breaks in the docs involves a
> lot of talk about threads and user interaction. But what I'm doing isn't
> really about threads at all (I'm making a tokenizer, not a REPL or other
> interactive program), so it leaves me feeling uncertain that I want breaks
> after all. And even if breaks are a technically viable solution, I wonder
> if there are any performance penalties or other gotchas that could be
> avoided by using some other continuation forms.
>
> Here's some pseudo-Racket that gets at what I'm looking for:
>
> ; compute a list of tokens
> ; -> (listof (char? or eof))
> (define (handle-many)
>   (watch-for-break (lambda (e) (handle-break e))
>     (match (handle-one)
>       [(? eof-object?) '(eof)]
>       [(? char? c) (cons c (handle-many))])))
>
> ; -> char? or eof
> (define (handle-one)
>   (define c (peek-char in))
>   (match c
>     [(? eof-object?) eof]
>     [#\f
>      (break-with c) ; pass the unexpected character to the caller...
>      (read-char in) ; ...but after that, resume here!
>      #\a]
>     [else
>      (read-char in)
>      #\b]))
>
> Any suggestions?
>
> Thanks,
>
> Jesse
>
> --
> You received this message because you are subscribed to the Google Groups
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to racket-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/racket-users/e455208d-0ac9-41d6-ad08-dd3d08e12baan%40googlegroups.com
> <https://groups.google.com/d/msgid/racket-users/e455208d-0ac9-41d6-ad08-dd3d08e12baan%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

