Thanks for the detailed report. I made some changes.

* The exported symbols come from the EXT package. They are:

character-coding-error
character-coding-error-external-format
character-decoding-error
character-decoding-error-octets
character-encoding-error
character-encoding-error-code
stream-encoding-error
stream-decoding-error

* Two restarts are provided: USE-VALUE and CONTINUE. They can be invoked via
the ANSI functions of the same name (I think you missed that point regarding
USE-VALUE).

* Encoding errors are now signaled as well. Previously, the function had not
been plugged into the engine.

* I am not likely to provide multi-character restarts, for a simple reason:
ECL's streams are too simple and do not provide arbitrary push-back buffers
for bytes. A USE-VALUE restart that returns more than one character may
lead to unexpected problems with unread-char and other functions. I do not
mean that it is impossible, but it complicates the interface and right now
I have no clear idea how to do it.
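
For readers unfamiliar with the ANSI functions named after the restarts, here is a minimal sketch of how USE-VALUE and CONTINUE can be driven from a handler. It uses only standard Common Lisp. The condition DEMO-DECODING-ERROR and the functions DEMO-DECODE and DEMO-DECODE-WITH-POLICY are hypothetical stand-ins, not ECL's implementation.

```lisp
;; Hypothetical stand-in for a decoder error; not the EXT condition.
(define-condition demo-decoding-error (error)
  ((octets :initarg :octets :reader demo-decoding-error-octets)))

(defun demo-decode (octets)
  "Signal DEMO-DECODING-ERROR, offering USE-VALUE and CONTINUE restarts."
  (restart-case (error 'demo-decoding-error :octets octets)
    (use-value (value)
      :report "Supply a replacement character."
      value)
    (continue ()
      :report "Skip the invalid octets."
      nil)))

(defun demo-decode-with-policy (octets &optional replacement)
  "Handle the error by replacing the character or by skipping it."
  (handler-bind ((demo-decoding-error
                   (lambda (e)
                     (declare (ignore e))
                     (if replacement
                         ;; ANSI function USE-VALUE invokes the restart
                         ;; of the same name with the given value.
                         (use-value replacement)
                         ;; ANSI function CONTINUE invokes CONTINUE.
                         (continue)))))
    (demo-decode octets)))
```

With these definitions, (demo-decode-with-policy '(195 40) #\?) returns #\? and (demo-decode-with-policy '(195 40)) returns NIL.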
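
The push-back limitation above could in principle be worked around in user code. The sketch below assumes a caller that keeps its own queue of pending characters in front of the stream. MAKE-QUEUED-READER is a hypothetical helper, not something ECL provides, and it does not interact with unread-char on the underlying stream.

```lisp
;; Hypothetical helper, not part of ECL: a reader that keeps its own
;; queue of pending characters in front of an underlying stream.
(defun make-queued-reader (stream)
  "Return two closures as multiple values: READ-NEXT and PUSH-BACK."
  (let ((pending '()))
    (values
     ;; READ-NEXT: serve queued characters first, then the stream.
     (lambda ()
       (if pending
           (pop pending)
           (read-char stream)))
     ;; PUSH-BACK: queue every character of STRING ahead of the stream.
     (lambda (string)
       (setf pending (append (coerce string 'list) pending))))))

(defun demo-queued-reader ()
  "Queue \"xy\" in front of a stream holding \"bc\" and read three characters."
  (with-input-from-string (in "bc")
    (multiple-value-bind (read-next push-back) (make-queued-reader in)
      (funcall push-back "xy")
      (list (funcall read-next) (funcall read-next) (funcall read-next)))))
```

Calling (demo-queued-reader) returns (#\x #\y #\b): a multi-character replacement can be delivered one character at a time, at the cost of an extra layer above the stream.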

I attached a modified version of your code.

Best,

Juanjo

-- 
Instituto de Física Fundamental, CSIC
c/ Serrano, 113b, Madrid 28006 (Spain)
http://juanjose.garciaripoll.googlepages.com
(defun custom-read-line (stream &key (max 512) (replace-char))
  (let ((line (make-array max
                          :element-type 'character
                          :adjustable t
                          :fill-pointer 0)))
    (flet ((add-char (c)
             (declare (type character c))
             (vector-push c line))
           (finalize-line ()
             (let ((len (length line)))
               (when (and (> len 0)
                          (char= #\Return (aref line (1- len))))
                 (vector-pop line)))
             line))
      (loop
          do
           (let (
                 ;; No way to determine invalid octet values with old ECL,
                 ;; Return an unknown character code
                 (c #+old-ecl(handler-case
                             (read-char stream)
                           (simple-error ()
                             #\UFFFD))
                    ;; SBCL provides invalid octets which we can import and
                    ;; then issue an ATTEMPT-RESYNC restart to resume
                    #+sbcl(handler-bind
                              ((sb-int:stream-decoding-error
                                #'(lambda (e)
                                    ;; Treat invalid UTF-8 octets as
                                    ;; ISO-8859 characters.
                                    (mapcar #'(lambda (c)
                                                (when (> c 127)
                                                  (add-char (code-char c))))
                                            (sb-int:character-decoding-error-octets e))
                                    (invoke-restart 'sb-int:attempt-resync))))
                            (read-char stream))
                    ;; Test with new ECL
                    #+ecl(handler-bind
                             ((ext:character-decoding-error ; Internal
                               #'(lambda (e)
                                   (mapcar #'(lambda (c)
                                               (format t "~%Code: ~A" c)
                                               (when (> c 127)
                                                 ;; Never happens
                                                 (add-char (code-char c))))
                                           ;; Not advertized interface?
                                           (ext:character-decoding-error-octets e))
                                   ;; Either replace the character or ignore
                                   (if replace-char
                                       (use-value replace-char)
                                       (continue)))))
                           (read-char stream))))
             (when (char= #\Newline c)
               (return (values (finalize-line) t)))
             (add-char c))))))

(defun test (&rest args)
  (with-open-file (stream "InvalidUTF8.txt")
    (loop
       do
         (let ((line (handler-case
                         (apply #'custom-read-line stream args)
                       (end-of-file ()
                         (loop-finish)))))
           (format t "~A~%" line)))))

(test)
#+ecl
(test :replace-char #\?)
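
The test above reads a file named "InvalidUTF8.txt" whose contents are not shown. A minimal sketch for producing a comparable input follows. It assumes the file should contain a Latin-1 octet that is invalid as UTF-8. The function WRITE-INVALID-UTF8-SAMPLE and the chosen octets are illustrative assumptions, not the original test data.

```lisp
;; Hypothetical helper: writes raw octets so that no external format
;; is involved. #xE9 followed by a space is not valid UTF-8.
(defun write-invalid-utf8-sample (path)
  "Write the octets of \"caf\", a lone #xE9, \" ok\" and a newline to PATH."
  (with-open-file (out path
                       :direction :output
                       :if-exists :supersede
                       :element-type '(unsigned-byte 8))
    (write-sequence #(99 97 102 #xE9 32 111 107 10) out))
  path)
```

Calling (write-invalid-utf8-sample "InvalidUTF8.txt") before (test) gives the decoder one invalid sequence to report. How many octets each implementation consumes when it signals the error may differ.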
