[PATCH] invalid characters cause python3 unicode errors

lilydjwg Tue, 21 Jun 2011 00:13:01 -0700

Hi,

The previous python3 patch fixed Vim's python3 unicode errors for buffers
that contain only valid characters. But when there are invalid characters
in the buffer, Python may still raise exceptions, whereas Vim itself only
prints a warning in that situation. In Python 3.1 and later, invalid
characters in file names and environment variables are handled by an error
handler called 'surrogateescape'.[1] I think Vim should do the same, so
that it doesn't break the gundo plugin I'm using.
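
For readers unfamiliar with 'surrogateescape', here is a minimal sketch of
what the handler does in plain Python. The byte string is a made-up example,
not the contents of the attached file, and the variable names are only for
illustration:

```python
# Sketch: how the "surrogateescape" error handler (PEP 383) round-trips
# bytes that are not valid in the chosen encoding.
raw = b"abc\xff\xfedef"  # hypothetical line with two invalid UTF-8 bytes

# Strict decoding (the default, equivalent to passing NULL as the error
# handler in the C API) fails on the invalid bytes.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With surrogateescape, each undecodable byte becomes a lone surrogate
# in the range U+DC80..U+DCFF instead of raising an exception.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = text.encode("utf-8", errors="surrogateescape")
```

Because the round trip is lossless, a line read from the buffer can be
handed back to Vim unchanged even if it contains invalid bytes.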


To see the errors, open the attached 'test' file in UTF-8 encoding
(Vim should say there are invalid characters) and try executing:

  py3 import vim
  py3 vim.current.buffer.append(vim.current.buffer[0])

With the patch applied, there will be no errors and the buffer will
contain two identical lines.

However, when outputting invalid characters, unicode errors are still
raised. This is the same behaviour as the interpreter's.
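
A small sketch of that remaining case, again in plain Python. The strict
encode below stands in for writing to an output stream that uses strict
UTF-8; that stand-in is an assumption for illustration, not code from the
patch:

```python
# Sketch: a string carrying an escaped invalid byte cannot be encoded
# with the strict handler, which is why output still raises an error.
escaped = b"\xff".decode("utf-8", errors="surrogateescape")  # lone surrogate

try:
    escaped.encode("utf-8")  # strict is the default error handler
    output_failed = False
except UnicodeEncodeError:
    output_failed = True
```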

[1]: http://www.python.org/dev/peps/pep-0383/

-- 
Best regards,
lilydjwg

-- 
You received this message from the "vim_dev" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php


diff --git a/src/if_python3.c b/src/if_python3.c
index 5f02a46..e9b5ec7 100644
--- a/src/if_python3.c
+++ b/src/if_python3.c
@@ -68,9 +68,16 @@
 
 static void init_structs(void);
 
+/* The "surrogateescape" error handler is new in Python 3.1 */
+#if PY_VERSION_HEX >= 0x030100f0
+# define CODEC_ERROR_HANDLER "surrogateescape"
+#else
+# define CODEC_ERROR_HANDLER NULL
+#endif
+
 #define PyInt Py_ssize_t
 #define PyString_Check(obj) PyUnicode_Check(obj)
-#define PyString_AsBytes(obj) PyUnicode_AsEncodedString(obj, (char *)p_enc, NULL);
+#define PyString_AsBytes(obj) PyUnicode_AsEncodedString(obj, (char *)p_enc, CODEC_ERROR_HANDLER);
 #define PyString_FreeBytes(obj) Py_XDECREF(bytes)
 #define PyString_AsString(obj) PyBytes_AsString(obj)
 #define PyString_Size(obj) PyBytes_GET_SIZE(bytes)
@@ -661,8 +668,8 @@ DoPy3Command(exarg_T *eap, const char *cmd)
 
     /* PyRun_SimpleString expects a UTF-8 string. Wrong encoding may cause
      * SyntaxError (unicode error). */
-    cmdstr = PyUnicode_Decode(cmd, strlen(cmd), (char *)p_enc, NULL);
-    cmdbytes = PyUnicode_AsEncodedString(cmdstr, "utf-8", NULL);
+    cmdstr = PyUnicode_Decode(cmd, strlen(cmd), (char *)p_enc, CODEC_ERROR_HANDLER);
+    cmdbytes = PyUnicode_AsEncodedString(cmdstr, "utf-8", CODEC_ERROR_HANDLER);
     Py_XDECREF(cmdstr);
     PyRun_SimpleString(PyBytes_AsString(cmdbytes));
     Py_XDECREF(cmdbytes);
@@ -1463,7 +1470,7 @@ LineToString(const char *str)
     }
     *p = '\0';
 
-    result = PyUnicode_Decode(tmp, len, (char *)p_enc, NULL);
+    result = PyUnicode_Decode(tmp, len, (char *)p_enc, CODEC_ERROR_HANDLER);
 
     vim_free(tmp);
     return result;
