Hi Gustaf,

I spotted, from looking at the source code, that *ns_getform* takes a
charset argument.
The options for overriding charsets at the moment seem to be:


*ns_getform iso8859-1*

*ns_urlcharset iso8859-1*
*ns_getform*

*ns_conn urlencoding iso8859-1*
*ns_getform*

We experimented with some code which trapped errors from *ns_getform* and,
where the error was due to "invalid UTF-8", retried with a fallback charset.
All 3 of the above techniques worked OK when the Content-Type header left
the charset *unspecified*.

The main issues we had were:

1. When *charset=utf-8* is present in the *Content-Type* header, it
overrides ([1]) any encoding we pass using the 3 techniques above.
In those cases we have to manipulate the headers' ns_set to remove or
change the charset.
e.g.
*Content-Type: application/x-www-form-urlencoded; charset=utf-8*
transform to ->
*Content-Type: application/x-www-form-urlencoded*
or
*Content-Type: application/x-www-form-urlencoded; charset=windows-1252*

2. Trapping the specific "invalid UTF-8" error: matching on the message
text seems fragile; it would be nice if there were an *errorCode* we could
trap.

    try {
        ns_getform
    } on error {msg options} {
        if { [string match "*contains invalid UTF-8" $msg] } {
            # change Content-Type charset (if present)
            # try fallback charset
        } else {
            # rethrow error
        }
    }
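
For comparison, here is a rough sketch of how the try block above could look if a dedicated errorCode were available. This is plain Tcl 8.6 only: *fake_getform* is a stand-in for *ns_getform*, and the code {NS INVALID_UTF8} is purely hypothetical, not something NaviServer sets as far as we can tell.

```tcl
# Stand-in for ns_getform so the sketch runs outside NaviServer.
# The errorCode {NS INVALID_UTF8} is hypothetical.
proc fake_getform {} {
    throw {NS INVALID_UTF8} "form data contains invalid UTF-8"
}

# Hypothetical wrapper: trap only the invalid-UTF-8 case by errorCode.
proc getform_with_fallback {} {
    try {
        return [fake_getform]
    } trap {NS INVALID_UTF8} {msg options} {
        # change Content-Type charset (if present)
        # try fallback charset
        return "fallback"
    }
    # Any other error propagates unchanged; no explicit rethrow is needed.
}
```

With *trap*, the match is on the errorCode prefix rather than on the message text, so rewording the message would not break the handler.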

But I think this presents us with a way forward in cases where client apps
are not getting the encoding correct.
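
The header rewrite from issue 1 could be sketched roughly as below. This is plain Tcl operating on the header value only; the proc name *content_type_set_charset* is made up for illustration and is not part of NaviServer.

```tcl
# Hypothetical helper (not a NaviServer command): rewrite the charset
# parameter of a Content-Type header value.
#   charset == ""  -> remove an existing charset parameter
#   otherwise      -> replace an existing charset parameter with $charset
# A value without a charset parameter is returned unchanged.
proc content_type_set_charset {value {charset ""}} {
    # Matches "; charset=<token>" or "; charset=\"<quoted>\"".
    set re {\s*;\s*charset\s*=\s*("[^"]*"|[^;\s]*)}
    if {$charset eq ""} {
        regsub -nocase -- $re $value "" value
    } else {
        regsub -nocase -- $re $value "; charset=$charset" value
    }
    return $value
}

set ct "application/x-www-form-urlencoded; charset=utf-8"
set stripped [content_type_set_charset $ct]
set replaced [content_type_set_charset $ct windows-1252]
```

The result would then be written back into the request headers' ns_set before calling *ns_getform* again, as described in issue 1.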

[1]
https://bitbucket.org/naviserver/naviserver/annotate/master/nsd/form.c?at=master#form.c-170
