https://bz.apache.org/bugzilla/show_bug.cgi?id=64443

            Bug ID: 64443
           Summary: POSTing form data through proxy_html with different
                    frontend / backend charsets
           Product: Apache httpd-2
           Version: 2.4.43
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: mod_proxy_html
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: ---

Per design and by default, proxy_html will translate HTML content into UTF-8
regardless of the backend charset. This is fine, since UTF-8 has wide browser
support as far as I know.

In that scenario, the browser will encode POSTed form data in UTF-8, but that
may not match the backend charset when proxy_html re-submits the form content
upstream. E.g.:

GET:  Client <--(UTF-8)--- proxy_html <--(ISO-8859-1)--- Backend
POST: Client ---(UTF-8)--> proxy_html ---(UTF-8)-------> Backend
                                     (encoding mismatch!)
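
The mismatch in the diagram can be reproduced outside Apache. The following is only an illustrative Python sketch, not proxy_html code; the sample value "café" and the variable names are assumptions made for the example. It percent-encodes a form value as a browser would for a UTF-8 page, then decodes it as a backend that still assumes ISO-8859-1 would.

```python
from urllib.parse import quote, unquote

# Hypothetical form field value containing a non-ASCII character.
value = "café"

# Browser side: the page arrived as UTF-8, so the value is
# percent-encoded from its UTF-8 bytes.
posted = quote(value, encoding="utf-8")

# Backend side: it decodes the same bytes with its own charset.
seen_by_backend = unquote(posted, encoding="iso-8859-1")

print(posted)           # caf%C3%A9
print(seen_by_backend)  # cafÃ©
```

The two bytes that UTF-8 uses for "é" are read by the backend as two separate ISO-8859-1 characters, which is the corruption described above.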

A simple workaround is to specify the backend charset by adding an
accept-charset attribute to HTML <form> tags. That attribute isn't usually
needed, as form encoding usually matches that of the HTML document, so I
guess it's rarely used. When moving a site from direct to proxied publishing,
that means the whole site would need to be checked and recoded to add that
accept-charset attribute to every <form>.
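
To give an idea of what that recoding involves, here is a minimal Python sketch that adds the attribute to every <form> tag that lacks one. The function name add_accept_charset and the default charset are assumptions for illustration. It uses a regular expression rather than a real HTML parser, so it would mishandle a ">" inside a quoted attribute value.

```python
import re

# Matches an opening <form ...> tag (case-insensitive), but not e.g. <format>.
FORM_TAG = re.compile(r"<form\b[^>]*>", re.IGNORECASE)
HAS_ACCEPT_CHARSET = re.compile(r"\baccept-charset\s*=", re.IGNORECASE)


def add_accept_charset(html: str, charset: str = "ISO-8859-1") -> str:
    """Add accept-charset to <form> tags that do not already have it."""

    def patch(match: re.Match) -> str:
        tag = match.group(0)
        if HAS_ACCEPT_CHARSET.search(tag):
            return tag  # leave an explicit choice untouched
        # Drop the closing ">" and append the attribute before it.
        return f'{tag[:-1]} accept-charset="{charset}">'

    return FORM_TAG.sub(patch, html)


page = '<form action="/save" method="post"><input name="q"></form>'
print(add_accept_charset(page))
```

Forms that already declare accept-charset are left as they are, so the script can be run repeatedly over the same files.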

As proxy_html deals automatically with different frontend / backend charsets in
downstream content, maybe it would be expected to do the same with upstream
POSTed form data. Maybe a "stateful" approach to it (i.e. proxy_html keeping
track of every form translated downstream that should be reverse-translated
when posted upstream) isn't convenient or even feasible. In my very humble
opinion (with no knowledge of the internals of it) maybe a simpler solution
could be having that accept-charset attribute added automatically by proxy_html
when translating HTML forms. As per the docs, proxy_html's mission is just to
"rewrite HTML links in a proxy situation", but maybe it could be more widely
scoped to make HTML content coherent across an Apache HTTP proxy.
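
For comparison, the reverse translation of POSTed data mentioned above amounts to something like the following Python sketch. It is purely illustrative and says nothing about how proxy_html works internally; the function name transcode_form_body and the sample body are assumptions. It only handles application/x-www-form-urlencoded bodies, and raises UnicodeEncodeError for characters the backend charset cannot represent.

```python
from urllib.parse import parse_qsl, urlencode


def transcode_form_body(body: str,
                        frontend: str = "utf-8",
                        backend: str = "iso-8859-1") -> str:
    """Re-encode a urlencoded form body from the frontend to the backend charset."""
    # Decode the percent-escapes using the charset the browser used.
    pairs = parse_qsl(body, keep_blank_values=True, encoding=frontend)
    # Re-encode them using the charset the backend expects.
    return urlencode(pairs, encoding=backend)


print(transcode_form_body("name=caf%C3%A9&city=M%C3%A1laga"))
# name=caf%E9&city=M%E1laga
```

Having proxy_html add accept-charset avoids this step altogether, since the browser would then send the data in the backend charset itself.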

Thank you in advance. Best regards,

Antonio
