OK, I think I finally have this figured out and working.
Here is what I did. After the file upload, I opened the uploaded file,
decoded it and re-encoded it, using my own auto-detect routine rather
than gluon's decoder routine. This worked!
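
For anyone wanting to try the same approach, here is a rough sketch of the decode and re-encode step. This is not my exact routine and not gluon's code: the names detect_encoding and reencode_as_utf8, the fallback list, and the choice of UTF-8 as the target encoding are all assumptions made for illustration.

```python
import codecs

# Hypothetical BOM table. The 4-byte BOMs come first so that UTF-32 LE
# (FF FE 00 00) is not mistaken for UTF-16 LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf_32_be"),
    (codecs.BOM_UTF32_LE, "utf_32_le"),
    (codecs.BOM_UTF8, "utf_8"),
    (codecs.BOM_UTF16_BE, "utf_16_be"),
    (codecs.BOM_UTF16_LE, "utf_16_le"),
]


def detect_encoding(data, fallbacks=("utf_8", "cp1252")):
    """Return (encoding_name, bom_length) for a bytes object.

    Assumption: a BOM wins if present; otherwise the first fallback
    that decodes without error is used.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    for name in fallbacks:
        try:
            data.decode(name)
            return name, 0
        except UnicodeDecodeError:
            continue
    # latin_1 maps every byte to a character, so decoding cannot fail.
    return "latin_1", 0


def reencode_as_utf8(data):
    """Decode with the detected encoding, drop any BOM, re-encode as UTF-8."""
    encoding, skip = detect_encoding(data)
    return data[skip:].decode(encoding).encode("utf_8")
```

In an upload handler this would be applied to the raw bytes read from the uploaded file before anything else touches them. The fallback guess is only a heuristic, so a file in some other 8-bit encoding can still be decoded wrongly without raising an error.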
I'm curious why gluon's autodetector doesn't work. On first
inspection, I saw this:
autodetect_dict={ # bytepattern : ("name",
    (0x00, 0x00, 0xFE, 0xFF) : ("ucs4_be"),
    (0xFF, 0xFE, 0x00, 0x00) : ("ucs4_le"),
    (0xFE, 0xFF, None, None) : ("utf_16_be"),
    (0xFF, 0xFE, None, None) : ("utf_16_le"),
    (0x00, 0x3C, 0x00, 0x3F) : ("utf_16_be"),
    (0x3C, 0x00, 0x3F, 0x00) : ("utf_16_le"),
    (0x3C, 0x3F, 0x78, 0x6D) : ("utf_8"),
    (0x4C, 0x6F, 0xA7, 0x94) : ("EBCDIC")
}
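It looks as if utf_16_be and utf_16_le each appear twice, though under
different byte patterns: once for the BOM (FE FF / FF FE) and once for
what looks like "<?" encoded in UTF-16 without a BOM. Is that intended?
Either way, the table above shouldn't be a problem in and of itself.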
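
To check how the repeated names behave, here is a small sketch of one way a table like this could be consulted. It is not gluon's actual lookup code: match_prefix is a hypothetical helper, and treating None as "any byte" is my assumption about what the None entries mean. I use an ordered list instead of a dict because, with wildcards, FF FE 00 00 matches both the ucs4_le and the utf_16_le patterns, so the order of the checks matters.

```python
# Patterns copied from the table quoted above, kept in the same order.
# Assumption: None means "any byte".
PATTERNS = [
    ((0x00, 0x00, 0xFE, 0xFF), "ucs4_be"),
    ((0xFF, 0xFE, 0x00, 0x00), "ucs4_le"),
    ((0xFE, 0xFF, None, None), "utf_16_be"),
    ((0xFF, 0xFE, None, None), "utf_16_le"),
    ((0x00, 0x3C, 0x00, 0x3F), "utf_16_be"),
    ((0x3C, 0x00, 0x3F, 0x00), "utf_16_le"),
    ((0x3C, 0x3F, 0x78, 0x6D), "utf_8"),
    ((0x4C, 0x6F, 0xA7, 0x94), "EBCDIC"),
]


def match_prefix(data):
    """Return the name of the first pattern matching the first 4 bytes, or None."""
    head = tuple(data[:4])  # iterating bytes yields ints in Python 3
    if len(head) < 4:
        return None
    for pattern, name in PATTERNS:
        if all(p is None or p == b for p, b in zip(pattern, head)):
            return name
    return None


# The two utf_16_be entries are reached by different inputs:
print(match_prefix(b"\xfe\xff\x00<"))   # starts with a big-endian BOM
print(match_prefix(b"\x00<\x00?"))      # "<?" in UTF-16 BE, no BOM
```

Both calls report utf_16_be, each through a different key, so the repeated names do not collide.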
Anyway, I would be happy to contribute my autodetector to the web2py
codebase if it helps.