Hi David,

Feel free to send me a full example offlist of what you're attempting to do here and I'll see if I can help.

Andy

On 03/04/2012 16:00, David G Ortega wrote:
Hi Andy, thanks for the reply,

Good luck with that!! It seems really nasty to me, first of all
because I think you have to guess the charset of the response just to
override it... At the moment I'm using ICU, and in OpenBD all strings
are detected as ISO-8859-1. It does not matter whether I set the
encoding to UTF-8 or use the cfprocessingdirective tag, like this:

<cfscript>
  SetEncoding("form", "utf-8");
  SetEncoding("url", "utf-8");
</cfscript>
<cfprocessingdirective pageencoding="utf-8">

here is the code:

private function icuMatch(text)
{
    var icuDetector = createObject('java', 'com.ibm.icu.text.CharsetDetector').init();
    icuDetector.setText(createObject('java', 'java.io.StringBufferInputStream').init(text));

    return icuDetector.detect();
}


That's why, in my example, when I'm passing the ad to Elasticsearch the
encoding shows "encoding":"ISO-8859-1","language":"es","tokens":
when in reality it is being posted as UTF-8.

At least with Spanish, German, Italian and so on it is working, but with
languages like Russian or Chinese it is creating a complete mess,
producing what look like double-encoded strings???

I'm very new to CF and OpenBD, but in Railo everything was always
converted into \u characters...
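
One possible explanation for the symptoms above, offered as a guess rather
than a confirmed diagnosis: java.io.StringBufferInputStream is deprecated
because it keeps only the low 8 bits of each character. For Western
European text that byte stream is identical to ISO-8859-1, which would match
what the detector reports, while Cyrillic or Chinese characters are reduced
to unrelated bytes. The plain-Java sketch below only illustrates that
behaviour; the class and method names are hypothetical and it does not call
ICU or OpenBD at all.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringBufferInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of ICU or OpenBD.
public class StreamTruncationDemo {

    // Collects the bytes the deprecated StringBufferInputStream yields:
    // one byte per char, keeping only the low 8 bits of each char.
    @SuppressWarnings("deprecation")
    static byte[] viaStringBufferInputStream(String text) {
        StringBufferInputStream in = new StringBufferInputStream(text);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            out.write(b);
        }
        return out.toByteArray();
    }

    // Encodes the text explicitly as UTF-8, so no character data is lost.
    static byte[] viaUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String spanish = "a\u00f1o";            // "ano" with n-tilde
        String russian = "\u043c\u0438\u0440";  // Russian word for "world"

        // Spanish: the truncated stream is exactly the ISO-8859-1 encoding.
        System.out.println(Arrays.toString(viaStringBufferInputStream(spanish)));
        System.out.println(Arrays.toString(spanish.getBytes(StandardCharsets.ISO_8859_1)));

        // Russian: the high byte of every char is dropped, the text is lost.
        System.out.println(Arrays.toString(viaStringBufferInputStream(russian)));
        System.out.println(Arrays.toString(viaUtf8(russian)));
    }
}
```

If this is indeed the cause, handing the detector explicit bytes, for
example through CharsetDetector.setText(byte[]) with text.getBytes("UTF-8"),
might behave differently, although a detector run on a string that is
already decoded can only report the encoding you just chose.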



--
online documentation: http://openbd.org/manual/
  google+ hints/tips: https://plus.google.com/115990347459711259462
    http://groups.google.com/group/openbd?hl=en
