On 07/07/2006 15:04, Stipe Tolj wrote:
Thanos Chatziathanassiou wrote:

Stipe Tolj wrote:

Thanos Chatziathanassiou wrote:

<patch snipped>


This patch "works" if you punch in the Greek values of the GSM 03.38 alphabet in there. But remember(!), they are not part of ISO-8859-1 (latin1), they are part of ISO-8859-7 (the Greek set). So this patch is a kludge.

Yes, it is...but then again so is GSM 03.38 :)

We obviously "lose" characters (those marked with '?') when mapping from the GSM default alphabet to latin1. Should we use UTF-8 (Unicode) instead?!
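To make the loss concrete, here is a small sketch in Python (used for illustration only, this is not Kannel code). It assumes the Greek capitals of the GSM default alphabet are already available as Unicode text, and shows that latin1 has no slot for them while UTF-8 keeps them intact.

```python
# Greek capitals that exist in the GSM 03.38 default alphabet.
greek = "ΔΦΓΛΩΠΨΣΘΞ"

# latin1 has no code points for these, so a lossy encode substitutes '?'.
as_latin1 = greek.encode("latin-1", errors="replace")

# UTF-8 can represent every one of them (two bytes each).
as_utf8 = greek.encode("utf-8")

print(as_latin1)                         # only question marks survive
print(as_utf8.decode("utf-8") == greek)  # round trip is lossless
```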

We could, but we'd be giving up on 160-character SMS. Besides, some (admittedly old) devices don't handle UTF-8 all too well...
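The length concern can be worked out with a little arithmetic. The sketch below assumes the usual 140-octet SMS payload and that Unicode text travels over the air as UCS-2 (two octets per character) rather than as UTF-8; the constant and variable names are made up for the example.

```python
PAYLOAD_OCTETS = 140  # user data size of a single, unconcatenated SMS

# The GSM default alphabet packs 7-bit septets into the payload.
gsm7_chars = PAYLOAD_OCTETS * 8 // 7

# UCS-2 spends two octets on every character.
ucs2_chars = PAYLOAD_OCTETS // 2

print(gsm7_chars, ucs2_chars)
```

So switching a message to Unicode cuts the per-message capacity from 160 to 70 characters.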

If I understand correctly, your problem is that this isn't ``gsm_to_latin1'' any more. Maybe we could translate to/from ISO-8859-7 for these chars, i.e. 0x10 would become 0xC4, which is Greek capital delta in ISO-8859-7. Thing is, my end application (and probably Kyriakos' too) was already prepared to handle this, so I simply didn't bother. If we're talking about integrating it properly into Kannel, I think that's the best way to go. Maybe even have it as a configuration option. That would only leave a problem with the Euro sign (double-byte 0x1B 0x65).
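A rough sketch of that translation, in Python for illustration only (it is not the patch and not Kannel code). The helper name `gsm_greek_to_iso8859_7` is hypothetical. It remaps only the ten Greek capitals of the GSM default alphabet and deliberately ignores everything else, including the 0x1B 0x65 Euro escape, so it is not a complete charset conversion.

```python
# GSM 03.38 default-alphabet code -> the Greek capital it stands for.
GSM_GREEK = {
    0x10: "Δ", 0x12: "Φ", 0x13: "Γ", 0x14: "Λ", 0x15: "Ω",
    0x16: "Π", 0x17: "Ψ", 0x18: "Σ", 0x19: "Θ", 0x1A: "Ξ",
}

# Let Python's ISO-8859-7 codec supply the target byte for each letter.
GSM_TO_ISO8859_7 = {
    code: ch.encode("iso8859_7")[0] for code, ch in GSM_GREEK.items()
}

def gsm_greek_to_iso8859_7(data: bytes) -> bytes:
    """Hypothetical helper: remap the Greek codes, pass other bytes through."""
    return bytes(GSM_TO_ISO8859_7.get(b, b) for b in data)

print(hex(GSM_TO_ISO8859_7[0x10]))  # GSM delta -> ISO-8859-7 delta
```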

Thoughts ?

[I'll move this thread to devel@ list, since it's more appropriate there]

Yep, correct. This "should" be configurable. Actually, we already do the gsm_to_latin1() mapping _inside_ the smsc modules, which seems "wrong" to me. We should exit the smsc-specific layer with the GSM default charset, and allow the user to change the exit charset encoding in the abstraction layer. That would mean this has to be done in only one global place for all smsc modules.
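The "one global place" idea could look roughly like the sketch below. It is written in Python purely for illustration; `export_message`, `gsm_to_unicode` and the `exit_charset` parameter are hypothetical names, not existing Kannel functions or configuration options, and the GSM table is a small subset rather than the full 03.38 alphabet.

```python
# Illustrative subset of GSM 03.38: a few codes that differ from ASCII.
GSM_SPECIALS = {0x00: "@", 0x02: "$", 0x10: "Δ", 0x11: "_", 0x18: "Σ"}

def gsm_to_unicode(data: bytes) -> str:
    """Decode the subset; anything not covered becomes U+FFFD."""
    out = []
    for b in data:
        if b in GSM_SPECIALS:
            out.append(GSM_SPECIALS[b])
        elif b == 0x20 or (b < 0x80 and chr(b).isalnum()):
            out.append(chr(b))  # space, digits and A-Z/a-z match ASCII
        else:
            out.append("\ufffd")  # not covered by this sketch
    return "".join(out)

def export_message(data: bytes, exit_charset: str = "utf-8") -> bytes:
    """Hypothetical single conversion point shared by all smsc modules."""
    return gsm_to_unicode(data).encode(exit_charset, errors="replace")

print(export_message(b"\x10Hi", "iso8859_7"))
print(export_message(b"\x10Hi", "latin-1"))
```

With this shape the smsc modules hand over raw GSM bytes, and only the exit charset decides whether the delta survives or degrades to a question mark.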

Alex, your thoughts on this?
Others?

Stipe


This was my expected behavior: I expected Kannel to pass me the characters as they came (GSM default alphabet, UTF-8, etc.) and my application to then convert them to whatever it requires. In essence, that is what Thanos' patch does. My application already had the code in place to do this, and it confused the hell out of me when I only got question marks.

This seems the most sensible approach. The only other one I can see working globally is converting all incoming messages (from smsc to Kannel) to UTF-8 by default before handing them to an external application/module. To me it would seem simpler to leave the recoding of the GSM alphabet to each user's external application rather than doing it in Kannel, but I guess a configurable Kannel module for it would be handy for those folks who do their keyword matching entirely inside Kannel and would like to match non-latin1 characters as well.


--

Kyriacos Sakkas
Development Team
Netsmart
Tel: + 357 22 452565
Fax: + 357 22 452566
Email: [EMAIL PROTECTED]
http://www.netsmart.com.cy
