On 07/07/2006 15:04, Stipe Tolj wrote:
Thanos Chatziathanassiou wrote:
Stipe Tolj wrote:
Thanos Chatziathanassiou wrote:
<patch snipped>
this patch "works" if you punch in the Greek values of the GSM 03.38
alphabet in there. But remember(!), they are not part of ISO-8859-1
(latin1), they are part of ISO-8859-7 (Latin/Greek). So this patch is
a kludge.
Yes, it is...but then again so is GSM 03.38 :)
We obviously "lose" characters (those marked with '?') when mapping
from the GSM default alphabet to latin1. Should we use UTF-8 (Unicode)
instead?
We could, but we'd be giving up on 160-character SMS. Besides, some
(admittedly old) devices don't handle UTF-8 all that well...
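
To put numbers on the 160-character point: a single SMS carries 140
octets of user data. The sketch below assumes Unicode text goes over the
air as UCS-2 (16 bits per character); reading "UTF-8" above that way is
an assumption, and max_chars is a made-up helper name.

```python
# Capacity of one SMS user-data field, which holds 140 octets.
PAYLOAD_OCTETS = 140


def max_chars(bits_per_char: int) -> int:
    """Number of fixed-width characters that fit into a single SMS payload."""
    return (PAYLOAD_OCTETS * 8) // bits_per_char


gsm7_capacity = max_chars(7)    # GSM 03.38 default alphabet, 7-bit septets
ucs2_capacity = max_chars(16)   # UCS-2, 16 bits per character

print(gsm7_capacity, ucs2_capacity)
```

So switching a message to a 16-bit encoding cuts it from 160 to 70
characters per SMS.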
If I understand correctly, your problem is that this isn't
``gsm_to_latin1'' any more. Maybe we could translate to/from
ISO-8859-7 for these chars, i.e. GSM 0x10 would become 0xC4, which is
Greek capital delta in ISO-8859-7.
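
A minimal sketch of that translation, limited to the Greek capitals of
the GSM 03.38 default alphabet. The helper name is hypothetical and this
is not Kannel code; it leans on Python's iso8859_7 codec rather than a
hand-written byte table.

```python
# GSM 03.38 default-alphabet septets that hold Greek capital letters.
# (0x11 is the underscore, so it is not listed here.)
GSM_GREEK = {
    0x10: "\u0394",  # DELTA
    0x12: "\u03a6",  # PHI
    0x13: "\u0393",  # GAMMA
    0x14: "\u039b",  # LAMDA
    0x15: "\u03a9",  # OMEGA
    0x16: "\u03a0",  # PI
    0x17: "\u03a8",  # PSI
    0x18: "\u03a3",  # SIGMA
    0x19: "\u0398",  # THETA
    0x1A: "\u039e",  # XI
}


def gsm_greek_to_iso8859_7(septet: int) -> int:
    """Return the ISO-8859-7 byte for a Greek GSM septet (hypothetical helper)."""
    return GSM_GREEK[septet].encode("iso8859_7")[0]


for septet in sorted(GSM_GREEK):
    byte = gsm_greek_to_iso8859_7(septet)
    print(f"GSM 0x{septet:02X} -> ISO-8859-7 0x{byte:02X}")
```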
Thing is, my end application (and probably Kyriacos' too) was already
prepared to handle this, so I simply didn't bother. If we're talking
about integrating it properly into Kannel, I think that's the best way
to go. Maybe even have it as a configuration option.
That would only leave a problem with the Euro sign (double-byte 0x1B
0x65).
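
To illustrate why the Euro sign is the odd one out: it lives in the GSM
03.38 extension table, so it takes an escape septet plus a second septet,
and it has no code point in ISO-8859-1. The sketch below only knows about
the Euro; decode_extension and fits are hypothetical helper names.

```python
ESCAPE = 0x1B  # switches to the GSM 03.38 extension table for the next septet
GSM_EXTENSION = {0x65: "\u20ac"}  # only the Euro sign is listed in this sketch


def decode_extension(septets):
    """Decode escape pairs; every other septet becomes U+FFFD in this sketch."""
    out = []
    it = iter(septets)
    for s in it:
        if s == ESCAPE:
            # The septet after the escape selects the extension character.
            out.append(GSM_EXTENSION.get(next(it, None), "\ufffd"))
        else:
            out.append("\ufffd")  # default-alphabet decoding is out of scope
    return "".join(out)


def fits(text, charset):
    """True if every character of text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


euro = decode_extension([0x1B, 0x65])
print(fits(euro, "latin-1"), fits(euro, "utf-8"))
```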
Thoughts?
[I'll move this thread to devel@ list, since it's more appropriate there]
yep, correct. This "should" be configurable. Actually we already do
the gsm_to_latin1() mapping _inside_ the smsc modules, which seems
"wrong" to me. We should exit the smsc-specific layer with the GSM
default charset, and allow the user to change the exit charset encoding
in the abstraction layer. That would mean the conversion is done in
only one global place for all smsc modules.
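
A rough sketch of what one global conversion point could look like. The
function name and the exit_charset parameter are invented for
illustration and do not correspond to an existing Kannel option; the
table is deliberately partial (letters, digits, space and the Greek
capitals).

```python
import string

# Partial GSM 03.38 -> Unicode table: letters, digits and space share their
# ASCII values in the default alphabet; the Greek capitals sit at 0x10-0x1A.
GSM_TO_UNICODE = {ord(c): c for c in string.ascii_letters + string.digits + " "}
GSM_TO_UNICODE.update({
    0x10: "\u0394", 0x12: "\u03a6", 0x13: "\u0393", 0x14: "\u039b",
    0x15: "\u03a9", 0x16: "\u03a0", 0x17: "\u03a8", 0x18: "\u03a3",
    0x19: "\u0398", 0x1A: "\u039e",
})


def gsm_to_charset(septets, exit_charset="utf-8"):
    """Convert GSM septets to bytes in the configured exit charset.

    Septets missing from the partial table, and characters the target
    charset cannot represent, both come out as '?'.
    """
    text = "".join(GSM_TO_UNICODE.get(s, "?") for s in septets)
    return text.encode(exit_charset, errors="replace")


msg = [0x10, 0x20, 0x4F, 0x4B]  # capital delta, space, "O", "K"
for charset in ("latin-1", "iso8859_7", "utf-8"):
    print(charset, gsm_to_charset(msg, charset))
```

With latin-1 the delta degrades to a question mark, which matches the
'?' behaviour described earlier in the thread; the other two charsets
keep it.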
Alex, your thoughts on this?
Others?
Stipe
This was my expected behavior: I expected Kannel to pass me the
characters as they came (GSM default alphabet / UTF-8 etc.) and my
application to convert them to whatever it requires. In essence that is
what Thanos' patch does. My application already had the code to do this
in place, and it confused the hell out of me when I only got question
marks.
This seems the most sensible approach. The only other one I can see
working globally is to convert all incoming (smsc to Kannel) messages
to UTF-8 by default before handing them to an external
application/module.
To me it would seem simpler to leave the recoding of the GSM alphabet
to each user's external application rather than to Kannel, but I guess
a configurable Kannel module to do it would be handy for those folks
who do their keyword matching entirely inside Kannel and would like to
match non-latin1 characters as well.
--
Kyriacos Sakkas
Development Team
Netsmart