Re: [twsocket] Found a bug and made a fix in function UrlDecode

2010-08-09 Thread Bjørnar Nielsen
Arno,

> ... try the following code and let me know how it works for you,

The code works for me. But shouldn't the first overload directive be inside
the conditional define?

Regards Bjørnar
--
To unsubscribe or change your settings for TWSocket mailing list
please goto http://lists.elists.org/cgi-bin/mailman/listinfo/twsocket
Visit our website at http://www.overbyte.be

Re: [twsocket] Found a bug and made a fix in function UrlDecode

2010-08-09 Thread Arno Garrels
Bjørnar Nielsen wrote:
> Arno,
>
>> ... try the following code and let me know how it works for you,
>
> The code works for me. But should not the first overload directive
> be inside the conditional define?

Yes, it's better inside; corrected. I just updated the SVN repository.

-- 
Arno Garrels


Re: [twsocket] Found a bug and made a fix in function UrlDecode

2010-08-06 Thread Bjørnar Nielsen
Arno,

> Would you expect a correct result as well if you base64-decoded a
> quoted-printable encoded string?

No, I agree. But the problem is not the decoding itself; it is the way
Unicode chars are converted to Ansi chars.
This is the line where the problem lies:
MyAnsichar := AnsiChar(UnicodeUrl[I]);
If UnicodeUrl is changed to an AnsiString, the problem disappears.
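
The failure described here can be illustrated without ICS. The sketch below is only an illustration and rests on two assumptions that are not stated in the thread: the system ANSI code page is Windows-1252, so the UTF-8 bytes C3 85 of "Å" widen to U+00C3 and U+2026 when an AnsiString is implicitly cast to UnicodeString, and the hard cast to AnsiChar keeps only the low byte of each Char. The helper name LowByteOf is hypothetical.

```pascal
program TruncationSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: models AnsiChar(WideChar) as "keep the low byte".
// This is an assumption about the cast, not a quote from the ICS source.
function LowByteOf(W: WideChar): Byte;
begin
  Result := Ord(W) and $FF;
end;

const
  // Assumed result of widening the UTF-8 bytes C3 85 via Windows-1252.
  Widened: array[0..1] of WideChar = (#$00C3, #$2026);
var
  I: Integer;
begin
  // Print each widened code unit next to the byte that survives the cast.
  for I := Low(Widened) to High(Widened) do
    Writeln('U+', IntToHex(Ord(Widened[I]), 4), ' -> $',
      IntToHex(LowByteOf(Widened[I]), 2));
end.
```

Under these assumptions the first byte survives as $C3, but the second comes out as $26 ("&") instead of $85, so the original UTF-8 sequence can no longer be recovered.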

> Can't you use your own custom function then?

Yes, I can, but I think other users could benefit from my proposed change. I
think this problem was introduced with the port to C++ Builder 2010 and the
switch to UnicodeString. It was not there before, and other users may have
it now without knowing it.

Why not make the change I proposed, when all it does is restore the function's
old behavior from when only AnsiString was used?

Regards Bjørnar

Re: [twsocket] Found a bug and made a fix in function UrlDecode

2010-08-06 Thread Arno Garrels
Bjørnar Nielsen wrote:
> Arno,
>
>> Would you expect a correct result as well if you base64-decoded a
>> quoted-printable encoded string?
>
> No, I agree. But the problem is not the decoding itself but the way
> Unicode-chars and Ansi-chars are treated.
> This is the line where the problem lies:
> MyAnsichar := AnsiChar(UnicodeUrl[I]);

Yes, it expects a Char containing a 7-bit printable ASCII character.

> If UnicodeUrl is switched to AnsiString, the problem disappears.

This would introduce plenty of implicit string casts in existing
Delphi code, because in Delphi an ICS URL is of type string;
only the mapping of string changed in D2009+, from AnsiString
to UnicodeString. This is different in C++ Builder, where the generated
.hpp files export the mapped types (AnsiString and UnicodeString)
explicitly. Note that such implicit string casts would corrupt
invalid URLs *as well*.
 
>> Can't you use your own custom function then?
>
> Yes I can, but I think other users could benefit from my proposed
> change. I think this is a problem that was introduced with porting to
> Builder 2010 and using UnicodeString. This problem was not there
> before and maybe other users also have this problem now without
> knowing it.
>
> Why not make the changes I proposed when all it does is restoring the
> function to old behavior as when only AnsiString was used?

The only workaround that comes to mind is another overload
that takes a RawByteString instead of string. I won't use AnsiString
because implicit ANSI string casts must be avoided too.

-- 
Arno Garrels

Re: [twsocket] Found a bug and made a fix in function UrlDecode

2010-08-06 Thread Bjørnar Nielsen
Arno,

> The only workaround that comes to my mind was another overload that takes
> a RawByteString instead of string. I won't use AnsiString because implicit
> ansi string casts must be avoided too.

That would work for me. I'm not very familiar with RawByteString, but I made
a version of the function that works for me. Do you think this version would
work for others too? (I have only tested it in C++ Builder 2010.)

Regards Bjørnar

Code follows (the only change from the previous version I sent is the type of
the first parameter):

function UrlDecode(const S : RawByteString; SrcCodePage: Cardinal = CP_ACP;
  DetectUtf8: Boolean = TRUE) : UnicodeString;
var
    I, J, L : Integer;
    U8Str   : AnsiString;
    Ch      : AnsiChar;
begin
    L := Length(S);
    SetLength(U8Str, L);
    I := 1;
    J := 0;
    { Decode byte by byte up to the end of S or the first '&' }
    while (I <= L) and (S[I] <> '&') do begin
        Ch := AnsiChar(S[I]);
        if Ch = '%' then begin
            { %XX escape: convert the two hex digits to one byte }
            Ch := AnsiChar(htoi2(PAnsiChar(@S[I + 1])));
            Inc(I, 2);
        end
        else if Ch = '+' then
            Ch := ' ';
        Inc(J);
        U8Str[J] := Ch;
        Inc(I);
    end;
    SetLength(U8Str, J);
    if (SrcCodePage = CP_UTF8) or (DetectUtf8 and IsUtf8Valid(U8Str)) then
{$IFDEF COMPILER12_UP}
        Result := Utf8ToStringW(U8Str)
    else
        Result := AnsiToUnicode(U8Str, SrcCodePage);
{$ELSE}
        Result := Utf8ToStringA(U8Str)
    else
        Result := U8Str;
{$ENDIF}
end;

Re: [twsocket] Found a bug and made a fix in function UrlDecode

2010-08-06 Thread Arno Garrels
Bjørnar,

> Arno,
>
>> The only workaround that comes to my mind was another overload that
>> takes a RawByteString instead of string. I won't use AnsiString because
>> implicit ansi string casts must be avoided too.
>
> That would work for me. I'm not very familiar with the use of
> RawByteString, but I made version of the function that works for me,
> do you think this version would work for others too (I just testet in
> 2010 C++ builder):

The new overload is only required in RAD Studio 2009+.
It has to be conditionally compiled for other reasons too. Please try the
following code and let me know how it works for you; the first declaration
just got the overload directive:

function  UrlDecode(const S     : String;
                    SrcCodePage : LongWord = CP_ACP;
                    DetectUtf8  : Boolean = TRUE) : String; overload;
{$IFDEF COMPILER12_UP}
function  UrlDecode(const S     : RawByteString;
                    SrcCodePage : LongWord = CP_ACP;
                    DetectUtf8  : Boolean = TRUE) : UnicodeString; overload;
{$ENDIF}


{$IFDEF COMPILER12_UP}
function UrlDecode(const S: RawByteString; SrcCodePage: LongWord = CP_ACP;
  DetectUtf8: Boolean = TRUE) : UnicodeString;
var
    I, J, L : Integer;
    U8Str   : AnsiString;
    Ch      : AnsiChar;
begin
    L := Length(S);
    SetLength(U8Str, L);
    I := 1;
    J := 0;
    { Decode byte by byte up to the end of S or the first '&' }
    while (I <= L) and (S[I] <> '&') do begin
        Ch := AnsiChar(S[I]);
        if Ch = '%' then begin
            { %XX escape: convert the two hex digits to one byte }
            Ch := AnsiChar(htoi2(PAnsiChar(@S[I + 1])));
            Inc(I, 2);
        end
        else if Ch = '+' then
            Ch := ' ';
        Inc(J);
        U8Str[J] := Ch;
        Inc(I);
    end;
    SetLength(U8Str, J);
    if (SrcCodePage = CP_UTF8) or (DetectUtf8 and IsUtf8Valid(U8Str)) then
        Result := Utf8ToStringW(U8Str)
    else
        Result := AnsiToUnicode(U8Str, SrcCodePage);
end;
{$ENDIF}
  


Re: [twsocket] Found a bug and made a fix in function UrlDecode

2010-08-05 Thread Bjørnar Nielsen
A small change inside the function is also needed to make it work:

The line with htoi2 needs a small change. The complete code is this:

function UrlDecode(const S : AnsiString; SrcCodePage: Cardinal = CP_ACP;
  DetectUtf8: Boolean = TRUE) : String;
var
    I, J, L : Integer;
    U8Str   : AnsiString;
    Ch      : AnsiChar;
begin
    L := Length(S);
    SetLength(U8Str, L);
    I := 1;
    J := 0;
    { Decode byte by byte up to the end of S or the first '&' }
    while (I <= L) and (S[I] <> '&') do begin
        Ch := AnsiChar(S[I]);
        if Ch = '%' then begin
            { %XX escape: convert the two hex digits to one byte }
            Ch := AnsiChar(htoi2(PAnsiChar(@S[I + 1])));
            Inc(I, 2);
        end
        else if Ch = '+' then
            Ch := ' ';
        Inc(J);
        U8Str[J] := Ch;
        Inc(I);
    end;
    SetLength(U8Str, J);
    if (SrcCodePage = CP_UTF8) or (DetectUtf8 and IsUtf8Valid(U8Str)) then
{$IFDEF COMPILER12_UP}
        Result := Utf8ToStringW(U8Str)
    else
        Result := AnsiToUnicode(U8Str, SrcCodePage);
{$ELSE}
        Result := Utf8ToStringA(U8Str)
    else
        Result := U8Str;
{$ENDIF}
end;

Regards
Bjørnar

-----Original Message-----
From: twsocket-boun...@elists.org [mailto:twsocket-boun...@elists.org] On 
Behalf Of Bjørnar Nielsen
Sent: 5. august 2010 12:52
To: ICS support mailing (twsocket@elists.org)
Subject: [twsocket] Found a bug and made a fix in function UrlDecode

Proposed fix for a bug in UrlDecode in OverbyteIcsUrl.pas and
OverbyteIcsHttpSrv.pas.

When calling the function like this:
Memo2->Text = UrlDecode("Ã…ge", CP_ACP, false); // "Ã…ge" is the UTF-8 encoding of "Åge"

The resulting text in Memo2 is "Ã&ge", which is impossible to UTF-8 decode back to
the original text.

The fix is to change this:
function UrlDecode(const S : String; SrcCodePage: Cardinal = CP_ACP;
  DetectUtf8: Boolean = TRUE) : String;

To this:
function UrlDecode(const S : AnsiString; SrcCodePage: Cardinal = CP_ACP;
  DetectUtf8: Boolean = TRUE) : String;

Does anyone have any comments on this fix?

Regards Bjørnar





Re: [twsocket] Found a bug and made a fix in function UrlDecode

2010-08-05 Thread Arno Garrels
Bjørnar,

> When calling the function like this:
> Memo2->Text = UrlDecode("Ã…ge", CP_ACP, false); // "Ã…ge" is UTF8encoding of
> "Åge"

"Ã…ge" is not a valid URL-encoded string.

"Åge" URL-encoded would be:

%C3%85ge  //UTF-8
%C5ge  //Windows-1252

Try this:
{code}
var
    Str: string;
begin
    Str := 'Åge';
    Str := UrlEncode(Str, CP_UTF8);
    Caption := UrlDecode(Str, CP_UTF8, False);
end;
{code}
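
The two encoded forms listed above follow directly from the byte sequences of "Åge" in each encoding. The sketch below is a standalone illustration, not the ICS UrlEncode: PercentEncodeBytes is a hypothetical helper that escapes every byte outside the RFC 3986 "unreserved" set, and the byte values are written out explicitly so the result does not depend on the source file encoding.

```pascal
program PercentEncodeSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper (not the ICS UrlEncode): percent-encodes every byte
// that is not a letter, a digit, or one of - . _ ~
function PercentEncodeBytes(const Bytes: array of Byte): string;
var
  I: Integer;
begin
  Result := '';
  for I := Low(Bytes) to High(Bytes) do
    if AnsiChar(Bytes[I]) in ['A'..'Z', 'a'..'z', '0'..'9', '-', '.', '_', '~'] then
      Result := Result + Char(Bytes[I])
    else
      Result := Result + '%' + IntToHex(Bytes[I], 2);
end;

begin
  Writeln(PercentEncodeBytes([$C3, $85, $67, $65])); // UTF-8 bytes of "Åge"
  Writeln(PercentEncodeBytes([$C5, $67, $65]));      // Windows-1252 bytes of "Åge"
end.
```

The first call yields %C3%85ge and the second %C5ge. Either way the encoded string contains only 7-bit ASCII characters.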



-- 
Arno Garrels





Re: [twsocket] Found a bug and made a fix in function UrlDecode

2010-08-05 Thread Bjørnar Nielsen

> "Ã…ge" is not a valid URL encoded string.

I know, but it is valid UTF-8. I think trying to URL-decode it should not break
the string. I have a web server that serves different clients, and not
all of them URL-encode the data in the URL. But all of them
UTF-8 encode their data. That means that if I URL-decode UTF-8 data that is not
URL-encoded, the data gets corrupted. I had this problem until I made this fix.

Regards Bjørnar



Re: [twsocket] Found a bug and made a fix in function UrlDecode

2010-08-05 Thread Arno Garrels
Bjørnar, 

>> "Ã…ge" is not a valid URL encoded string.
>
> I know, but it is valid UTF8. I think trying to url-decode it should
> not break the string.

I do not think so.
Would you expect a correct result as well if you base64-decoded a
quoted-printable encoded string? A URL containing anything
other than characters from the printable 7-bit ASCII range is invalid.
Just as Base64Decode requires properly encoded data to return
correct results, UrlDecode requires a valid URL to work correctly.
This requirement has the advantage that the function works with string
whether string maps to UnicodeString or to AnsiString,
because no implicit string cast can corrupt 7-bit ASCII data, even if an
AnsiString is passed and string maps to UnicodeString.

However, I must admit that this is something of a breaking change in
behavior when you port your apps to Unicode.
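
A caller who cannot be sure that clients send valid URLs could test the input before decoding. The following is a minimal sketch assuming Delphi 2009 or later, where string maps to UnicodeString. IsPrintableAscii7 is a hypothetical helper, not part of ICS, and the second sample string assumes the UTF-8 bytes C3 85 were widened through Windows-1252.

```pascal
program AsciiCheckSketch;
{$APPTYPE CONSOLE}

// Hypothetical helper: True only if every Char lies in the printable
// 7-bit ASCII range $20..$7E.
function IsPrintableAscii7(const S: string): Boolean;
var
  I: Integer;
begin
  Result := False;
  for I := 1 to Length(S) do
    if (Ord(S[I]) < $20) or (Ord(S[I]) > $7E) then
      Exit;
  Result := True;
end;

begin
  // Properly URL-encoded input: only ASCII characters.
  Writeln(IsPrintableAscii7('%C3%85ge'));
  // Raw UTF-8 bytes C3 85 widened via Windows-1252 (assumed): U+00C3, U+2026.
  Writeln(IsPrintableAscii7(#$00C3#$2026'ge'));
end.
```

The first call returns True and the second False, so input of the second kind could be routed to a custom decoder instead of UrlDecode.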

> I have a webserver that works against different
> clients, and not all of the clients url-encode data in the url.

Those clients definitely violate the RFC.

> But
> all of the clients UTF8-encode data. That means that if I try to
> url-decode utf8-data that’s not url-encoded, the data gets messed up
> and I had a problem until I made this fix.

Can't you use your own custom function then?

-- 
Arno Garrels 

 

