bug#38299: A bug while trying to decode a non encode base64

2019-11-21 Thread Erik Auerswald
Hi,

On Thu, Nov 21, 2019 at 12:04:11PM +0530, vardhaman narasagoudar wrote:
> On Thu, Nov 21, 2019 at 12:51 AM Paul Eggert  wrote:
> > On 11/20/19 6:22 AM, Martin Schulte wrote:
> > > vardhamanbn1 is a valid encoding
> >
> > Thanks for explaining; closing the bug report.
> 
> Thanks for the reply, but I also tried decoding the same input online
> (https://www.base64decode.org/).
> 
> There I get an error message (which is valid), e.g.:
> 
> 1) If I try to decode "99", I get the error message
> 
> "No printable characters found, try another source charset, or upload your
> data as a file for binary decoding."

The error message says that the decoded data is not printable.  It does
not say anything about invalid input data, although the input data is
not correctly Base64 encoded.

> Similarly, in the terminal we get "invalid input" and return code 1.
> 
> 2) Now if I try to decode "vardhamanbn1" (or any other non-encoded value
> of 12 characters, or a multiple of 12 characters), I get the error message
> "No printable characters found, try another source charset, or upload your
> data as a file for binary decoding."

You get the same error message about the decoded data.  This is correct.
The site even tells you that the interface you use does not support
binary, i.e., non-printable data.

> But when we try the same in the terminal, we get return code 0 and the
> symbol described in the earlier reply:
>  "UTF-8 and thus leads to �."
> 
> Now, as a workaround, we are using

That is not a workaround, but the necessary check for valid output data
for your application, since you seem to require a Base64 encoding of
UTF-8 data.

> a) [vardhaman@oc6085028360 ~]$ echo -n "vardhamanbn1" | base64 -d | iconv -f utf8
> iconv: illegal input sequence at position 0

Base64 can encode any binary data, not just valid UTF-8 text.
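
The two checks (is it Base64, and is the decoded result UTF-8) can be kept
apart in a small shell function.  This is only a sketch: the name
`decode_utf8_text` and its return codes are made up for illustration, and
it assumes a `base64 -d` that exits non-zero on invalid input (as seen in
this thread) and an `iconv` that rejects invalid UTF-8.

```sh
# Hypothetical helper: decode Base64 and accept the result only if it is UTF-8.
# Return codes (chosen for this sketch): 0 = ok, 1 = not Base64,
# 2 = Base64 but not UTF-8, 3 = could not create a temporary file.
decode_utf8_text() {
    tmp=$(mktemp) || return 3
    # Step 1: syntax check. The pipeline's status is that of base64.
    if ! printf '%s' "$1" | base64 -d >"$tmp" 2>/dev/null; then
        rm -f "$tmp"
        return 1
    fi
    # Step 2: content check. iconv fails on bytes that are not valid UTF-8.
    if ! iconv -f UTF-8 -t UTF-8 <"$tmp" >/dev/null 2>&1; then
        rm -f "$tmp"
        return 2
    fi
    cat "$tmp"
    rm -f "$tmp"
}

decode_utf8_text 'vardhamanbn1' || echo "rejected with status $?"
```

Keeping the decoded bytes in a temporary file avoids storing binary data in
a shell variable, and lets each step report its own failure.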

> also we tried on another sample
> 
> b) [vardhaman@oc6085028360 ~]$ echo  -n '99' | base64 -d | iconv -f utf8
> base64: invalid input
> iconv: illegal input sequence at position 0
> 
> without using "iconv -f utf8"
> 
> [vardhaman@oc6085028360 ~]$  echo  -n '99' | base64 -d
> base64: invalid input
> 
> 
> So we feel the issue is still related to inputs of 12 characters (or a
> multiple of 12) when we try to decode a non-encoded value.

The magic number is actually 4, because each symbol in a Base64 encoded
string represents 6 bits, thus 4 symbols (24 bits) give you 3 bytes of
decoded data.  Any combination of Base64 symbols that forms a string of a
length divisible by 4 is a valid Base64 encoding.  This does not give any
guarantee about the decoded data.
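
The rule can be sketched as a purely syntactic test in POSIX shell.  The
function name `looks_like_base64` is hypothetical, and this only
approximates the rule described here (alphabet, length divisible by 4, at
most two trailing `=`); it is not the code GNU base64 uses.

```sh
# Hypothetical helper: succeed if $1 is syntactically well-formed Base64.
looks_like_base64() {
    s=$1
    # Empty input, or a length that is not a multiple of 4, is rejected.
    [ -n "$s" ] && [ $(( ${#s} % 4 )) -eq 0 ] || return 1
    # Strip at most two trailing '=' padding characters.
    body=${s%=}
    body=${body%=}
    # Everything that is left must come from the Base64 alphabet.
    case $body in
        *[!A-Za-z0-9+/]*) return 1 ;;
    esac
    return 0
}

# None of these samples use padding, so every 4 symbols yield 3 bytes.
for s in 99 vardhamanbn1 vardhamanbn1vardhamanbn1; do
    if looks_like_base64 "$s"; then
        echo "$s: well-formed, decodes to $(( ${#s} / 4 * 3 )) bytes"
    else
        echo "$s: not well-formed"
    fi
done
```

With 12 symbols that is 12 * 6 = 72 bits, i.e., 9 bytes, which is why any
12 letters and digits decode without an error.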

> Or should we assume that any input whose length is a multiple of 12
> characters will be treated as valid base64?

Yes. Actually, any multiple of 4 characters.

>  e.g. when I tried decoding 24 non-encoded characters:
>  [vardhaman@oc6085028360 ~]$ echo -n 'vardhamanbn1vardhamanbn1' | base64 --decode
> ��݅���݅�[vardhaman@oc6085028360 ~]$ echo $?
> 0

Thanks,
Erik
-- 
The laws of mathematics are very commendable, but the only law that
applies in Australia is the law of Australia.
-- Australian Prime Minister Malcolm Turnbull





bug#38299: A bug while trying to decode a non encode base64

2019-11-20 Thread vardhaman narasagoudar
Hi Team,

Thanks for the reply, but I also tried decoding the same input online
(https://www.base64decode.org/).

There I get an error message (which is valid), e.g.:

1) If I try to decode "99", I get the error message

"No printable characters found, try another source charset, or upload your
data as a file for binary decoding."

Similarly, in the terminal we get "invalid input" and return code 1.

2) Now if I try to decode "vardhamanbn1" (or any other non-encoded value
of 12 characters, or a multiple of 12 characters), I get the error message
"No printable characters found, try another source charset, or upload your
data as a file for binary decoding."

But when we try the same in the terminal, we get return code 0 and the
symbol described in the earlier reply:
 "UTF-8 and thus leads to �."

Now, as a workaround, we are using
a) [vardhaman@oc6085028360 ~]$ echo -n "vardhamanbn1" | base64 -d | iconv -f utf8
iconv: illegal input sequence at position 0

also we tried on another sample

b) [vardhaman@oc6085028360 ~]$ echo  -n '99' | base64 -d | iconv -f utf8
base64: invalid input
iconv: illegal input sequence at position 0

without using "iconv -f utf8"

[vardhaman@oc6085028360 ~]$  echo  -n '99' | base64 -d
base64: invalid input


So we feel the issue is still related to inputs of 12 characters (or a
multiple of 12) when we try to decode a non-encoded value.
Or should we assume that any input whose length is a multiple of 12
characters will be treated as valid base64?

 e.g. when I tried decoding 24 non-encoded characters:
 [vardhaman@oc6085028360 ~]$ echo -n 'vardhamanbn1vardhamanbn1' | base64 --decode
��݅���݅�[vardhaman@oc6085028360 ~]$ echo $?
0



On Thu, Nov 21, 2019 at 12:51 AM Paul Eggert  wrote:

> On 11/20/19 6:22 AM, Martin Schulte wrote:
> > vardhamanbn1 is a valid encoding
>
> Thanks for explaining; closing the bug report.
>


-- 
Thanks & Regards
Vardhaman B.N


bug#38299: A bug while trying to decode a non encode base64

2019-11-20 Thread Paul Eggert

On 11/20/19 6:22 AM, Martin Schulte wrote:

vardhamanbn1 is a valid encoding


Thanks for explaining; closing the bug report.





bug#38299: A bug while trying to decode a non encode base64

2019-11-20 Thread Martin Schulte
Hello Vardhaman!

> 3) Now trying to decode a non-encoded value of 12 characters
> [vardhaman@oc6085028360 ~]$ echo  'vardhamanbn1' | base64 --decode
> ��݅�[vardhaman@oc6085028360 ~]$ echo $?
> 0

$ echo -n $'\275\252\335\205\251\232\235\271\365' | base64
vardhamanbn1

So, vardhamanbn1 is a valid encoding, but the decoded data is not UTF-8
and thus leads to �.
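
To see the bytes behind the � symbols, the decoded data can be dumped as
hex.  A small sketch, assuming `base64 -d`, `od` and `tr` are available;
the function name `b64_to_hex` is made up for illustration.

```sh
# Hypothetical helper: print the decoded bytes as one lowercase hex string.
b64_to_hex() {
    printf '%s' "$1" | base64 -d | od -An -tx1 | tr -d ' \n'
    echo
}

b64_to_hex 'vardhamanbn1'   # prints bdaadd85a99a9db9f5
```

These are the same nine bytes as the octal escapes above.  The first one,
0xbd, is a UTF-8 continuation byte and cannot start a character, so the
data is rejected as UTF-8 right at position 0.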

Best regards,

Martin





bug#38299: A bug while trying to decode a non encode base64

2019-11-20 Thread vardhaman narasagoudar
Hi Team,

It looks like there is a bug when decoding a non-encoded value of 12
characters (or a multiple of 12) with base64: the return code is always 0.

e.g.:

1) When trying to decode an encoded value
[vardhaman@oc6085028360 ~]$ echo  'Nzc3Nzk5Cg==' | base64 --decode
99
[vardhaman@oc6085028360 ~]$ echo $?
0

2) A sample when trying to decode a non-encoded value
[vardhaman@oc6085028360 ~]$ echo  '99' | base64 --decode
base64: invalid input
[vardhaman@oc6085028360 ~]$ echo $?
1


3) Now trying to decode a non-encoded value of 12 characters
[vardhaman@oc6085028360 ~]$ echo  'vardhamanbn1' | base64 --decode
��݅�[vardhaman@oc6085028360 ~]$ echo $?
0

Point 3 should return code 1, as the input is invalid.

I feel this bug is present in all versions; in any case, this is the
version I tested:

[vardhaman@oc6085028360 ~]$ base64 --version
base64 (GNU coreutils) 8.22
Copyright (C) 2013 Free Software Foundation, Inc.



-- 
Thanks & Regards
Vardhaman B.N