On 10/25/13 2:19 PM, Bill Shannon wrote:
If I understand this correctly, this proposes to remove the "lenient" option we've been discussing and just make it always lenient. Is that correct?
Yes. Only for the mime type though.
Unfortunately, from what you say below, it's still not lenient enough. I'd really like a version that never, ever, for any reason, throws an exception. Yes, that means when you only get a final 6 bits of data you have to make an assumption about what was intended, probably padding it with zeros to 8 bits.
This is something I'm hesitant to do. I can be lenient about the trailing padding because the padding character itself is not real "data": whether it is present, missing, or incorrect/incomplete, it does not affect the integrity of the data. But filling out the last 6 bits with zeros is really kind of guessing, NOT decoding.
-Sherman
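To make the bit counting behind this point concrete, here is a minimal sketch in Java. The class FinalUnitBits and its methods are hypothetical helpers written for illustration only; they are not part of java.util.Base64 or of the webrev. Each base64 character carries 6 bits, so a final unit of one character cannot fill a single 8-bit byte.

```java
// Hypothetical illustration only; not part of java.util.Base64.
final class FinalUnitBits {

    /** Complete bytes carried by a final unit of 1 to 4 base64 characters. */
    static int wholeBytes(int chars) {
        return (chars * 6) / 8;
    }

    /** Bits left over once the complete bytes have been taken out. */
    static int leftoverBits(int chars) {
        return (chars * 6) % 8;
    }

    /** The byte obtained by filling a lone 6-bit value with two zero bits. */
    static int zeroFill(int sixBits) {
        return (sixBits & 0x3F) << 2;
    }

    public static void main(String[] args) {
        for (int chars = 1; chars <= 4; chars++) {
            System.out.println(chars + " char(s): " + wholeBytes(chars)
                    + " byte(s), " + leftoverBits(chars) + " leftover bit(s)");
        }
        // 'Q' is index 16 in the base64 alphabet; zero-filling yields 0x40,
        // but 0x41, 0x42 and 0x43 also start with 'Q' when encoded.
        System.out.println(zeroFill(16));
    }
}
```

With one character there are zero complete bytes and six dangling bits. Zero-filling picks one of four possible source bytes, which is why it amounts to a guess rather than a decode.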
Xueming Shen wrote on 10/23/13 15:42:

Hi,

The current spec and implementation of the Base64 decoder [1] is standard/RFC based: it interprets/decodes the ending padding character(s) correctly if present. The ending padding character(s) is not required (liberal), but if present, the spec and implementation require that they MUST be encoded correctly. Any incorrect padding combination at the final unit (as listed below) is treated as incorrectly encoded base64 data and results in an exception.

Patterns of possible incorrectly encoded padding in the final base64 unit are:

    xxxx =       unnecessary padding character at the end of encoded stream
    xxxx xx=     missing the last padding character
    xxxx xx=y    missing the last padding character, instead having a non-padding char

The feedback we have received so far suggests that an "incorrectly encoded padding unit" might be frequently observed in real-world use, especially in the MIME/email world, and that it might be desirable to just accept these incorrectly encoded ending units and decode the rest successfully without throwing an exception. It was also suggested that it might be more appropriate to rename Base64.getEncoder(int lineLength, byte[] sept) to Base64.getMimeEncoder(int, byte[]).

The proposed changes here are to

(1) rename the factory method for the customizable "mime" encoder to Base64.getMimeEncoder(int, byte[]);
(2) change the spec/implementation for the "mime" decoder to be lenient when handling the padding character in the final unit (the mime decoder itself is "lenient" already: its spec requires that any non-base64 character be ignored during decoding, and our existing decoder is liberal when there is no padding present at all).

Here is the webrev
http://cr.openjdk.java.net/~sherman/8025003/webrev/

thanks!
-Sherman

Btw, the updated mime decoder still throws an exception for patterns like "xxxx x=..." or "xxxx x", in which the last unit has only one valid "byte" of 6-bit data. That is not sufficient to be decoded into a valid 8-bit byte.
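
For readers who want to try the cases described above, the following is a minimal sketch against java.util.Base64 as released in the JDK, which may not match the webrev under discussion. The class MimeDecodeDemo and its method names are hypothetical. It covers only the cases the thread treats as settled: the renamed Base64.getMimeEncoder(int, byte[]) factory, a final unit with no padding at all, and a dangling single character. Incorrectly padded final units such as "xxxx xx=" are left out on purpose, because their handling is the point under debate.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class; only the java.util.Base64 calls are real API.
final class MimeDecodeDemo {

    /** Encodes with the customizable MIME encoder: 4-character lines, "\n" separator. */
    static String encodeWrapped(String text) {
        Base64.Encoder encoder = Base64.getMimeEncoder(4, new byte[] {'\n'});
        return encoder.encodeToString(text.getBytes(StandardCharsets.US_ASCII));
    }

    /** Decodes with the MIME decoder, which skips non-base64 characters. */
    static String decodeMime(String encoded) {
        byte[] decoded = Base64.getMimeDecoder().decode(encoded);
        return new String(decoded, StandardCharsets.US_ASCII);
    }

    /** Reports whether the MIME decoder rejects the input. */
    static boolean throwsOnDecode(String encoded) {
        try {
            Base64.getMimeDecoder().decode(encoded);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String wrapped = encodeWrapped("ABCD");
        System.out.println(decodeMime(wrapped));      // line separator is ignored
        System.out.println(decodeMime("QUJDRA"));     // final unit "RA", no padding
        System.out.println(throwsOnDecode("QUJDR"));  // final unit "R", only 6 bits
    }
}
```

The last call is the "xxxx x" case: the first four characters decode cleanly, but the lone trailing character does not carry enough bits for a byte.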
