Hi Mark,

Thanks for the prompt response.

>On 24/05/2021 10:58, Scott,Tim wrote:
>> Hi experts,
>>
>> First time poster, here, so I know I'm risking not providing nearly
>> enough of the right information. Please let me know what I can send to
>> help you help me further through this.

>How are you reading the uploaded file? Please provide the code that does this.
I am reading the InputStream as below:
               (merged from two classes, untested, incomplete)

import org.apache.commons.fileupload.disk.DiskFileItemFactory;
import org.apache.commons.fileupload.servlet.ServletFileUpload;
import org.apache.commons.fileupload.servlet.ServletRequestContext;
import javax.servlet.http.HttpServletRequest;
import java.io.InputStream;
import org.apache.commons.fileupload.FileItem;

ServletRequestContext requestContext = new ServletRequestContext(/* 
HttpServletRequest  */ request);
FileItemFactory factory = new DiskFileItemFactory();
FileUpload fileUpload = new ServletFileUpload(factory);
List<FileItem> entries = fileUpload.parseRequest(requestContext); // <<< this 
call generates the temp file
InputStream inputStream;
for (FileItem item : entries)
{
               if (!item.isFormField())
               {
                              inputStream = item.getInputStream();
                              }
               }
               ...
byte[] buffer = new byte[BINARY_BUFFER_SIZE];
bolean eof = false;
while (!eof)
{
int count = inputStream.read(buffer);
if (count == -1)
{
eof = true;
...
                              }
                              ...
               }

Similarly, I am not writing the temp file. I understand that this is done by 
DeferredFileOutputStream as part of the call to ServletFileUpload's 
parseRequest(); The temp file (if created) and the input stream already contain 
corrupted data.

> The only way the default encoding should impact things is if the file bytes 
> are being converted to String at some point.
Not by me, they're not!

> That shouldn't normally happen for an uploaded file.
I agree, it shouldn't. That does not match, however, my finding that:
               Using -Dfile.encoding=utf-8 on Windows corrupts the file.
               Using -Dfile.encoding=ISO-8859-1 on Linux stops the file 
corruption.

Thanks,
Tim

Reply via email to