Hi Oleg,

Thank you for the quick reply. So if the buffer may not be filled completely, how can I ensure, or force, HttpClient to fill the whole buffer? Should I maybe avoid stream readers altogether?
What do you think will be the best route to go?

Thank you!
-Assaf

olegk wrote:
>
> On Wed, 2010-01-27 at 20:24 -0800, amoldavsky wrote:
>> Hi
>>
>> I have coded a simple file downloader using HttpClient 4.0.
>> It works fine but there is something wrong with the String encoding or
>> the buffer stream. The problem is that there are long sequences of "NULL"
>> (ANSI code 00) throughout the final file, like this:
>> http://old.nabble.com/file/p27350930/httpclient_error01.jpg
>> http://old.nabble.com/file/p27350930/httpclient_error02.jpg
>>
>> Here is the main code:
>>
>> public String getChunk(String url, int bufferSize) throws HTTPClientException
>> {
>>     if(!chunkedStarted)
>>     {
>>         chunkedIns = getInputStream(url);
>>         chunkedStarted = true;
>>     }
>>
>>     byte[] tmp = new byte[bufferSize];
>>     try
>>     {
>>         if(chunkedIns.read(tmp) != -1)
>>         {
>
> What makes you think that the entire buffer will be filled with data?
>
> Oleg
>
>>             return new String(tmp);
>>         }
>>         else
>>         {
>>             finish();
>>             return null;
>>         }
>>     }
>>     catch(IOException e)
>>     {
>>         HTTPClientException e2 = new HTTPClientException(e.getMessage());
>>         e2.setStackTrace(e.getStackTrace());
>>         throw e2;
>>     }
>> }
>>
>> public void finish()
>> {
>>     // do some cleaning
>> }
>>
>> private InputStream getInputStream(String url) throws HTTPClientException
>> {
>>     InputStream instream = null;
>>
>>     httpClient = new DefaultHttpClient();
>>     httpClient.getParams().setParameter("http.useragent", AGENT_NAME);
>>
>>     HttpGet httpGet = new HttpGet(url);
>>     HttpResponse response = null;
>>
>>     try
>>     {
>>         response = httpClient.execute(httpGet);
>>         HttpEntity entity = response.getEntity();
>>
>>         if(entity != null)
>>         {
>>             instream = entity.getContent();
>>         }
>>     }
>>     catch(ClientProtocolException e)
>>     {
>>         HTTPClientException e2 = new HTTPClientException(e.getMessage());
>>         e2.setStackTrace(e.getStackTrace());
>>         throw e2;
>>     }
>>     catch(IOException e)
>>     {
>>         HTTPClientException e2 = new HTTPClientException(e.getMessage());
>>         e2.setStackTrace(e.getStackTrace());
>>         throw e2;
>>     }
>>
>>     return instream;
>> }
>>
>> getChunk and getInputStream could basically be one method, but I need
>> to split them for internal convenience; that does not change the
>> functionality as a whole.
>>
>> It seems like either the conversion from bytes to string is a problem:
>> return new String(tmp);
>>
>> or that the buffer is not getting filled to the end. The latter could not
>> be possible because the files are ~30MB each and the buffer size is 2Kb.
>>
>> I have attached the file, it's a CSV (shortened to ~6KB). Note the long
>> white space between some of the URLs; if you just remove it, the URL
>> makes sense.
>> http://old.nabble.com/file/p27350930/datafeed.csv
>>
>> Where can this white space (null) come from??
>>
>> Thanks!
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>

--
View this message in context: http://old.nabble.com/HttpClient-4.0-encoding-madness-tp27350930p27366928.html
Sent from the HttpClient-User mailing list archive at Nabble.com.
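
For reference, a minimal sketch of the point raised in the quoted reply. InputStream.read(byte[]) returns the number of bytes it actually stored, which may be smaller than the buffer, so new String(tmp) also converts the untouched zero bytes at the end of the array. That would account for the NUL runs in the output. The class name ChunkReader and its two methods are hypothetical and not part of HttpClient; they work on any InputStream, which is assumed here to be the one returned by entity.getContent().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HttpClient.
public class ChunkReader {

    /**
     * Copies the whole stream to out, writing only the bytes that each
     * read() call actually returned. Returns the total number of bytes copied.
     */
    public static long copy(InputStream in, OutputStream out, int bufferSize)
            throws IOException {
        byte[] buf = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n); // write n bytes, never buf.length
            total += n;
        }
        return total;
    }

    /**
     * Keeps reading until buf is full or the stream ends.
     * Returns the number of valid bytes in buf (0 if already at end of stream).
     */
    public static int readFully(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n == -1) {
                break; // end of stream before the buffer filled up
            }
            off += n;
        }
        return off;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the HTTP response body.
        byte[] data = "a,b,c\n1,2,3\n".getBytes(StandardCharsets.UTF_8);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), out, 8);

        // Decode once, after all bytes are collected, with an explicit charset.
        String text = new String(out.toByteArray(), StandardCharsets.UTF_8);
        System.out.println(copied + " bytes copied, identical: "
                + text.equals("a,b,c\n1,2,3\n"));
    }
}
```

For a file downloader it is probably simpler to stay with bytes and write them straight to a FileOutputStream via copy(). If a String per chunk is really needed, new String(buf, 0, n, charset) limits the conversion to the valid bytes, with the caveat that a multi-byte character can be split across two chunks; new String(bytes) without a charset falls back to the platform default encoding.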
