On Wed, 2010-01-27 at 20:24 -0800, amoldavsky wrote:
> Hi
> 
> I have coded a simple file downloader using HttpClient 4.0.
> It works fine, but something seems to be wrong with the String encoding or the
> stream buffering: there are long sequences of NUL characters (byte value 0x00)
> throughout the final file, like this:
> http://old.nabble.com/file/p27350930/httpclient_error01.jpg 
> http://old.nabble.com/file/p27350930/httpclient_error02.jpg 
> 
> Here is the main code:
> 
> public String getChunk(String url, int bufferSize) throws HTTPClientException
>   {
>     if(!chunkedStarted)
>     {
>       chunkedIns = getInputStream(url);
>       chunkedStarted = true;
>     }
>     
>     byte[] tmp = new byte[bufferSize];
>     try
>     {
>       if(chunkedIns.read(tmp) != -1)
>       {

What makes you think that the entire buffer will be filled with data?

Oleg 
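
For illustration, a minimal sketch of a chunk read that honours the value
returned by InputStream.read(byte[]). The call may fill only part of the
buffer, and the unused tail of a freshly allocated array stays zero, which
shows up as NUL characters if the whole array is converted to a String. The
class name ChunkReader, the method readChunk, and the explicit Charset
parameter are assumptions made for this example; they are not part of the
original code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ChunkReader {

    /**
     * Reads at most bufferSize bytes and decodes only the bytes actually read.
     * Returns null at end of stream.
     */
    static String readChunk(InputStream in, int bufferSize, Charset charset)
            throws IOException {
        byte[] tmp = new byte[bufferSize];
        int n = in.read(tmp);                   // bytes actually read, or -1 at end of stream
        if (n == -1) {
            return null;
        }
        return new String(tmp, 0, n, charset);  // ignore the unused tail of the buffer
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the HTTP entity stream; any InputStream behaves the same way.
        InputStream in = new ByteArrayInputStream(
                "a,b,c\n".getBytes(StandardCharsets.US_ASCII));
        StringBuilder sb = new StringBuilder();
        String chunk;
        while ((chunk = readChunk(in, 2048, StandardCharsets.US_ASCII)) != null) {
            sb.append(chunk);
        }
        System.out.print(sb);
    }
}
```

The only change relative to the quoted method is that the byte count returned
by read() bounds the String conversion.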


>         return new String(tmp);
>       }
>       else
>       {
>         finish();
>         return null;
>       }
>     }
>     catch(IOException e)
>     {
>       HTTPClientException e2 = new HTTPClientException(e.getMessage());
>       e2.setStackTrace(e.getStackTrace());
>       throw e2;
>     }
>   }
>   
>   public void finish()
>   {
>     // do some cleaning
>   }
> 
>    private InputStream getInputStream(String url) throws HTTPClientException
>   {
>     InputStream instream = null;
>     
>     httpClient = new DefaultHttpClient();
>     httpClient.getParams().setParameter("http.useragent", AGENT_NAME);
>     
>     HttpGet httpGet = new HttpGet(url);
>     HttpResponse response = null;
>     
>     try
>     {
>       response = httpClient.execute(httpGet);
>       HttpEntity entity = response.getEntity();
>     
>       if(entity != null) 
>       {
>         instream = entity.getContent();
>       }
>     }
>     catch(ClientProtocolException e)
>     {
>       HTTPClientException e2 = new HTTPClientException(e.getMessage());
>       e2.setStackTrace(e.getStackTrace());
>       throw e2;
>     }
>     catch(IOException e)
>     {
>       HTTPClientException e2 = new HTTPClientException(e.getMessage());
>       e2.setStackTrace(e.getStackTrace());
>       throw e2;
>     }
>     
>     return instream;
>   }
> 
> getChunk and getInputStream could basically be one method, but I need to
> split them for internal convenience; that does not change the
> functionality as a whole.
> 
> It seems like either the conversion from bytes to string is a problem:
> return new String(tmp);
> 
> or the buffer is not getting filled to the end. The latter should not be
> possible because the files are ~30 MB each and the buffer size is 2 KB.
> 
> I have attached the file; it's a CSV (shortened to ~6 KB). Note the long
> whitespace between some of the URLs; if you just remove it, the URLs make
> sense.
> http://old.nabble.com/file/p27350930/datafeed.csv
> 
> Where can this whitespace (null) come from?
> 
> Thanks!
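
On the byte-to-String conversion raised in the question: new String(byte[])
decodes with the platform default charset, and decoding fixed-size byte chunks
independently can split a multi-byte character across two chunks. One possible
sketch is to let an InputStreamReader do the decoding and read characters
instead of bytes. The class name CharChunkReader is hypothetical, and the
charset has to be supplied by the caller because the thread does not say how
the CSV is encoded.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

final class CharChunkReader {
    private final Reader reader;
    private final char[] buf;

    CharChunkReader(InputStream in, Charset charset, int bufferSize) {
        // The reader keeps decoder state between calls, so a character whose
        // bytes straddle two network reads is still decoded correctly.
        this.reader = new InputStreamReader(in, charset);
        this.buf = new char[bufferSize];
    }

    /** Returns the next chunk of decoded text, or null at end of stream. */
    String nextChunk() throws IOException {
        int n = reader.read(buf);   // chars actually read, or -1 at end of stream
        return n == -1 ? null : new String(buf, 0, n);
    }
}
```

If the goal is only to save the download to disk, copying the raw bytes to an
OutputStream and skipping the String conversion entirely avoids the charset
question altogether.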


