Re: HttpClient 4.0 encoding madness

amoldavsky Thu, 28 Jan 2010 22:10:24 -0800

Hi Oleg,
Thank you for the quick reply.

So if there is a possibility that the whole buffer is not filled, how can I
ensure or force HttpClient to fill the whole buffer? Should I maybe avoid
Stream Readers altogether?
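
For reference, a minimal sketch of one way to cope with a partially filled
buffer, assuming plain java.io and nothing HttpClient-specific.
InputStream.read(byte[]) returns the number of bytes it actually stored, so
the caller can loop until the buffer is full or the stream ends. The names
ChunkReader and readFully are made up for illustration:

```java
import java.io.IOException;
import java.io.InputStream;

final class ChunkReader {
    /**
     * Keeps reading until the buffer is full or the stream ends.
     * Returns the number of bytes actually stored in buf, or -1 if the
     * stream was already at its end and nothing could be read.
     */
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            // read() may store fewer bytes than requested; its return value says how many
            int n = in.read(buf, total, buf.length - total);
            if (n == -1) {
                break; // end of stream
            }
            total += n;
        }
        return (total == 0 && buf.length > 0) ? -1 : total;
    }
}
```

With the returned count n, new String(tmp, 0, n) instead of new String(tmp)
would convert only the bytes that were read and leave out the unfilled
(zero) tail of the buffer.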


What do you think will be the best route to go?
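
One possible route, as a rough sketch rather than a definitive answer: wrap
the entity's InputStream in an InputStreamReader with an explicit charset
and build each String from only the characters that were read. The class
name TextChunks and the charset passed in are assumptions for illustration;
in practice the charset would come from the response's Content-Type header.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

final class TextChunks {
    /** Wraps a byte stream in a Reader that decodes with the given charset. */
    static Reader open(InputStream in, String charsetName) {
        return new InputStreamReader(in, Charset.forName(charsetName));
    }

    /**
     * Returns the next chunk of at most buf.length characters,
     * or null once the end of the stream has been reached.
     */
    static String nextChunk(Reader reader, char[] buf) throws IOException {
        int n = reader.read(buf); // number of chars actually read, or -1
        if (n == -1) {
            return null;
        }
        // Use only the first n chars; the rest of buf may be stale or empty
        return new String(buf, 0, n);
    }
}
```

Because the Reader does the decoding, a multi-byte character that straddles
two network reads is not split, which can happen when each byte[] chunk is
converted to a String on its own.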

Thank you!
-Assaf


olegk wrote:
> 
> On Wed, 2010-01-27 at 20:24 -0800, amoldavsky wrote:
>> Hi
>> 
>> I have coded a simple file downloader using HttpClient 4.0.
>> It works fine, but there is something wrong with the String encoding or
>> the buffer stream. The problem is that there are long sequences of "NULL"
>> (ASCII code 00) throughout the final file, like this:
>> http://old.nabble.com/file/p27350930/httpclient_error01.jpg 
>> http://old.nabble.com/file/p27350930/httpclient_error02.jpg 
>> 
>> Here is the main code:
>> 
>>   public String getChunk(String url, int bufferSize) throws HTTPClientException
>>   {
>>     if(!chunkedStarted)
>>     {
>>       chunkedIns = getInputStream(url);
>>       chunkedStarted = true;
>>     }
>>     
>>     byte[] tmp = new byte[bufferSize];
>>     try
>>     {
>>       if(chunkedIns.read(tmp) != -1)
>>       {
> 
> What makes you think that the entire buffer will be filled with data?
> 
> Oleg 
> 
> 
>>         return new String(tmp);
>>       }
>>       else
>>       {
>>         finish();
>>         return null;
>>       }
>>     }
>>     catch(IOException e)
>>     {
>>       HTTPClientException e2 = new HTTPClientException(e.getMessage());
>>       e2.setStackTrace(e.getStackTrace());
>>       throw e2;
>>     }
>>   }
>>   
>>   public void finish()
>>   {
>>     // do some cleaning
>>   }
>> 
>>   private InputStream getInputStream(String url) throws HTTPClientException
>>   {
>>     InputStream instream = null;
>>     
>>     httpClient = new DefaultHttpClient();
>>     httpClient.getParams().setParameter("http.useragent", AGENT_NAME);
>>     
>>     HttpGet httpGet = new HttpGet(url);
>>     HttpResponse response = null;
>>     
>>     try
>>     {
>>       response = httpClient.execute(httpGet);
>>       HttpEntity entity = response.getEntity();
>>     
>>       if(entity != null) 
>>       {
>>         instream = entity.getContent();
>>       }
>>     }
>>     catch(ClientProtocolException e)
>>     {
>>       HTTPClientException e2 = new HTTPClientException(e.getMessage());
>>       e2.setStackTrace(e.getStackTrace());
>>       throw e2;
>>     }
>>     catch(IOException e)
>>     {
>>       HTTPClientException e2 = new HTTPClientException(e.getMessage());
>>       e2.setStackTrace(e.getStackTrace());
>>       throw e2;
>>     }
>>     
>>     return instream;
>>   }
>> 
>> getChunk and getInputStream could basically be one method, but I need to
>> split them for internal convenience; that does not change the
>> functionality as a whole.
>> 
>> It seems like either the conversion from bytes to string is a problem:
>> return new String(tmp);
>> 
>> or that the buffer is not getting filled to the end. The latter should
>> not be possible, because the files are ~30MB each and the buffer size is
>> 2KB.
>> 
>> I have attached the file; it's a CSV (shortened to ~6KB). Note the long
>> white space between some of the URLs; if you just remove it, the URL
>> makes sense.
>> http://old.nabble.com/file/p27350930/datafeed.csv datafeed.csv 
>> 
>> Where can this white space (null) come from?
>> 
>> Thanks!
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/HttpClient-4.0-encoding-madness-tp27350930p27366928.html
Sent from the HttpClient-User mailing list archive at Nabble.com.

