Re: Error trying to download a Gzip file.

2013-04-03 Thread sebb
On 2 April 2013 22:28, S.L simpleliving...@gmail.com wrote:

 Thanks Stephen, I completely agree with you; I am just accessing publicly
 available data. One thing I would like to mention, though, is that I get
 the following warning every time this happens:

 *Invalid cookie header: Set-Cookie: . Cookie name may not be empty*

 even though I have set a lenient cookie policy as illustrated in section
 3.5 here:

 http://hc.apache.org/httpcomponents-client-ga/tutorial/html/statemgmt.html#d4e777



Even a lenient cookie policy cannot allow a cookie with no name.

Sounds like the server is having problems processing your request and is
returning invalid HTTP headers as a result.

I suggest you capture the headers and check.
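
Something like this (untested sketch, using the same HttpClient 4.x API your
code already uses) would print the status line and every raw header the
server sends back:

    import org.apache.http.Header;
    import org.apache.http.HttpResponse;
    import org.apache.http.client.HttpClient;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.DefaultHttpClient;
    import org.apache.http.util.EntityUtils;

    public class HeaderDump {
        public static void main(String[] args) throws Exception {
            HttpClient httpClient = new DefaultHttpClient();
            HttpGet httpget = new HttpGet(
                    "http://www.walmart.com/navigation6.xml.gz");

            HttpResponse response = httpClient.execute(httpget);

            // Print the status line and the raw response headers, so a
            // malformed one (like the empty Set-Cookie) is easy to spot.
            System.out.println(response.getStatusLine());
            for (Header header : response.getAllHeaders()) {
                System.out.println(header.getName() + ": " + header.getValue());
            }

            // Consume the entity so the connection can be released.
            EntityUtils.consume(response.getEntity());
        }
    }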






Re: Error trying to download a Gzip file.

2013-04-02 Thread S.L
Stephen,

Thanks, I have now checked the reason phrase: it returns "Not Found" when I
get the non-gzip "page not found" error and "OK" when the gzip file downloads
successfully. As I said earlier, across multiple iterations the first request
almost always succeeds and any subsequent request can fail. Is there any way
to stick with the first successful server instance using HttpClient, or some
other way around this? Thanks for your help.


On Mon, Apr 1, 2013 at 11:48 PM, Stephen J. Butler stephen.but...@gmail.com
 wrote:

 I notice you aren't checking your status code. Error responses have a
 body/response entity too. Take a look at
 response.getStatusLine().getStatusCode() and
 response.getStatusLine().getReasonPhrase(). I bet the server is
 rate-limiting you in the instances where you're seeing non-gzipped content.





Re: Error trying to download a Gzip file.

2013-04-02 Thread Stephen J. Butler
This really doesn't sound like a problem with HttpClient, but rather an issue
with Walmart's servers. Maybe you're querying the server too often and
they're rate-limiting you. Or, if the same pattern always works in a browser,
they're sniffing your User-Agent and doing something different.
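
If you want to test the User-Agent theory, a quick sketch (the exact string
is arbitrary; any common browser string will do):

    // Sketch: send a browser-like User-Agent to see whether the
    // server treats the request differently.
    HttpGet httpget = new HttpGet("http://www.walmart.com/navigation6.xml.gz");
    httpget.setHeader("User-Agent",
            "Mozilla/5.0 (Windows NT 6.1; rv:19.0) Gecko/20100101 Firefox/19.0");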

In any event, if this is an approved API for Walmart.com, you need to
contact them. Or just cache the response yourself in the code and use a
cached value when you get a 404. There's nothing wrong with HttpClient.
It's doing exactly what the server told it to do.
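
A minimal sketch of that caching idea (downloadWithCache and cachedBytes are
made-up names, not part of any API):

    import java.io.IOException;
    import java.net.URL;
    import org.apache.http.HttpResponse;
    import org.apache.http.client.HttpClient;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.util.EntityUtils;

    public class CachingDownloader {
        // Last good copy of the payload, if any (hypothetical cache field).
        private static byte[] cachedBytes;

        public static byte[] downloadWithCache(HttpClient httpClient, URL url)
                throws IOException {
            HttpGet httpget = new HttpGet(url.toString());
            HttpResponse response = httpClient.execute(httpget);

            if (response.getStatusLine().getStatusCode() == 200) {
                // Fresh copy: remember it for later failures.
                cachedBytes = EntityUtils.toByteArray(response.getEntity());
            } else {
                // 404 (or any other error): discard the error body and
                // fall back to the last good copy.
                EntityUtils.consume(response.getEntity());
            }
            return cachedBytes;
        }
    }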



On Tue, Apr 2, 2013 at 3:12 PM, S.L simpleliving...@gmail.com wrote:

 Stephen,

 Thanks, I have now checked the reason phrase: it returns "Not Found" when I
 get the non-gzip "page not found" error and "OK" when the gzip file
 downloads successfully. As I said earlier, across multiple iterations the
 first request almost always succeeds and any subsequent request can fail.
 Is there any way to stick with the first successful server instance using
 HttpClient, or some other way around this? Thanks for your help.





Re: Error trying to download a Gzip file.

2013-04-02 Thread S.L
Thanks Stephen, I completely agree with you; I am just accessing publicly
available data. One thing I would like to mention, though, is that I get the
following warning every time this happens:

*Invalid cookie header: Set-Cookie: . Cookie name may not be empty*

even though I have set a lenient cookie policy as illustrated in section 3.5
here:

http://hc.apache.org/httpcomponents-client-ga/tutorial/html/statemgmt.html#d4e777
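
In case it helps, this is roughly how I set it (a sketch of the
global-parameter approach that tutorial section shows; my actual setup may
differ slightly):

    import org.apache.http.client.params.ClientPNames;
    import org.apache.http.client.params.CookiePolicy;
    import org.apache.http.impl.client.DefaultHttpClient;

    // Use the lenient (browser-compatibility) cookie policy for all
    // requests made through this client, as in section 3.5.
    DefaultHttpClient httpClient = new DefaultHttpClient();
    httpClient.getParams().setParameter(
            ClientPNames.COOKIE_POLICY, CookiePolicy.BROWSER_COMPATIBILITY);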






On Tue, Apr 2, 2013 at 4:19 PM, Stephen J. Butler
stephen.but...@gmail.com wrote:

 This really doesn't sound like a problem with HttpClient, but rather an
 issue with Walmart's servers. Maybe you're querying the server too often
 and they're rate-limiting you. Or, if the same pattern always works in a
 browser, they're sniffing your User-Agent and doing something different.

 In any event, if this is an approved API for Walmart.com, you need to
 contact them. Or just cache the response yourself in the code and use a
 cached value when you get a 404. There's nothing wrong with HttpClient.
 It's doing exactly what the server told it to do.





Re: Error trying to download a Gzip file.

2013-04-01 Thread Stephen J. Butler
I notice you aren't checking your status code. Error responses have a
body/response entity too. Take a look at
response.getStatusLine().getStatusCode() and
response.getStatusLine().getReasonPhrase(). I bet the server is rate-limiting
you in the instances where you're seeing non-gzipped content.
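
Something along these lines (a sketch; downloadIfOk is a made-up name, not an
HttpClient API) would refuse to treat an error page as the gzip payload:

    import java.io.IOException;
    import java.net.URL;
    import org.apache.http.HttpResponse;
    import org.apache.http.HttpStatus;
    import org.apache.http.client.HttpClient;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.util.EntityUtils;

    // Sketch: only return the body when the server says 200 OK.
    public static byte[] downloadIfOk(HttpClient httpClient, URL url)
            throws IOException {
        HttpGet httpget = new HttpGet(url.toString());
        HttpResponse response = httpClient.execute(httpget);

        int status = response.getStatusLine().getStatusCode();
        if (status != HttpStatus.SC_OK) {
            System.err.println("Server replied " + status + " "
                    + response.getStatusLine().getReasonPhrase());
            EntityUtils.consume(response.getEntity()); // discard error body
            return null;
        }
        return EntityUtils.toByteArray(response.getEntity());
    }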


On Mon, Apr 1, 2013 at 10:42 PM, S.L simpleliving...@gmail.com wrote:

 Hi All,

 I use the following code and run the method multiple times. A few times I
 get a gzip response, which is what I expect, and a few times I get a
 response that is completely different (non-gzip, HTML format). However, if
 I download the same URL multiple times using Mozilla or IE, I consistently
 get the same gzip response. Is this an error with the server I am trying
 to reach, or do I need to set parameters to get a consistent response?

 Thanks.

 The URL I am trying to download is
 http://www.walmart.com/navigation6.xml.gz. Can you please let me know?
 Thanks




     public static byte[] downloadURL(URL urlToDownload) {

         InputStream iStream = null;
         byte[] urlBytes = null;

         try {
             HttpClient httpClient = new DefaultHttpClient();
             HttpGet httpget = new HttpGet(urlToDownload.toString());

             HttpResponse response = httpClient.execute(httpget);

             // Read the whole response body into memory.
             iStream = response.getEntity().getContent();
             urlBytes = IOUtils.toByteArray(iStream);

             String responseString = new String(urlBytes);
             System.out.println("The response string for "
                     + urlToDownload.toString() + " is " + responseString);

         } catch (IOException e) {
             System.err.printf("Failed while reading bytes from %s: %s",
                     urlToDownload.toExternalForm(), e.getMessage());
             e.printStackTrace();
             // Perform any other exception handling that's appropriate.
         } finally {
             if (iStream != null) {
                 try {
                     iStream.close();
                 } catch (IOException e) {
                     e.printStackTrace();
                 }
             }
         }

         return urlBytes;
     }