Re: Error trying to download a Gzip file.
On 2 April 2013 22:28, S.L simpleliving...@gmail.com wrote: I get the following warning every time this happens: *Invalid cookie header: Set-Cookie: . Cookie name may not be empty*, even though I have set a lenient cookie policy as illustrated in section 3.5 of the tutorial.

Even a lenient cookie policy cannot allow a cookie with no name. It sounds like the server is having problems processing your request and is returning invalid HTTP headers as a result. I suggest you capture the headers and check them.
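For example, a minimal way to capture them might look like this (a sketch against the HttpClient 4.x API already used in this thread; the dumpHeaders name is just for illustration):

    import org.apache.http.Header;
    import org.apache.http.HttpResponse;
    import org.apache.http.client.HttpClient;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.DefaultHttpClient;
    import org.apache.http.util.EntityUtils;

    // Print the status line and every response header, so a malformed
    // Set-Cookie (or anything else unusual) becomes visible.
    public static void dumpHeaders(String url) throws Exception {
        HttpClient httpClient = new DefaultHttpClient();
        HttpResponse response = httpClient.execute(new HttpGet(url));
        System.out.println(response.getStatusLine());
        for (Header header : response.getAllHeaders()) {
            System.out.println(header.getName() + ": " + header.getValue());
        }
        EntityUtils.consume(response.getEntity()); // release the connection
    }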
Re: Error trying to download a Gzip file.
Stephen, thanks. I have now checked the reason phrase: it returns "Not Found" when I get the non-gzip page-not-found error, and "OK" when I successfully download the gzip file. As I said earlier, across multiple iterations the first request succeeds almost all the time, and any subsequent request after that can fail. Is there any way I can stick to the first successful server instance using HttpClient, or any other way around this? Thanks for your help.
Re: Error trying to download a Gzip file.
This really doesn't sound like a problem with HttpClient, but rather an issue with Walmart's servers. Maybe you're querying the server too often and they're rate-limiting you. Or, if the same pattern always works in a browser, they're sniffing your User-Agent and doing something different. In any event, if this is an approved API for Walmart.com, you need to contact them. Or just cache the response yourself in the code and use the cached value when you get a 404. There's nothing wrong with HttpClient; it's doing exactly what the server told it to do.
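A rough sketch of that fallback caching (the lastGoodBytes field and downloadWithFallback name are illustrative, not from this thread; assumes the HttpClient 4.x and commons-io classes already in use here):

    import java.io.IOException;
    import java.net.URL;
    import org.apache.commons.io.IOUtils;
    import org.apache.http.HttpResponse;
    import org.apache.http.HttpStatus;
    import org.apache.http.client.HttpClient;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.DefaultHttpClient;

    // Remember the last good payload and fall back to it whenever the
    // server answers with anything other than 200 OK.
    private static byte[] lastGoodBytes;

    public static byte[] downloadWithFallback(URL url) throws IOException {
        HttpClient httpClient = new DefaultHttpClient();
        HttpResponse response = httpClient.execute(new HttpGet(url.toString()));
        byte[] body = IOUtils.toByteArray(response.getEntity().getContent());
        if (response.getStatusLine().getStatusCode() == HttpStatus.SC_OK) {
            lastGoodBytes = body;  // cache the good copy
            return body;
        }
        return lastGoodBytes;      // serve the cached copy on 404 etc.
    }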
Re: Error trying to download a Gzip file.
Thanks Stephen, I completely agree with you; I am just accessing publicly available data. One thing I would like to mention, though, is that I get the following warning every time this happens: *Invalid cookie header: Set-Cookie: . Cookie name may not be empty*, even though I have set a lenient cookie policy as illustrated in section 3.5 of http://hc.apache.org/httpcomponents-client-ga/tutorial/html/statemgmt.html#d4e777
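For reference, the lenient policy was set roughly like this (a sketch of the HttpClient 4.x parameter API from that tutorial section; note that even the browser-compatibility spec rejects a cookie with an empty name):

    import org.apache.http.client.params.ClientPNames;
    import org.apache.http.client.params.CookiePolicy;
    import org.apache.http.impl.client.DefaultHttpClient;

    DefaultHttpClient httpClient = new DefaultHttpClient();
    // Apply the lenient, browser-compatible cookie spec to every request
    // made through this client.
    httpClient.getParams().setParameter(
            ClientPNames.COOKIE_POLICY, CookiePolicy.BROWSER_COMPATIBILITY);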
Re: Error trying to download a Gzip file.
I notice you aren't checking your status code. Error responses have a body/response entity too. Take a look at response.getStatusLine().getStatusCode() and getReasonPhrase(). I bet the server is limiting you in the instances where you're seeing non-gzipped content.

On Mon, Apr 1, 2013 at 10:42 PM, S.L simpleliving...@gmail.com wrote: Hi All, I use the following code and run the method multiple times. A few times I get a response in gzip, which is what I expect, and a few times I get a response that is completely different (non-gzip, HTML format). However, if I download the same URL multiple times using Mozilla or IE, I consistently get the same gzip response. Is this an error with the server I am trying to reach, or do I need to set parameters to get a consistent response? The URL I am trying to download is http://www.walmart.com/navigation6.xml.gz. Can you please let me know? Thanks.

    public static byte[] dowloadURL(URL urlToDownload) {
        InputStream iStream = null;
        byte[] urlBytes = null;
        try {
            // HttpClient httpClient = new HttpClient();
            org.apache.http.client.HttpClient httpClient = new DefaultHttpClient();
            HttpGet httpget = new HttpGet(urlToDownload.toString());
            HttpResponse response = httpClient.execute(httpget);
            iStream = response.getEntity().getContent();
            urlBytes = IOUtils.toByteArray(iStream);
            String responseString = new String(urlBytes);
            System.out.println("The response string for " + urlToDownload.toString()
                    + " is " + responseString);
        } catch (IOException e) {
            System.err.printf("Failed while reading bytes from %s: %s%n",
                    urlToDownload.toExternalForm(), e.getMessage());
            e.printStackTrace();
            // Perform any other exception handling that's appropriate.
        } finally {
            if (iStream != null) {
                try {
                    iStream.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
        return urlBytes;
    }
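For instance, a guard along these lines, dropped into dowloadURL right after the execute() call, would surface the intermittent 404s (a sketch; HttpStatus and EntityUtils are the stock org.apache.http.HttpStatus and org.apache.http.util.EntityUtils helpers from HttpClient 4.x):

    int status = response.getStatusLine().getStatusCode();
    if (status != HttpStatus.SC_OK) {
        // On failure the entity is the HTML error page, not the gzip payload.
        System.err.println("Request failed: " + status + " "
                + response.getStatusLine().getReasonPhrase());
        EntityUtils.consume(response.getEntity()); // discard the error body
        return null;
    }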