Re: [R] How to download this data

2015-08-25 Thread Bert Gunter
This is not a simple question. The data are in an html-formatted web
page. You must scrape the html for the data and read it into an R
table (or other appropriate R data structure). Searching (the web) on
"scrape data from html into R" listed several packages that claim to
enable you to do this easily. Choose what seems best for you.

You should also install and read the documentation for the XML
package, which is also used for this purpose, though those you find
above may be slicker.

Disclaimer: I have no direct experience with this. I'm just pointing
out what I believe are relevant resources.
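
By way of illustration only, a minimal and untested sketch of that route using the XML package might look like the following (whether the figures actually sit in static HTML tables on this page is an open question; see the replies below):

# Untested sketch: assumes the data are in ordinary HTML <table> elements and
# that the server answers plain requests (it may not; see the replies below).
library(XML)

url  <- "http://www.nseindia.com/live_market/dynaContent/live_watch/equities_stock_watch.htm"
doc  <- htmlParse(url)                                # fetch and parse the page
tbls <- readHTMLTable(doc, stringsAsFactors = FALSE)  # one data.frame per <table> found
length(tbls)                                          # see what, if anything, came back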

Cheers,
Bert
Bert Gunter

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
   -- Clifford Stoll


On Tue, Aug 25, 2015 at 11:10 AM, Christofer Bogaso
bogaso.christo...@gmail.com wrote:
 Hi,

 I would like to download data from below page directly onto R.

 http://www.nseindia.com/live_market/dynaContent/live_watch/equities_stock_watch.htm

 Could you please assist me how can I do that programmatically.

 Thanks for your time.

 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to download this data

2015-08-25 Thread boB Rudis
Looks like you can get what you need from
http://www.nseindia.com/homepage/Indices1.json on that page.
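
A hedged sketch of reading that feed with jsonlite (the structure of what comes back is not assumed here, and the server may still insist on browser-like request headers):

library(jsonlite)

x <- fromJSON("http://www.nseindia.com/homepage/Indices1.json")
str(x, max.level = 2)   # inspect the structure before trying to tabulate it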

On Tue, Aug 25, 2015 at 2:23 PM, Bert Gunter bgunter.4...@gmail.com wrote:
 This is not a simple question. The data are in an html-formatted web
 page. You must scrape the html for the data and read it into an R
 table (or other appropriate R data structure). Searching (the web) on
 "scrape data from html into R" listed several packages that claim to
 enable you to do this easily. Choose what seems best for you.

 You should also install and read the documentation for the XML
 package, which is also used for this purpose, though those you find
 above may be slicker.

 Disclaimer: I have no direct experience with this. I'm just pointing
 out what I believe are relevant resources.

 Cheers,
 Bert
 Bert Gunter

 Data is not information. Information is not knowledge. And knowledge
 is certainly not wisdom.
-- Clifford Stoll


 On Tue, Aug 25, 2015 at 11:10 AM, Christofer Bogaso
 bogaso.christo...@gmail.com wrote:
 Hi,

 I would like to download data from below page directly onto R.

 http://www.nseindia.com/live_market/dynaContent/live_watch/equities_stock_watch.htm

 Could you please assist me how can I do that programmatically.

 Thanks for your time.

 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to download this data

2015-08-25 Thread Hasan Diwan
If there's no API available, I would use Selenium to grab what I need and
pipe it to R. Let me know if you need further assistance. Cheers! -- H
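
A rough sketch of that route with the RSelenium package, assuming a Selenium server is already running on localhost:4444 (the port and browser below are assumptions, not requirements):

# Assumes a Selenium server is listening on localhost:4444; adjust as needed.
library(RSelenium)
library(XML)

remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4444L, browserName = "firefox")
remDr$open()
remDr$navigate("http://www.nseindia.com/live_market/dynaContent/live_watch/equities_stock_watch.htm")
src  <- remDr$getPageSource()[[1]]                    # HTML after the page's JavaScript has run
tbls <- readHTMLTable(htmlParse(src, asText = TRUE))  # then parse it like any static page
remDr$close()
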
On Aug 25, 2015 11:12 AM, Christofer Bogaso bogaso.christo...@gmail.com
wrote:

 Hi,

 I would like to download data from below page directly onto R.


 http://www.nseindia.com/live_market/dynaContent/live_watch/equities_stock_watch.htm

 Could you please assist me how can I do that programmatically.

 Thanks for your time.

 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to download this data

2015-08-25 Thread Christofer Bogaso
Hi,

 I would like to download data from the page below directly into R.

http://www.nseindia.com/live_market/dynaContent/live_watch/equities_stock_watch.htm

 Could you please assist me with how I can do that programmatically?

Thanks for your time.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to download this data

2015-08-25 Thread Jeff Newmiller
I agree that this is a tricky task... even more so than just using a scraping
package, because the page is built dynamically. This will take someone with
skills in multiple web technologies to decipher the web page scripts to figure 
out how to manipulate the server to give you the data, because it isn't 
actually in the web page.
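
One hedged way to act on that advice is to watch the page's network traffic in a browser's developer tools, find the request that actually returns the data, and replay it from R. The endpoint and headers below are illustrative assumptions, not a documented API:

# Illustrative only: the endpoint is the one boB Rudis mentions elsewhere in
# this thread, and the headers are guesses at what the server may expect.
library(httr)

resp <- GET("http://www.nseindia.com/homepage/Indices1.json",
            add_headers("User-Agent" = "Mozilla/5.0",
                        "Referer"    = "http://www.nseindia.com/"))
stop_for_status(resp)
txt <- content(resp, as = "text")   # raw JSON text, to be parsed separately
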
---
Jeff Newmiller                        DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

On August 25, 2015 11:23:26 AM PDT, Bert Gunter bgunter.4...@gmail.com wrote:
This is not a simple question. The data are in an html-formatted web
page. You must scrape the html for the data and read it into an R
table (or other appropriate R data structure). Searching (the web) on
"scrape data from html into R" listed several packages that claim to
enable you to do this easily. Choose what seems best for you.

You should also install and read the documentation for the XML
package, which is also used for this purpose, though those you find
above may be slicker.

Disclaimer: I have no direct experience with this. I'm just pointing
out what I believe are relevant resources.

Cheers,
Bert
Bert Gunter

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
   -- Clifford Stoll


On Tue, Aug 25, 2015 at 11:10 AM, Christofer Bogaso
bogaso.christo...@gmail.com wrote:
 Hi,

 I would like to download data from below page directly onto R.


http://www.nseindia.com/live_market/dynaContent/live_watch/equities_stock_watch.htm

 Could you please assist me how can I do that programmatically.

 Thanks for your time.

 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to download this data

2015-08-25 Thread Bert Gunter
Actually, in looking again, I noticed a "download in csv" link on the
page, and this appears to provide a csv-formatted table that can then
be read trivially into R by, e.g., read.csv().

So maybe all the html (or JSON) stuff can be ignored.
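
A sketch of that route; the URL below is a placeholder for whatever the page's CSV link actually points at:

csv_url <- "http://www.nseindia.com/path/to/stock_watch.csv"   # hypothetical; copy the real link target
tmp <- tempfile(fileext = ".csv")
download.file(csv_url, tmp, mode = "wb")    # may need browser-like headers, as noted elsewhere in the thread
dat <- read.csv(tmp, stringsAsFactors = FALSE)
head(dat)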


-- Bert
Bert Gunter

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
   -- Clifford Stoll


On Tue, Aug 25, 2015 at 11:59 AM, Jeff Newmiller
jdnew...@dcn.davis.ca.us wrote:
 I agree that this is a tricky task... even more so than just using a scraping
 package because the page is built dynamically. This will take someone with 
 skills in multiple web technologies to decipher the web page scripts to 
 figure out how to manipulate the server to give you the data, because it 
 isn't actually in the web page.
 ---
 Jeff Newmiller                        DCN: jdnew...@dcn.davis.ca.us
 Research Engineer (Solar/Batteries/Software/Embedded Controllers)
 ---
 Sent from my phone. Please excuse my brevity.

 On August 25, 2015 11:23:26 AM PDT, Bert Gunter bgunter.4...@gmail.com 
 wrote:
This is not a simple question. The data are in an html-formatted web
page. You must scrape the html for the data and read it into an R
table (or other appropriate R data structure). Searching (the web) on
"scrape data from html into R" listed several packages that claim to
enable you to do this easily. Choose what seems best for you.

You should also install and read the documentation for the XML
package, which is also used for this purpose, though those you find
above may be slicker.

Disclaimer: I have no direct experience with this. I'm just pointing
out what I believe are relevant resources.

Cheers,
Bert
Bert Gunter

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
   -- Clifford Stoll


On Tue, Aug 25, 2015 at 11:10 AM, Christofer Bogaso
bogaso.christo...@gmail.com wrote:
 Hi,

 I would like to download data from below page directly onto R.


http://www.nseindia.com/live_market/dynaContent/live_watch/equities_stock_watch.htm

 Could you please assist me how can I do that programmatically.

 Thanks for your time.

 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to download this data

2015-08-25 Thread Rui Barradas

Hello,

There might be a problem:

> url <- "http://www.nseindia.com/live_market/dynaContent/live_watch/equities_stock_watch.htm"
> readLines(url)
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") : cannot open: HTTP status was '403 Forbidden'


So I've downloaded the csv file with the data, but that's not a
programmatic solution.
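
The 403 often just reflects the server rejecting R's default user agent; a hedged retry (no guarantee the site permits it, and note the terms-of-use caveat later in the thread) could look like:

library(httr)
resp <- GET(url, user_agent("Mozilla/5.0"))   # reuses the url object defined above
status_code(resp)                             # hoping for 200 rather than 403
html <- content(resp, as = "text")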


Rui Barradas

On 25-08-2015 19:10, Christofer Bogaso wrote:

Hi,

I would like to download data from below page directly onto R.

http://www.nseindia.com/live_market/dynaContent/live_watch/equities_stock_watch.htm

Could you please assist me how can I do that programmatically.

Thanks for your time.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to download this data

2015-08-25 Thread John McKown
FWIW. This violates their terms of service, unless you have their
permission:

http://www.nseindia.com/global/content/termsofuse.htm

"You may not conduct any systematic or automated data collection activities
(including scraping, data mining, data extraction and data harvesting) on
or in relation to our website without our express written consent."

On Tue, Aug 25, 2015 at 1:10 PM, Christofer Bogaso 
bogaso.christo...@gmail.com wrote:

 Hi,

 I would like to download data from below page directly onto R.


 http://www.nseindia.com/live_market/dynaContent/live_watch/equities_stock_watch.htm

 Could you please assist me how can I do that programmatically.

 Thanks for your time.

 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 

Schrodinger's backup: The condition of any backup is unknown until a
restore is attempted.

Yoda of Borg, we are. Futile, resistance is, yes. Assimilated, you will be.

He's about as useful as a wax frying pan.

10 to the 12th power microphones = 1 Megaphone

Maranatha! 
John McKown

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to download this data

2015-08-25 Thread Jeff Newmiller
... but not programmatically.
---
Jeff Newmiller                        DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

On August 25, 2015 12:11:15 PM PDT, Bert Gunter bgunter.4...@gmail.com wrote:
Actually, in looking again, I noticed a "download in csv" link on the
page, and this appears to provide a csv-formatted table that can then
be read trivially into R by, e.g., read.csv().

So maybe all the html (or JSON) stuff can be ignored.


-- Bert
Bert Gunter

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
   -- Clifford Stoll


On Tue, Aug 25, 2015 at 11:59 AM, Jeff Newmiller
jdnew...@dcn.davis.ca.us wrote:
 I agree that this is a tricky task... even more so than just using
a scraping package because the page is built dynamically. This will
take someone with skills in multiple web technologies to decipher the
web page scripts to figure out how to manipulate the server to give you
the data, because it isn't actually in the web page.

---
 Jeff Newmiller                        DCN: jdnew...@dcn.davis.ca.us
 Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
 Sent from my phone. Please excuse my brevity.

 On August 25, 2015 11:23:26 AM PDT, Bert Gunter
bgunter.4...@gmail.com wrote:
This is not a simple question. The data are in an html-formatted web
page. You must scrape the html for the data and read it into an R
table (or other appropriate R data structure). Searching (the web) on
"scrape data from html into R" listed several packages that claim to
enable you to do this easily. Choose what seems best for you.

You should also install and read the documentation for the XML
package, which is also used for this purpose, though those you find
above may be slicker.

Disclaimer: I have no direct experience with this. I'm just pointing
out what I believe are relevant resources.

Cheers,
Bert
Bert Gunter

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
   -- Clifford Stoll


On Tue, Aug 25, 2015 at 11:10 AM, Christofer Bogaso
bogaso.christo...@gmail.com wrote:
 Hi,

 I would like to download data from below page directly onto R.


http://www.nseindia.com/live_market/dynaContent/live_watch/equities_stock_watch.htm

 Could you please assist me how can I do that programmatically.

 Thanks for your time.

 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] HOW TO DOWNLOAD INTRADAY DATA AT ONE TIME

2015-01-15 Thread Amatoallah Ouchen
Is there any way, via R, to download all available intraday data for stocks
at once (for example, all the data available on the Indian stock exchange)?
I need to make a comparative analysis, and downloading the data ticker by
ticker is too time consuming. I would also like to know whether there is any
website that stores historical intraday data; other sites delete the data gradually.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to download this data?

2013-08-03 Thread Ron Michael
Hello Duncan,
 
Thank you very much for your pointer.
 
However, when I tried to run your code, I got the following error:
  rawOrig = getURLContent("https://www.theice.com/productguide/ProductSpec.shtml?specId=219#expiry")
 
Error in function (type, msg, asError = TRUE)  : 
  SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify 
failed

Can someone help me to understand what could be the cause of this error?
 
Thank you.


- Original Message -
From: Duncan Temple Lang dtemplel...@ucdavis.edu
To: r-help@r-project.org
Cc: 
Sent: Saturday, 3 August 2013 4:33 AM
Subject: Re: [R] How to download this data?


That URL is an HTTPS (secure HTTP), not an HTTP.
The XML parser cannot retrieve the file.
Instead, use the RCurl package to get the file.

However, it is more complicated than that. If
you look at source of the HTML page in a browser,
you'll see a jsessionid and that is a session identifier.

The following retrieves the content of your URL and then
parses it and extracts the value of the jsessionid.
Then we create the full URL to the actual data page (which is actually in the 
HTML
content but in JavaScript code)

library(RCurl)
library(XML)

rawOrig = getURLContent("https://www.theice.com/productguide/ProductSpec.shtml?specId=219#expiry")
rawDoc = htmlParse(rawOrig)
tmp = getNodeSet(rawDoc, "//@href[contains(., 'jsessionid=')]")[[1]]
jsession = gsub(".*jsessionid=([^?]+)?.*", "\\1", tmp)

u = sprintf("https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=%s?expiryDates=specId=219", jsession)

doc = htmlParse(getURLContent(u))
tbls = readHTMLTable(doc)
data = tbls[[1]]

dim(data)


I did this quickly so it may not be the best way or completely robust, but 
hopefully
it gets the point across and does get the data.

  D.

On 8/2/13 2:42 PM, Ron Michael wrote:
 Hi all,
  
 I need to download the data from this web page:
  
 https://www.theice.com/productguide/ProductSpec.shtml?specId=219#expiry
  
 I used the function readHTMLTable() from package XML, however could not 
 download that.
  
 Can somebody help me how to get the data onto my R window?
  
 Thank you.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to download this data?

2013-08-03 Thread Ron Michael
In the meantime I have sorted this problem out; hopefully I did it correctly.
I have modified the line of your code as:
 
rawOrig = getURLContent("https://www.theice.com/productguide/ProductSpec.shtml?specId=219#expiry",
 ssl.verifypeer = FALSE)
 
However, I next faced another problem when executing:
  u = sprintf(a 
href=https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=%s?expiryDates=specId=219;https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=%s?expiryDates=specId=219;,
 jsession) 
Error: unexpected symbol in u = sprintf(a href=https

Can you or someone else help me to get out of this error?
 
Also, another question: from where did you get the expression:
a 
href=https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=%s?expiryDates=specId=219;https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=%s?expiryDates=specId=219;
 
I would really appreciate it if someone could help me understand that.
 
Thank you.


- Original Message -
From: Ron Michael ron_michae...@yahoo.com
To: Duncan Temple Lang dtemplel...@ucdavis.edu; r-help@r-project.org 
r-help@r-project.org
Cc: 
Sent: Saturday, 3 August 2013 12:58 PM
Subject: Re: [R] How to download this data?

Hello Duncan,
 
Thank you very much for your pointer.
 
However when I tried to run your code, I got following error:
  rawOrig = getURLContent("https://www.theice.com/productguide/ProductSpec.shtml?specId=219#expiry")
 
Error in function (type, msg, asError = TRUE)  : 
  SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify 
failed

Can someone help me to understand what could be the cause of this error?
 
Thank you.


- Original Message -
From: Duncan Temple Lang dtemplel...@ucdavis.edu
To: r-help@r-project.org
Cc: 
Sent: Saturday, 3 August 2013 4:33 AM
Subject: Re: [R] How to download this data?


That URL is an HTTPS (secure HTTP), not an HTTP.
The XML parser cannot retrieve the file.
Instead, use the RCurl package to get the file.

However, it is more complicated than that. If
you look at source of the HTML page in a browser,
you'll see a jsessionid and that is a session identifier.

The following retrieves the content of your URL and then
parses it and extracts the value of the jsessionid.
Then we create the full URL to the actual data page (which is actually in the 
HTML
content but in JavaScript code)

library(RCurl)
library(XML)

rawOrig = getURLContent("https://www.theice.com/productguide/ProductSpec.shtml?specId=219#expiry")
rawDoc = htmlParse(rawOrig)
tmp = getNodeSet(rawDoc, "//@href[contains(., 'jsessionid=')]")[[1]]
jsession = gsub(".*jsessionid=([^?]+)?.*", "\\1", tmp)

u = sprintf("https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=%s?expiryDates=specId=219", jsession)

doc = htmlParse(getURLContent(u))
tbls = readHTMLTable(doc)
data = tbls[[1]]

dim(data)


I did this quickly so it may not be the best way or completely robust, but 
hopefully
it gets the point across and does get the data.

  D.

On 8/2/13 2:42 PM, Ron Michael wrote:
 Hi all,
  
 I need to download the data from this web page:
  
 https://www.theice.com/productguide/ProductSpec.shtml?specId=219#expiry
  
 I used the function readHTMLTable() from package XML, however could not 
 download that.
  
 Can somebody help me how to get the data onto my R window?
  
 Thank you.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to download this data?

2013-08-03 Thread Duncan Temple Lang
Hi Ron

  Yes, you can use ssl.verifypeer = FALSE.  Or, alternatively, you can also use

   getURLContent(, cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"))

 to specify where libcurl can find the certificates to verify the SSL signature.


 The error you are encountering appears to be coming from a garbled R
expression. This may have arisen as a result of an HTML mailer adding
"a href=" markup into the expression where it found an https:// URL.

 What we want to do is end up with a string of the form

   
https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=adasdasdad?expiryData=specId=219

We have to substitute the text adasdasdad which  we assigned to jsession in a 
previous command.
So, take the literal text

   c("https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=",
     jsession,
     "?expiryData=specId=219")

and combine it into a single string with paste0.

We need the literal strings as they appear when you view the mail for R to make 
sense of them, not what the mailer adds.
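
Spelled out, assuming jsession already holds the id extracted earlier, that is simply:

u <- paste0("https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=",
            jsession,
            "?expiryData=specId=219")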


As to where I found this, it is in the source of the original HTML page in 
rawDoc

 scripts = getNodeSet(rawDoc, "//body//script")
 scripts[[ length(scripts) ]]

and look at the text, specifically the app.urls and its 'expiry' field.


<script type="text/javascript"><![CDATA[

var app = {};

app.isOption = false;

app.urls = {


'spec':'/productguide/ProductSpec.shtml;jsessionid=22E9BE9DB19FC6F3446C9ED4AFF2BE3F?details=specId=219',


'data':'/productguide/ProductSpec.shtml;jsessionid=22E9BE9DB19FC6F3446C9ED4AFF2BE3F?data=specId=219',


'confirm':'/reports/dealreports/getSampleConfirm.do;jsessionid=22E9BE9DB19FC6F3446C9ED4AFF2BE3F?hubId=403productId=254',


'reports':'/productguide/ProductSpec.shtml;jsessionid=22E9BE9DB19FC6F3446C9ED4AFF2BE3F?reports=specId=219',


'expiry':'/productguide/ProductSpec.shtml;jsessionid=22E9BE9DB19FC6F3446C9ED4AFF2BE3F?expiryDates=specId=219'

};

app.Router = Backbone.Router.extend({

routes:{

spec:spec,

data:data,

confirm:confirm,


On 8/3/13 1:05 AM, Ron Michael wrote:
 In the mean time I have this problem sorted out, hopefully I did it 
 correctly. I have modified the line of your code as:
  
 rawOrig = getURLContent("https://www.theice.com/productguide/ProductSpec.shtml?specId=219#expiry",
  ssl.verifypeer = FALSE)
  
 However next I faced with another problem to executing:
   u = sprintf(a 
 href=https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=%s?expiryDates=specId=219;https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=%s?expiryDates=specId=219;,
  jsession) 
 Error: unexpected symbol in u = sprintf(a href=https
 
 Can you or someone else help me to get out of this error?
  
 Also, my another question is: from where you got the expression:
 a 
 href=https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=%s?expiryDates=specId=219;https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=%s?expiryDates=specId=219;
  
 I really appreciate if someone help me to understand that.
  
 Thank you.
 
 
 - Original Message -
 From: Ron Michael ron_michae...@yahoo.com
 To: Duncan Temple Lang dtemplel...@ucdavis.edu; r-help@r-project.org 
 r-help@r-project.org
 Cc: 
 Sent: Saturday, 3 August 2013 12:58 PM
 Subject: Re: [R] How to download this data?
 
 Hello Duncan,
  
 Thank you very much for your pointer.
  
 However when I tried to run your code, I got following error:
   rawOrig = getURLContent("https://www.theice.com/productguide/ProductSpec.shtml?specId=219#expiry")
  
 Error in function (type, msg, asError = TRUE)  : 
   SSL certificate problem, verify that the CA cert is OK. Details:
 error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify 
 failed
 
 Can someone help me to understand what could be the cause of this error?
  
 Thank you.
 
 
 - Original Message -
 From: Duncan Temple Lang dtemplel...@ucdavis.edu
 To: r-help@r-project.org
 Cc: 
 Sent: Saturday, 3 August 2013 4:33 AM
 Subject: Re: [R] How to download this data?
 
 
 That URL is an HTTPS (secure HTTP), not an HTTP.
 The XML parser cannot retrieve the file.
 Instead, use the RCurl package to get the file.
 
 However, it is more complicated than that. If
 you look at source of the HTML page in a browser,
 you'll see a jsessionid and that is a session identifier.
 
 The following retrieves the content of your URL and then
 parses it and extracts the value of the jsessionid.
 Then we create the full URL to the actual data page (which is actually in the 
 HTML
 content but in JavaScript code)
 
 library(RCurl)
 library(XML)
 
 rawOrig = getURLContent("https://www.theice.com/productguide/ProductSpec.shtml?specId=219#expiry")
 rawDoc = htmlParse(rawOrig)
 tmp = getNodeSet(rawDoc, "//@href[contains(., 'jsessionid=')]")[[1]]
 jsession = gsub(".*jsessionid=([^?]+)?.*", "\\1

Re: [R] How to download this data?

2013-08-03 Thread Ron Michael
Hi Duncan,

Thank you very much for your prompt help. Everything now works very smoothly.

Thank you.


- Original Message -
From: Duncan Temple Lang dtemplel...@ucdavis.edu
To: Ron Michael ron_michae...@yahoo.com
Cc: r-help@r-project.org r-help@r-project.org
Sent: Saturday, 3 August 2013 7:43 PM
Subject: Re: [R] How to download this data?

Hi Ron

  Yes, you can use ssl.verifypeer = FALSE.  Or alternatively, you can use also 
use

   getURLContent(, cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"))

to specify where libcurl can find the certificates to verify the SSL signature.


The error you are encountering appears to becoming from a garbled R expression. 
This may have
arisen as a result of an HTML mailer adding the a href= into the 
expression
where it found an https://...

What we want to do is end up with a string of the form

  
https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=adasdasdad?expiryData=specId=219

We have to substitute the text adasdasdad which  we assigned to jsession in a 
previous command.
So, take the literal text

   c("https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=",
     jsession,
     "?expiryData=specId=219")

and combine it into a single string with paste0.

We need the literal strings as they appear when you view the mail for R to make 
sense of them, not what the mailer adds.


As to where I found this, it is in the source of the original HTML page in 
rawDoc

scripts = getNodeSet(rawDoc, "//body//script")
scripts[[ length(scripts) ]]

and look at the text, specifically the app.urls and its 'expiry' field.


<script type="text/javascript"><![CDATA[

        var app = {};

        app.isOption = false;

        app.urls = {

            
'spec':'/productguide/ProductSpec.shtml;jsessionid=22E9BE9DB19FC6F3446C9ED4AFF2BE3F?details=specId=219',

            
'data':'/productguide/ProductSpec.shtml;jsessionid=22E9BE9DB19FC6F3446C9ED4AFF2BE3F?data=specId=219',


'confirm':'/reports/dealreports/getSampleConfirm.do;jsessionid=22E9BE9DB19FC6F3446C9ED4AFF2BE3F?hubId=403productId=254',

            
'reports':'/productguide/ProductSpec.shtml;jsessionid=22E9BE9DB19FC6F3446C9ED4AFF2BE3F?reports=specId=219',


'expiry':'/productguide/ProductSpec.shtml;jsessionid=22E9BE9DB19FC6F3446C9ED4AFF2BE3F?expiryDates=specId=219'

        };

        app.Router = Backbone.Router.extend({

            routes:{

                spec:spec,

                data:data,

                confirm:confirm,


On 8/3/13 1:05 AM, Ron Michael wrote:
 In the mean time I have this problem sorted out, hopefully I did it 
 correctly. I have modified the line of your code as:
  
 rawOrig = getURLContent("https://www.theice.com/productguide/ProductSpec.shtml?specId=219#expiry",
  ssl.verifypeer = FALSE)
  
 However next I faced with another problem to executing:
   u = sprintf(a 
href=https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=%s?expiryDates=specId=219;https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=%s?expiryDates=specId=219;,
 jsession) 
 Error: unexpected symbol in u = sprintf(a href=https
 
 Can you or someone else help me to get out of this error?
  
 Also, my another question is: from where you got the expression:
 a 
 href=https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=%s?expiryDates=specId=219;https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=%s?expiryDates=specId=219;
  
 I really appreciate if someone help me to understand that.
  
 Thank you.
 
 
 - Original Message -
 From: Ron Michael ron_michae...@yahoo.com
 To: Duncan Temple Lang dtemplel...@ucdavis.edu; r-help@r-project.org 
 r-help@r-project.org
 Cc: 
 Sent: Saturday, 3 August 2013 12:58 PM
 Subject: Re: [R] How to download this data?
 
 Hello Duncan,
  
 Thank you very much for your pointer.
  
 However when I tried to run your code, I got following error:
   rawOrig = getURLContent("https://www.theice.com/productguide/ProductSpec.shtml?specId=219#expiry")
 
 Error in function (type, msg, asError = TRUE)  : 
  SSL certificate problem, verify that the CA cert is OK. Details:
 error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify 
 failed
 
 Can someone help me to understand what could be the cause of this error?
  
 Thank you.
 
 
 - Original Message -
 From: Duncan Temple Lang dtemplel...@ucdavis.edu
 To: r-help@r-project.org
 Cc: 
 Sent: Saturday, 3 August 2013 4:33 AM
 Subject: Re: [R] How to download this data?
 
 
 That URL is an HTTPS (secure HTTP), not an HTTP.
 The XML parser cannot retrieve the file.
 Instead, use the RCurl package to get the file.
 
 However, it is more complicated than that. If
 you look at source of the HTML page in a browser,
 you'll see a jsessionid and that is a session identifier.
 
 The following retrieves the content of your URL and then
 parses it and extracts the value of the jsessionid.
 Then we create the full URL to the actual data page (which

[R] How to download this data?

2013-08-02 Thread Ron Michael
Hi all,
 
I need to download the data from this web page:
 
https://www.theice.com/productguide/ProductSpec.shtml?specId=219#expiry
 
I used the function readHTMLTable() from the XML package; however, I could not
download the data.
 
Can somebody help me get the data into my R session?
 
Thank you.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to download this data?

2013-08-02 Thread Duncan Temple Lang

That URL is an HTTPS (secure HTTP), not an HTTP.
The XML parser cannot retrieve the file.
Instead, use the RCurl package to get the file.

However, it is more complicated than that. If
you look at source of the HTML page in a browser,
you'll see a jsessionid and that is a session identifier.

The following retrieves the content of your URL and then
parses it and extracts the value of the jsessionid.
Then we create the full URL to the actual data page (which is actually in the 
HTML
content but in JavaScript code)

library(RCurl)
library(XML)

rawOrig = getURLContent("https://www.theice.com/productguide/ProductSpec.shtml?specId=219#expiry")
rawDoc = htmlParse(rawOrig)
tmp = getNodeSet(rawDoc, "//@href[contains(., 'jsessionid=')]")[[1]]
jsession = gsub(".*jsessionid=([^?]+)?.*", "\\1", tmp)

u = sprintf("https://www.theice.com/productguide/ProductSpec.shtml;jsessionid=%s?expiryDates=specId=219", jsession)

doc = htmlParse(getURLContent(u))
tbls = readHTMLTable(doc)
data = tbls[[1]]

dim(data)


I did this quickly so it may not be the best way or completely robust, but 
hopefully
it gets the point across and does get the data.

  D.

On 8/2/13 2:42 PM, Ron Michael wrote:
 Hi all,
  
 I need to download the data from this web page:
  
 https://www.theice.com/productguide/ProductSpec.shtml?specId=219#expiry
  
 I used the function readHTMLTable() from package XML, however could not 
 download that.
  
 Can somebody help me how to get the data onto my R window?
  
 Thank you.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.