Re: Financial time series data
On 03-Sep-10 7:29 AM, Virgil Stokes wrote:
> A more direct question on accessing stock information from Yahoo.
>
> First, use your browser to go to:
> http://finance.yahoo.com/q/cp?s=%5EGSPC+Components
>
> Now, you see the first 50 rows of a 500 row table of information on
> the S&P 500 index. You can left-click on
>
>     1 - 50 of 500 | First | Previous | Next | Last
>
> below the table to position to any of the 10 pages.
>
> I would like to use Python to do the following. *Loop on each of the
> 10 pages and for each page extract information for each row --- How
> can this be accomplished automatically in Python?*
>
> Let's take the first page (as shown by default). It is easy to see the
> link to the data for A is http://finance.yahoo.com/q?s=A. That is, I
> can just move my cursor over the A and I see this URL in the message
> at the bottom of my browser (Explorer 8). If I left-click on A then I
> will go to this link --- Do this!
>
> You should now see a table which shows information on this stock and
> *this is the information that I would like to extract*. I would like
> to do this for all 500 stocks without the need to enter the symbols
> for them (e.g. A, AA, etc.). It seems clear that this should be
> possible since all the symbols are in the first column of each of the
> 50 tables --- but it is not at all clear how to extract these
> automatically in Python.
>
> Hopefully, you understand my problem. Again, I would like Python to
> cycle through these 10 pages and extract this information for each
> symbol in this table.
>
> --V

You want the 'get_historical_prices' function of the (beautifully
elegant) 'ystockquote.py': http://www.goldb.org/ystockquote.html.

Just specify a start date and an end date and voilà, you get an array
of historical price data for any symbol you pass in. I used this module
with great success to download ten years of historical data for every
symbol I've ever traded.

Regards,

Trent.
--
http://mail.python.org/mailman/listinfo/python-list
Re: Financial time series data
On 03-Sep-10 1:48 PM, Frederic Rentsch wrote:
> And do let us know if you get an answer from Yahoo. Hacks like this
> are unreliable. They fail almost certainly the next time a page gets
> redesigned, which can be any time.

Indeed -- see my other post (regarding ystockquote.py). There's a CSV
HTTP API that should be used if you want to obtain any Yahoo! Finance
data programmatically.

Trent.
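A minimal sketch of how a request to that CSV interface might be put together, written for Python 3. The endpoint and the field string are simply the ones quoted elsewhere in this thread; whether the service still answers is not something this sketch checks, and `build_quotes_url` is a made-up helper name, not part of any library.

```python
from urllib.parse import urlencode

# Endpoint as quoted in this 2010 thread; assumed here, not verified.
BASE_URL = "http://download.finance.yahoo.com/d/quotes.csv"


def build_quotes_url(symbols, fields="sl1d1t1c1ohgv"):
    """Return a quotes.csv URL for the given ticker symbols (hypothetical helper)."""
    # safe="," keeps the comma-separated symbol list readable in the URL.
    query = urlencode({"s": ",".join(symbols), "f": fields, "e": ".csv"}, safe=",")
    return BASE_URL + "?" + query


url = build_quotes_url(["A", "AA", "AEE"])
print(url)
```

Only the URL is built here; fetching it would be a separate `urllib.request.urlopen(url)` call.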
Re: Financial time series data
On Fri, 2010-09-03 at 19:58 +0200, Virgil Stokes wrote:
> import urllib2
> import re
>
> def get_SP500_symbolsX ():
>     symbols = []
>     lsttradestr = re.compile('Last Trade:')
>     k = 0
>     for page in range(10):
>         url = 'http://finance.yahoo.com/q/cp?s=%5EGSPC&c='+str(page)
>         print url
>         f = urllib2.urlopen (url)
>         html = f.readlines ()
>         f.close ()
>         for line in html:
>             if line.lstrip ().startswith ('</script><span id="yfs_params_vcr"'):
>                 line_split = line.split (':')
>                 s = [item.strip ().upper () for item in line_split [5].replace ('"','').split (',')]
>                 for symb in s:
>                     url = "http://finance.yahoo.com/q?s="+symb
>                     f = urllib2.urlopen(url)
>                     html = f.readlines()
>                     f.close()
>                     for line in html:
>                         if lsttradestr.search(line):
>                             k += 1
>                             print 'k = %3d (%s)' %(k,symb)
>                             # Here is where I will extract the numerical values and place
>                             # them in an appropriate file
>                 symbols.extend (s [:-3])
>     return symbols
>     # Not quite 500 -- which is correct (for example p. 2 has only 49 symbols!)
>     # Actually the SP 500 as shown does not contain 500 stocks (symbols)
>
> symbols = get_SP500_symbolsX()
> pass
>
> And thanks for your help Frederic --- Have a good day! :-)
>
> --V

Good going! You get the idea. Here's my try at a cleaned-up version
that makes the best use of the facility and takes only fifteen seconds
to complete (on my machine).

You may want to look at historical quotes too. Trent Nelson seems to
have a ready-made solution for this.

---

    import urllib2
    import re

    def get_current_SP500_quotes_from_Yahoo ():
        symbol_reader = re.compile ('([a-z-.]+,)+[a-z-.]+')
        # Make sure you include all characters that may show up in symbols.
        csv_data = ''
        for page in range (10):
            url = 'http://finance.yahoo.com/q/cp?s=%5EGSPC&c=' + str (page)
            print url
            f = urllib2.urlopen (url)
            html = f.readlines ()
            f.close ()
            for line in html:
                if line.lstrip ().startswith ('</script><span id="yfs_params_vcr"'):
                    symbols = symbol_reader.search (line).group ()
                    ## symbols = line.split (':')[5][2:-18]
                    ## ^ This is an alternative to the regex. It won't stumble over
                    ## unexpected characters in symbols, but depends on the
                    ## line format to stay put.
                    # print symbols.count (',') + 1  # Uncomment to check for = 50
                    url = 'http://download.finance.yahoo.com/d/quotes.csv?s=%s&f=sl1d1t1c1ohgv&e=.csv' % symbols
                    # Regex happens to grab symbols correctly formatted
                    # print url
                    f = urllib2.urlopen (url)
                    csv_data += f.read ()
                    f.close ()
                    break
        return csv_data

---

Here is what you get:

    A,29.85,9/3/2010,4:01pm,+0.64,29.49,29.99,29.49,2263815
    AA,10.88,9/3/2010,4:00pm,+0.05,11.01,11.07,10.82,16634520
    AEE,28.65,9/3/2010,4:01pm,+0.19,28.79,28.79,28.46,3029885
    ...

494 lines in all (today). The fields are:

    Symbol, Current or close, Date, Time, Change, Open, High, Low, Volume

---

Good luck to you in the footsteps of Warren Buffett!

Frederic
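A rough sketch of how the returned CSV text could be turned into records, written for Python 3. The column names are my own labels for the nine fields listed above, and the sketch assumes every numeric field really is numeric; a placeholder such as "N/A" in the data would need extra handling.

```python
import csv
import io

# Labels chosen for illustration; they follow the field list given above.
FIELDS = ["symbol", "last", "date", "time", "change", "open", "high", "low", "volume"]


def parse_quotes(csv_data):
    """Turn quotes.csv text into a list of dicts with numeric fields converted."""
    rows = []
    for record in csv.reader(io.StringIO(csv_data)):
        if len(record) != len(FIELDS):
            continue  # skip blank or malformed lines
        row = dict(zip(FIELDS, record))
        for key in ("last", "change", "open", "high", "low"):
            row[key] = float(row[key])
        row["volume"] = int(row["volume"])
        rows.append(row)
    return rows


# Two of the sample lines shown above.
sample = (
    "A,29.85,9/3/2010,4:01pm,+0.64,29.49,29.99,29.49,2263815\n"
    "AA,10.88,9/3/2010,4:00pm,+0.05,11.01,11.07,10.82,16634520\n"
)
quotes = parse_quotes(sample)
print(quotes[0]["symbol"], quotes[0]["last"], quotes[0]["volume"])
```

Keeping date and time as strings avoids guessing at the time zone the site used.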
Re: Financial time series data
I wrote some objects for the Taiwan stock market:
http://github.com/toomore/goristock

It is still under development, though.

On Sep 3, 1:12 am, Virgil Stokes v...@it.uu.se wrote:
> Has anyone written code or worked with Python software for downloading
> financial time series data (e.g. from Yahoo financial)? If yes, would
> you please contact me.
>
> --Thanks,
> V. Stokes
Re: Financial time series data
A more direct question on accessing stock information from Yahoo.

First, use your browser to go to:
http://finance.yahoo.com/q/cp?s=%5EGSPC+Components

Now, you see the first 50 rows of a 500 row table of information on the
S&P 500 index. You can left-click on

    1 - 50 of 500 | First | Previous | Next | Last

below the table to position to any of the 10 pages.

I would like to use Python to do the following. *Loop on each of the 10
pages and for each page extract information for each row --- How can
this be accomplished automatically in Python?*

Let's take the first page (as shown by default). It is easy to see the
link to the data for A is http://finance.yahoo.com/q?s=A. That is, I can
just move my cursor over the A and I see this URL in the message at the
bottom of my browser (Explorer 8). If I left-click on A then I will go
to this link --- Do this!

You should now see a table which shows information on this stock and
*this is the information that I would like to extract*. I would like to
do this for all 500 stocks without the need to enter the symbols for
them (e.g. A, AA, etc.). It seems clear that this should be possible
since all the symbols are in the first column of each of the 50 tables
--- but it is not at all clear how to extract these automatically in
Python.

Hopefully, you understand my problem. Again, I would like Python to
cycle through these 10 pages and extract this information for each
symbol in this table.

--V
Re: Financial time series data
On Sep 2, 1:12 pm, Virgil Stokes v...@it.uu.se wrote:
> Has anyone written code or worked with Python software for downloading
> financial time series data (e.g. from Yahoo financial)? If yes, would
> you please contact me.
>
> --Thanks,
> V. Stokes

matplotlib has a finance module you can refer to
(matplotlib.finance.fetch_historical_yahoo). See the example:
http://matplotlib.sourceforge.net/examples/pylab_examples/finance_work2.html
Re: Financial time series data
On Fri, 2010-09-03 at 13:29 +0200, Virgil Stokes wrote:
> A more direct question on accessing stock information from Yahoo.
>
> First, use your browser to go to:
> http://finance.yahoo.com/q/cp?s=%5EGSPC+Components
>
> Now, you see the first 50 rows of a 500 row table of information on
> the S&P 500 index. You can left-click on
>
>     1 - 50 of 500 | First | Previous | Next | Last
>
> below the table to position to any of the 10 pages.
>
> I would like to use Python to do the following. Loop on each of the
> 10 pages and for each page extract information for each row --- How
> can this be accomplished automatically in Python?
>
> Let's take the first page (as shown by default). It is easy to see the
> link to the data for A is http://finance.yahoo.com/q?s=A. That is, I
> can just move my cursor over the A and I see this URL in the message
> at the bottom of my browser (Explorer 8). If I left-click on A then I
> will go to this link --- Do this!
>
> You should now see a table which shows information on this stock and
> this is the information that I would like to extract. I would like to
> do this for all 500 stocks without the need to enter the symbols for
> them (e.g. A, AA, etc.). It seems clear that this should be possible
> since all the symbols are in the first column of each of the 50
> tables --- but it is not at all clear how to extract these
> automatically in Python.
>
> Hopefully, you understand my problem. Again, I would like Python to
> cycle through these 10 pages and extract this information for each
> symbol in this table.
>
> --V

Here's a quick hack to get the SP500 symbols from the visual page with
the index letters. From this collection you can then order fifty at a
time from the download facility. (If you get a better idea from Yahoo,
you'll post it of course.)

    def get_SP500_symbols ():
        import urllib
        symbols = []
        url = 'http://finance.yahoo.com/q/cp?s=^GSPC&alpha=%c'
        for c in [chr(n) for n in range (ord ('A'), ord ('Z') + 1)]:
            print url % c
            f = urllib.urlopen (url % c)
            html = f.readlines ()
            f.close ()
            for line in html:
                if line.lstrip ().startswith ('</script><span id="yfs_params_vcr"'):
                    line_split = line.split (':')
                    s = [item.strip ().upper () for item in line_split [5].replace ('"', '').split (',')]
                    symbols.extend (s [:-3])
        return symbols
        # Not quite 500 (!?)

Frederic
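Ordering "fifty at a time" could look roughly like the following Python 3 sketch. The batch size of 50 comes from the message above; the `batches` helper and the placeholder symbols are hypothetical and only stand in for the scraped list.

```python
def batches(items, size=50):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Placeholder symbols standing in for the list returned by the scraper.
symbols = ["SYM%03d" % n for n in range(120)]

# One comma-separated string per request to the download facility.
groups = [",".join(group) for group in batches(symbols)]
print([g.count(",") + 1 for g in groups])
```

Each entry of `groups` would then go into the `s=` parameter of one download request.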
Re: Financial time series data
On Fri, 2010-09-03 at 16:48 +0200, Virgil Stokes wrote:
> On 03-Sep-2010 15:45, Frederic Rentsch wrote:
> > On Fri, 2010-09-03 at 13:29 +0200, Virgil Stokes wrote:
> > > A more direct question on accessing stock information from Yahoo.
> > >
> > > First, use your browser to go to:
> > > http://finance.yahoo.com/q/cp?s=%5EGSPC+Components
> > >
> > > Now, you see the first 50 rows of a 500 row table of information
> > > on the S&P 500 index. You can left-click on
> > >
> > >     1 - 50 of 500 | First | Previous | Next | Last
> > >
> > > below the table to position to any of the 10 pages.
> > >
> > > I would like to use Python to do the following. Loop on each of
> > > the 10 pages and for each page extract information for each row
> > > --- How can this be accomplished automatically in Python?
> > >
> > > Let's take the first page (as shown by default). It is easy to
> > > see the link to the data for A is http://finance.yahoo.com/q?s=A.
> > > That is, I can just move my cursor over the A and I see this URL
> > > in the message at the bottom of my browser (Explorer 8). If I
> > > left-click on A then I will go to this link --- Do this!
> > >
> > > You should now see a table which shows information on this stock
> > > and this is the information that I would like to extract. I would
> > > like to do this for all 500 stocks without the need to enter the
> > > symbols for them (e.g. A, AA, etc.). It seems clear that this
> > > should be possible since all the symbols are in the first column
> > > of each of the 50 tables --- but it is not at all clear how to
> > > extract these automatically in Python.
> > >
> > > Hopefully, you understand my problem. Again, I would like Python
> > > to cycle through these 10 pages and extract this information for
> > > each symbol in this table.
> > >
> > > --V
> >
> > Here's a quick hack to get the SP500 symbols from the visual page
> > with the index letters. From this collection you can then order
> > fifty at a time from the download facility. (If you get a better
> > idea from Yahoo, you'll post it of course.)
> >
> >     def get_SP500_symbols ():
> >         import urllib
> >         symbols = []
> >         url = 'http://finance.yahoo.com/q/cp?s=^GSPC&alpha=%c'
> >         for c in [chr(n) for n in range (ord ('A'), ord ('Z') + 1)]:
> >             print url % c
> >             f = urllib.urlopen (url % c)
> >             html = f.readlines ()
> >             f.close ()
> >             for line in html:
> >                 if line.lstrip ().startswith ('</script><span id="yfs_params_vcr"'):
> >                     line_split = line.split (':')
> >                     s = [item.strip ().upper () for item in line_split [5].replace ('"', '').split (',')]
> >                     symbols.extend (s [:-3])
> >         return symbols
> >         # Not quite 500 (!?)
> >
> > Frederic
>
> I made a few modifications --- very minor. But, I believe that it is
> a little faster.
>
>     import urllib2
>
>     def get_SP500_symbolsX ():
>         symbols = []
>         for page in range(0,9):
>             url = 'http://finance.yahoo.com/q/cp?s=%5EGSPC&c='+str(page)
>             print url
>             f = urllib2.urlopen (url)
>             html = f.readlines ()
>             f.close ()
>             for line in html:
>                 if line.lstrip ().startswith ('</script><span id="yfs_params_vcr"'):
>                     line_split = line.split (':')
>                     s = [item.strip ().upper () for item in line_split [5].replace ('"','').split (',')]
>                     symbols.extend (s [:-3])
>         return symbols
>         # Not quite 500 -- which is correct (for example p. 2 has only 49 symbols!)
>         # Actually the SP 500 as shown does not contain 500 stocks (symbols)
>
>     symbols = get_SP500_symbolsX()
>     pass

Oh, yes, and there's no use reading lines to the end once the symbols
are in the bag. The symbol-line-finder conditional section should end
with break.

And do let us know if you get an answer from Yahoo. Hacks like this are
unreliable. They fail almost certainly the next time a page gets
redesigned, which can be any time.

Frederic
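The point about ending the conditional with break can be sketched as below, in Python 3. The page lines and the "MARKER" prefix are invented for illustration and do not reproduce the real Yahoo markup. As a side note, `range(0, 9)` in the quoted code yields only nine values (0 through 8), so covering ten pages would need `range(10)`.

```python
def find_symbol_line(lines, marker):
    """Return the first line that starts with `marker`, or None if absent."""
    found = None
    for line in lines:
        if line.lstrip().startswith(marker):
            found = line
            break  # the symbols are in the bag; skip the rest of the page
    return found


# Hypothetical page content; only the first matching line is ever used.
page = ["<html>", "  MARKER a,aa,aee", "<p>later text</p>", "  MARKER ignored"]
line = find_symbol_line(page, "MARKER")
symbols = line.split()[1].upper().split(",")
print(symbols)

# Page indices: nine with range(0, 9), ten with range(10).
print(len(range(0, 9)), len(range(10)))
```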
Re: Financial time series data
But what kind of data do you want to download? The financial data is
basically HTML code, and you can work with it very well using a parser.

2010/9/2, Virgil Stokes v...@it.uu.se:
> Has anyone written code or worked with Python software for downloading
> financial time series data (e.g. from Yahoo financial)? If yes, would
> you please contact me.
>
> --Thanks,
> V. Stokes

--
Sent from my mobile device

Diego I. Hidalgo D.
Re: Financial time series data
On 09/02/2010 08:15 PM, Hidura wrote:
> But what kind of data you want to download?, because the financial
> time it's basicly html code and you can work very well with a parser
>
> 2010/9/2, Virgil Stokes v...@it.uu.se:
> > Has anyone written code or worked with Python software for
> > downloading financial time series data (e.g. from Yahoo financial)?
> > If yes, would you please contact me.
> >
> > --Thanks,
> > V. Stokes

Here is a snippet of Python code that I am trying to use for
downloading financial data; but, I do not understand why it returns
information from the second HTML page.

    import urllib2
    '''
    I am trying to read each row of the table at:
    http://finance.yahoo.com/q/cp?s=^GSPC
    '''
    ticker = []
    url = urllib2.urlopen("http://download.finance.yahoo.com/d/quotes.csv...@%5egspc&f=sl1d1t1c1ohgv&e=.csv&h=PAGE".replace('PAGE', str(0)))
    data = url.read()

Note, it does get all 50 rows of the first page; but, why does it also
get the first row of the next HTML page?

--V
Re: Financial time series data
On 03/09/2010 00:56, Virgil Stokes wrote:
> On 09/02/2010 08:15 PM, Hidura wrote:
> > But what kind of data you want to download?, because the financial
> > time it's basicly html code and you can work very well with a
> > parser
> >
> > 2010/9/2, Virgil Stokes v...@it.uu.se:
> > > Has anyone written code or worked with Python software for
> > > downloading financial time series data (e.g. from Yahoo
> > > financial)? If yes, would you please contact me.
> > >
> > > --Thanks,
> > > V. Stokes
>
> Here is a snippet of Python code that I am trying to use for
> downloading financial data; but, I do not understand why it returns
> information from the second HTML page.
>
>     import urllib2
>     '''
>     I am trying to read each row of the table at:
>     http://finance.yahoo.com/q/cp?s=^GSPC
>     '''
>     ticker = []
>     url = urllib2.urlopen("http://download.finance.yahoo.com/d/quotes.csv...@%5egspc&f=sl1d1t1c1ohgv&e=.csv&h=PAGE".replace('PAGE', str(0)))
>     data = url.read()
>
> Note, it does get all 50 rows of the first page; but, why does it
> also get the first row of the next HTML page?

Did you try downloading from a browser? That also returns an extra row.
Looks like an idiosyncrasy of the site.
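If the extra row really is the first row of the following page, one way to live with it when stitching pages together is to keep only the first row seen per symbol. This is a Python 3 sketch under that assumption; `merge_pages` is a hypothetical helper and the two short pages are invented test data, not real downloads.

```python
def merge_pages(pages):
    """Concatenate CSV pages, keeping the first row seen for each symbol."""
    seen = set()
    merged = []
    for page in pages:
        for row in page.splitlines():
            symbol = row.split(",", 1)[0]
            if row and symbol not in seen:
                seen.add(symbol)
                merged.append(row)
    return merged


# Invented pages: page 1 ends with the row that also opens page 2.
page1 = "A,29.85\nAA,10.88\nAEE,28.65\n"
page2 = "AEE,28.65\nZZZ,0.00\n"
print(merge_pages([page1, page2]))
```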
Re: Re: Financial time series data
I tried to view the page, and the code GSPC is wrong; I used ^DJI
instead. Once you have downloaded the page source, use an XML parser to
locate the table element and read it. I can't get to the next page from
the browser; it doesn't appear as a link.

Virgil Stokes v...@it.uu.se wrote:
> On 09/02/2010 08:15 PM, Hidura wrote:
> > But what kind of data you want to download?, because the financial
> > time it's basicly html code and you can work very well with a
> > parser
> >
> > 2010/9/2, Virgil stoke...@it.uu.se:
> > > Has anyone written code or worked with Python software for
> > > downloading financial time series data (e.g. from Yahoo
> > > financial)? If yes, would you please contact me.
> > >
> > > --Thanks,
> > > V. Stokes
>
> Here is a snippet of Python code that I am trying to use for
> downloading financial data; but, I do not understand why it returns
> information from the second HTML page.
>
>     import urllib2
>     '''
>     I am trying to read each row of the table at:
>     http://finance.yahoo.com/q/cp?s=^GSPC
>     '''
>     ticker = []
>     url = urllib2.urlopen("http://download.finance.yahoo.com/d/quotes.csv...@%5egspc&f=sl1d1t1c1ohgv&e=.csv&h=PAGE".replace('PAGE', str(0)))
>     data = url.read()
>
> Note, it does get all 50 rows of the first page; but, why does it
> also get the first row of the next HTML page?
>
> --V
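The "locate the table element and read it" suggestion could be sketched with the standard library's `html.parser` in Python 3, which is more forgiving than a strict XML parser. The markup below is invented for illustration and does not reproduce the real page layout, and `TableReader` is a hypothetical class name.

```python
from html.parser import HTMLParser


class TableReader(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical markup standing in for the downloaded page source.
html = """
<table>
  <tr><th>Symbol</th><th>Last Trade</th></tr>
  <tr><td><a href="/q?s=A">A</a></td><td>29.85</td></tr>
  <tr><td><a href="/q?s=AA">AA</a></td><td>10.88</td></tr>
</table>
"""
reader = TableReader()
reader.feed(html)
print(reader.rows)
```

Text inside nested tags such as links is kept, so the symbol column comes out as plain text.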