Re: Financial time series data

2010-09-04 Thread Trent Nelson

On 03-Sep-10 7:29 AM, Virgil Stokes wrote:

  A more direct question on accessing stock information from Yahoo.

First, use your browser to go to:
http://finance.yahoo.com/q/cp?s=%5EGSPC+Components

Now, you see the first 50 rows of a 500 row table of information on the
S&P 500 index. You can LM click on

   1 -50 of 500 |First|Previous|Next|Last

below the table to position to any of the 10 pages.

I would like to use Python to do the following.

*Loop on each of the 10 pages and for each page extract information for
each row --- How can this be accomplished automatically in Python?*

Let's take the first page (as shown by default). It is easy to see the
link to the data for A is http://finance.yahoo.com/q?s=A. That is, I
can just move
my cursor over the A and I see this URL in the message at the bottom
of my browser (Explorer 8). If I LM click on A then I will go to this
link --- Do this!

You should now see a table which shows information on this stock and
*this is the information that I would like to extract*. I would like to
do this for all 500 stocks without the need to enter the symbols for
them (e.g. A, AA, etc.). It seems clear that this should be possible
since all the symbols are in the first column of each of the 50 tables
--- but it is not at all clear how to extract these automatically in
Python.

Hopefully, you understand my problem. Again, I would like Python to
cycle through these 10 pages and extract this information for each
symbol in this table.

--V


You want the 'get_historical_prices' method of the (beautifully elegant) 
'ystockquote.py': http://www.goldb.org/ystockquote.html.


Just specify a start date and an end date and voilà, you get an array of
historical price data for any symbol you pass in.  I used this module
with great success to download ten years of historical data for every
symbol I've ever traded.
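
A minimal sketch of handling that kind of result, written for Python 3. It
does not call ystockquote or the network. It assumes the historical data
arrives as CSV text with a Date, Open, High, Low, Close, Volume header; the
function name parse_history is hypothetical and the two data rows are
placeholder values, not real quotes.

```python
import csv
import io

# Placeholder CSV text; the column layout and the values are assumptions
# for illustration, not real market data.
SAMPLE = """Date,Open,High,Low,Close,Volume
2010-09-03,10.0,10.5,9.8,10.2,1000
2010-09-02,9.9,10.1,9.7,10.0,1200
"""

def parse_history(text):
    """Return a list of dicts, oldest row first, with numeric fields converted."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append({
            'date': rec['Date'],
            'open': float(rec['Open']),
            'high': float(rec['High']),
            'low': float(rec['Low']),
            'close': float(rec['Close']),
            'volume': int(rec['Volume']),
        })
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    rows.sort(key=lambda r: r['date'])
    return rows

history = parse_history(SAMPLE)
```

Sorting oldest-first is a design choice for time series work; drop the sort
if you prefer the order in which the rows were delivered.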


Regards,

Trent.
--
http://mail.python.org/mailman/listinfo/python-list


Re: Financial time series data

2010-09-04 Thread Trent Nelson

On 03-Sep-10 1:48 PM, Frederic Rentsch wrote:


And do let us know if you get an answer from Yahoo. Hacks like this
are unreliable. They will almost certainly fail the next time a page gets
redesigned, which can happen at any time.


Indeed -- see my other post (regarding ystockquote.py).  There's a CSV 
HTTP API that should be used if you want to obtain any Yahoo! Finance 
data programmatically.


Trent.


Re: Financial time series data

2010-09-04 Thread Frederic Rentsch
On Fri, 2010-09-03 at 19:58 +0200, Virgil Stokes wrote:
 import urllib2
 import re

 def get_SP500_symbolsX ():
     symbols = []
     lsttradestr = re.compile('Last Trade:')
     k = 0
     for page in range(10):
         url = 'http://finance.yahoo.com/q/cp?s=%5EGSPC&c='+str(page)
         print url
         f = urllib2.urlopen (url)
         html = f.readlines ()
         f.close ()
         for line in html:
             if line.lstrip ().startswith ('</script><span id="yfs_params_vcr"'):
                 line_split = line.split (':')
                 s = [item.strip ().upper () for item in
                      line_split [5].replace ('"','').split (',')]
                 for symb in s:
                     url = "http://finance.yahoo.com/q?s="+symb
                     f = urllib2.urlopen(url)
                     html = f.readlines()
                     f.close()

                     for line in html:
                         if lsttradestr.search(line):
                             k += 1
                             print 'k = %3d (%s)' %(k,symb)
                             # Here is where I will extract the numerical
                             # values and place them in an appropriate file
                 symbols.extend (s [:-3])

     return symbols
     # Not quite 500 -- which is correct (for example p. 2 has only 49 symbols!)
     # Actually the S&P 500 as shown does not contain 500 stocks (symbols)


 symbols = get_SP500_symbolsX()
 pass
 
 And thanks for your help Frederic --- Have a good day! :-)
 
 --V

Good going! You get the idea. 
   Here's my try for a cleaned-up version that makes the best use of the
facility and takes only fifteen seconds to complete (on my machine).
   You may want to look at historical quotes too. Trent Nelson seems to
have a ready-made solution for this.

---

import urllib2
import re

def get_current_SP500_quotes_from_Yahoo ():

    symbol_reader = re.compile ('([a-z-.]+,)+[a-z-.]+')
    # Make sure you include all characters that may show up in symbols.

    csv_data = ''

    for page in range (10):

        url = 'http://finance.yahoo.com/q/cp?s=%5EGSPC&c=' + str (page)
        print url
        f = urllib2.urlopen (url)
        html = f.readlines ()
        f.close ()

        for line in html:

            if line.lstrip ().startswith ('</script><span id="yfs_params_vcr"'):
                symbols = symbol_reader.search (line).group ()
                ## symbols = line.split (':')[5][2:-18]
                ## ^ This is an alternative to the regex. It won't stumble over
                ## unexpected characters in symbols, but depends on the
                ## line format to stay put.
                # print symbols.count (',') + 1   # Uncomment to check for <= 50
                # Regex happens to grab symbols correctly formatted
                url = 'http://download.finance.yahoo.com/d/quotes.csv?s=%s&f=sl1d1t1c1ohgv&e=.csv' % symbols
                # print url
                f = urllib2.urlopen (url)
                csv_data += f.read ()
                f.close ()

                break

    return csv_data
   
---

Here is what you get:

A,29.85,9/3/2010,4:01pm,+0.64,29.49,29.99,29.49,2263815
AA,10.88,9/3/2010,4:00pm,+0.05,11.01,11.07,10.82,16634520
AEE,28.65,9/3/2010,4:01pm,+0.19,28.79,28.79,28.46,3029885
... 494 lines in all (today) 

Symbol, Current or close, Date, Time, Change, Open, High, Low, Volume
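
A short sketch of turning that CSV text into named fields, written for
Python 3. The input is the three sample lines shown above; the column keys
and the function name parse_quotes are illustrative choices, not part of
any Yahoo API.

```python
import csv
import io

# Keys follow the column legend above (names are illustrative).
COLUMNS = ['symbol', 'last', 'date', 'time', 'change',
           'open', 'high', 'low', 'volume']

# The three sample lines quoted in the message.
CSV_DATA = (
    'A,29.85,9/3/2010,4:01pm,+0.64,29.49,29.99,29.49,2263815\n'
    'AA,10.88,9/3/2010,4:00pm,+0.05,11.01,11.07,10.82,16634520\n'
    'AEE,28.65,9/3/2010,4:01pm,+0.19,28.79,28.79,28.46,3029885\n'
)

def parse_quotes(text):
    """Map each symbol to a dict of its fields, with numbers converted."""
    quotes = {}
    for fields in csv.reader(io.StringIO(text)):
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        rec = dict(zip(COLUMNS, fields))
        for key in ('last', 'change', 'open', 'high', 'low'):
            rec[key] = float(rec[key])
        rec['volume'] = int(rec['volume'])
        quotes[rec['symbol']] = rec
    return quotes

quotes = parse_quotes(CSV_DATA)
```

Keying the result by symbol makes lookups such as quotes['AA']['last']
straightforward; a plain list would preserve duplicates if that matters.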


---

Good luck to you in the footsteps of Warren Buffett!

Frederic




Re: Financial time series data

2010-09-04 Thread Toomore
I wrote some objects for working with Taiwan stocks:
http://github.com/toomore/goristock

It is still under development, though.

On Sep 3, 1:12 am, Virgil Stokes v...@it.uu.se wrote:
   Has anyone written code or worked with Python software for downloading
 financial time series data (e.g. from Yahoo financial)? If yes,  would you
 please contact me.

 --Thanks,
 V. Stokes


Re: Financial time series data

2010-09-03 Thread Virgil Stokes

 A more direct question on accessing stock information from Yahoo.

First, use your browser to go to:  
http://finance.yahoo.com/q/cp?s=%5EGSPC+Components


Now, you see the first 50 rows of a 500 row table of information on the S&P 500
index. You can LM click on


  1 -50 of 500 |First|Previous|Next|Last

below the table to position to any of the 10 pages.

I would like to use Python to do the following.

*Loop on each of the 10 pages and for each page extract information for each row 
--- How can this be accomplished automatically in Python?*


Let's take the first page (as shown by default). It is easy to see the link to 
the data for A is http://finance.yahoo.com/q?s=A. That is, I can just move
my cursor over the A and I see this URL in the message at the bottom of my 
browser (Explorer 8). If I LM click on A then I will go to this

link --- Do this!

You should now see a table which shows information on this stock and *this is 
the information that I would like to extract*. I would like to do this for all 
500 stocks without the need to enter the symbols for them (e.g. A, AA, 
etc.). It seems clear that this should be possible since all the symbols are in 
the first column of each of the 50 tables --- but it is not at all clear how to 
extract these automatically in Python.


Hopefully, you understand my problem. Again, I would like Python to cycle 
through these 10 pages and extract this information for each symbol in this table.


--V






Re: Financial time series data

2010-09-03 Thread Nanjundi
On Sep 2, 1:12 pm, Virgil Stokes v...@it.uu.se wrote:
   Has anyone written code or worked with Python software for downloading
 financial time series data (e.g. from Yahoo financial)? If yes,  would you
 please contact me.

 --Thanks,
 V. Stokes

matplotlib has a finance module you can refer to.
(matplotlib.finance.fetch_historical_yahoo)
see the example: 
http://matplotlib.sourceforge.net/examples/pylab_examples/finance_work2.html



Re: Financial time series data

2010-09-03 Thread Frederic Rentsch
On Fri, 2010-09-03 at 13:29 +0200, Virgil Stokes wrote:
 A more direct question on accessing stock information from Yahoo.
 
 First, use your browser to go to:
 http://finance.yahoo.com/q/cp?s=%5EGSPC+Components
 
 Now, you see the first 50 rows of a 500 row table of information on
 the S&P 500 index. You can LM click on
 
   1 -50 of 500 |First|Previous|Next|Last
 
 below the table to position to any of the 10 pages.
 
 I would like to use Python to do the following.
 
 Loop on each of the 10 pages and for each page extract information for
 each row --- How can this be accomplished automatically in Python?
 
 Let's take the first page (as shown by default). It is easy to see the
 link to the data for A is http://finance.yahoo.com/q?s=A. That is, I
 can just move 
 my cursor over the A and I see this URL in the message at the bottom
 of my browser (Explorer 8). If I LM click on A then I will go to
 this
 link --- Do this!
 
 You should now see a table which shows information on this stock and
 this is the information that I would like to extract. I would like to
 do this for all 500 stocks without the need to enter the symbols for
 them (e.g. A, AA, etc.). It seems clear that this should be
 possible since all the symbols are in the first column of each of the
 50 tables --- but it is not at all clear how to extract these
 automatically in Python. 
 
 Hopefully, you understand my problem. Again, I would like Python to
 cycle through these 10 pages and extract this information for each
 symbol in this table.
 
 --V
 
 
 

Here's a quick hack to get the SP500 symbols from the visual page with
the index letters. From this collection you can then order fifty at a
time from the download facility. (If you get a better idea from Yahoo,
you'll post it of course.)



def get_SP500_symbols ():
    import urllib
    symbols = []
    url = 'http://finance.yahoo.com/q/cp?s=^GSPC&alpha=%c'
    for c in [chr(n) for n in range (ord ('A'), ord ('Z') + 1)]:

        print url % c
        f = urllib.urlopen (url % c)
        html = f.readlines ()
        f.close ()
        for line in html:
            if line.lstrip ().startswith ('</script><span id="yfs_params_vcr"'):
                line_split = line.split (':')
                s = [item.strip ().upper () for item in
                     line_split [5].replace ('"', '').split (',')]
                symbols.extend (s [:-3])

    return symbols
    # Not quite 500 (!?)
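
The "order fifty at a time" step mentioned above could be sketched as
follows (Python 3). The helper names batches and quote_urls are
hypothetical, the URL pattern is an assumption about the CSV download
facility and may no longer work, and the symbol list is invented only to
show the batching.

```python
def batches(symbols, size=50):
    """Yield consecutive slices of at most `size` symbols."""
    for start in range(0, len(symbols), size):
        yield symbols[start:start + size]

# Assumed URL pattern of the CSV download facility.
QUOTES_URL = ('http://download.finance.yahoo.com/d/quotes.csv'
              '?s=%s&f=sl1d1t1c1ohgv&e=.csv')

def quote_urls(symbols, size=50):
    """Build one download URL per batch of comma-joined symbols."""
    return [QUOTES_URL % ','.join(batch) for batch in batches(symbols, size)]

# Hypothetical symbol list, only to show the batching.
demo = ['SYM%d' % n for n in range(120)]
urls = quote_urls(demo)
```

With 120 symbols this yields three URLs: two carrying fifty symbols and one
carrying the remaining twenty.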


Frederic





Re: Financial time series data

2010-09-03 Thread Frederic Rentsch
On Fri, 2010-09-03 at 16:48 +0200, Virgil Stokes wrote:
 I made a few modifications --- very minor. But, I believe that it is a little 
 faster.
 
 import urllib2

 def get_SP500_symbolsX ():
     symbols = []
     for page in range(0,9):
         url = 'http://finance.yahoo.com/q/cp?s=%5EGSPC&c='+str(page)
         print url
         f = urllib2.urlopen (url)
         html = f.readlines ()
         f.close ()
         for line in html:
             if line.lstrip ().startswith ('</script><span id="yfs_params_vcr"'):
                 line_split = line.split (':')
                 s = [item.strip ().upper () for item in
                      line_split [5].replace ('"','').split (',')]
                 symbols.extend (s [:-3])

     return symbols
     # Not quite 500 -- which is correct (for example p. 2 has only 49 symbols!)
     # Actually the S&P 500 as shown does not contain 500 stocks (symbols)


 symbols = get_SP500_symbolsX()
 pass

Oh, yes, and there's no use reading lines to the end once the symbols
are in the bag. The symbol-line-finder conditional section should end
with break.
   And do let us know if you get an answer from Yahoo. Hacks like this
are unreliable. They will almost certainly fail the next time a page gets
redesigned, which can happen at any time.
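
The point about stopping once the symbol line is found can be sketched like
this (Python 3, no network). The marker string is an assumption about what
the page contained, and the page content is hypothetical.

```python
# Assumed marker; the real page markup may differ.
MARKER = '</script><span id="yfs_params_vcr"'

def find_symbol_line(lines):
    """Return (line, lines_examined) for the first line starting with MARKER."""
    examined = 0
    for line in lines:
        examined += 1
        if line.lstrip().startswith(MARKER):
            # Stop here: same effect as `break` in the original loop.
            return line, examined
    return None, examined

# Hypothetical page content with the marker line in second position.
page = ['<html>',
        '  ' + MARKER + ' ...symbols...',
        '<p>rest of page</p>',
        '</html>']
line, examined = find_symbol_line(page)
```

Here only two of the four lines are examined; without the early exit the
loop would scan the remainder of every page for nothing.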

Frederic
 



Re: Financial time series data

2010-09-02 Thread Hidura
But what kind of data do you want to download? The financial time series
pages are basically HTML, and you can work with them very well using a parser.

2010/9/2, Virgil Stokes v...@it.uu.se:
   Has anyone written code or worked with Python software for downloading
 financial time series data (e.g. from Yahoo financial)? If yes,  would you
 please contact me.

 --Thanks,
 V. Stokes
 --
 http://mail.python.org/mailman/listinfo/python-list


-- 
Sent from my mobile device

Diego I. Hidalgo D.


Re: Financial time series data

2010-09-02 Thread Virgil Stokes

On 09/02/2010 08:15 PM, Hidura wrote:

But what kind of data do you want to download? The financial time series
pages are basically HTML, and you can work with them very well using a parser.

2010/9/2, Virgil Stokesv...@it.uu.se:
   

   Has anyone written code or worked with Python software for downloading
financial time series data (e.g. from Yahoo financial)? If yes,  would you
please contact me.

--Thanks,
V. Stokes
--
http://mail.python.org/mailman/listinfo/python-list

 
   
Here is a snippet of python code that I am trying to use for downloading 
financial data; but, I do not understand why it returns information from 
the second HTML page.


  import urllib2
  '''
   I am trying to read each row of the table at:
http://finance.yahoo.com/q/cp?s=^GSPC
  '''
  ticker = []
  url = urllib2.urlopen("http://download.finance.yahoo.com/d/quotes.csv...@%5egspc&f=sl1d1t1c1ohgv&e=.csv&h=PAGE".replace('PAGE', str(0)))

  data = url.read()

Note, it does get all 50 rows of the first page; but, why does it also 
get the first row of the next HTML page?


--V





Re: Financial time series data

2010-09-02 Thread MRAB

On 03/09/2010 00:56, Virgil Stokes wrote:

On 09/02/2010 08:15 PM, Hidura wrote:

But what kind of data do you want to download? The financial time series
pages are basically HTML, and you can work with them very well using a parser.

2010/9/2, Virgil Stokesv...@it.uu.se:

Has anyone written code or worked with Python software for downloading
financial time series data (e.g. from Yahoo financial)? If yes, would
you
please contact me.

--Thanks,
V. Stokes
--
http://mail.python.org/mailman/listinfo/python-list


Here is a snippet of python code that I am trying to use for downloading
financial data; but, I do not understand why it returns information from
the second HTML page.

import urllib2
'''
I am trying to read each row of the table at:
http://finance.yahoo.com/q/cp?s=^GSPC
'''
ticker = []
url = urllib2.urlopen("http://download.finance.yahoo.com/d/quotes.csv...@%5egspc&f=sl1d1t1c1ohgv&e=.csv&h=PAGE".replace('PAGE', str(0)))
data = url.read()

Note, it does get all 50 rows of the first page; but, why does it also
get the first row of the next HTML page?


Did you try downloading from a browser? That also returns an extra row.

Looks like an idiosyncrasy of the site.
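
If the extra row really is just the first row of the next page, one way to
cope is to drop repeated symbols when merging pages. A minimal Python 3
sketch follows; the function name merge_pages and the tiny two-column pages
are hypothetical, and it assumes the symbol is the first CSV field.

```python
def merge_pages(pages):
    """Concatenate CSV rows from several pages, keeping the first row per symbol."""
    seen = set()
    merged = []
    for page in pages:
        for row in page.splitlines():
            if not row.strip():
                continue  # ignore blank lines
            symbol = row.split(',', 1)[0]
            if symbol in seen:
                continue  # already delivered as the extra row of an earlier page
            seen.add(symbol)
            merged.append(row)
    return merged

# Hypothetical pages: page 0 also carries the first row of page 1.
page0 = 'AAA,1.0\nBBB,2.0\nCCC,3.0\n'
page1 = 'CCC,3.0\nDDD,4.0\n'
rows = merge_pages([page0, page1])
```

This keeps the download loop unchanged and does not depend on every page
holding exactly 50 rows.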


Re: Re: Financial time series data

2010-09-02 Thread hidura
I tried to view the page, and the symbol GSPC seems to be wrong; I used ^DJI
instead. Once you have downloaded the page source, use an XML parser to locate
the table element and read it. I can't get to the next page from the browser;
it doesn't appear as a link.
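
A rough sketch of the parser approach, written for Python 3. It uses the
standard library's html.parser rather than an XML parser, since real pages
are often not well-formed XML. The class name and the markup are
hypothetical; the real page layout is not known here.

```python
from html.parser import HTMLParser

class FirstColumnParser(HTMLParser):
    """Collect the text of the first <td> of every table row."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._col = 0            # index of the current cell within the row
        self._in_first = False   # True while inside the first cell of a row

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._col = 0
        elif tag == 'td':
            self._col += 1
            if self._col == 1:
                self._in_first = True
                self.cells.append('')

    def handle_endtag(self, tag):
        if tag == 'td':
            self._in_first = False

    def handle_data(self, data):
        if self._in_first:
            self.cells[-1] += data.strip()

# Hypothetical markup, not the real Yahoo page.
HTML = ('<table><tr><td><a href="/q?s=A">A</a></td><td>29.85</td></tr>'
        '<tr><td><a href="/q?s=AA">AA</a></td><td>10.88</td></tr></table>')

parser = FirstColumnParser()
parser.feed(HTML)
symbols = parser.cells
```

On this sample the parser collects the two symbols from the first column
and ignores the price cells.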

Virgil Stokes v...@it.uu.se wrote:

On 09/02/2010 08:15 PM, Hidura wrote:

But what kind of data do you want to download? The financial time series
pages are basically HTML, and you can work with them very well using a parser.

2010/9/2, Virgil Stokes v...@it.uu.se:

Has anyone written code or worked with Python software for downloading
financial time series data (e.g. from Yahoo financial)? If yes, would you
please contact me.

--Thanks,
V. Stokes
--
http://mail.python.org/mailman/listinfo/python-list

Here is a snippet of python code that I am trying to use for downloading
financial data; but, I do not understand why it returns information from
the second HTML page.

import urllib2
'''
I am trying to read each row of the table at:
http://finance.yahoo.com/q/cp?s=^GSPC
'''
ticker = []
url = urllib2.urlopen("http://download.finance.yahoo.com/d/quotes.csv...@%5egspc&f=sl1d1t1c1ohgv&e=.csv&h=PAGE".replace('PAGE', str(0)))

data = url.read()

Note, it does get all 50 rows of the first page; but, why does it also
get the first row of the next HTML page?

--V