I am using the Python `web.py` framework to build a small web app.
It does the following:
1. Shows a `Home page` that takes a URL as input
2. Reads the `anchor text` and `anchor tags` from that URL
3. Writes them to a CSV file and downloads it
Steps 2 and 3 happen when the `export the links` button is clicked.
Below is my code.
**code.py**

    import web
    from web import form
    import urlparse
    from urlparse import urlparse as ue
    import urllib2
    from BeautifulSoup import BeautifulSoup
    import csv
    from cStringIO import StringIO

    urls = (
        '/', 'index',
        '/export', 'export',
    )

    app = web.application(urls, globals())
    render = web.template.render('templates/')

    class index:
        def GET(self):
            return render.home()

    class export:

        def GET(self):
            i = web.input()
            if i.has_key('url') and i['url'] != '':
                url = i['url']
                page = urllib2.urlopen(url)
                html = page.read()
                page.close()

                decoded = ue(url).hostname
                if decoded.startswith('www.'):
                    decoded = ".".join(decoded.split('.')[1:])
                file_name = str(decoded.split('.')[0])

                csv_file = StringIO()
                csv_writer = csv.writer(csv_file)
                csv_writer.writerow(['Name', 'Link'])

                soup = BeautifulSoup(html)
                for anchor_tag in soup.findAll('a', href=True):
                    csv_writer.writerow([anchor_tag.text, anchor_tag['href']])
                web.header('Content-Type', 'text/csv')
                web.header('Content-disposition', 'attachment; filename=%s.csv' % file_name)
                return csv_file.getvalue()

    if __name__ == "__main__":
        app.run()

**home.html**:

    $def with()
    <html>
      <head>
        <title>Home Page</title>
      </head>
      <body>
        <form method="GET" action='/export'>
          <input type="text" name="url" maxlength="500" />
          <input class="button" type="submit" name="export the links" value="export the links" />
        </form>
      </body>
    </html>

The above HTML displays a form with a text box that takes a URL and an
`export the links` button that downloads/exports a CSV file containing
the anchor tag links and text.
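
To make the extraction step concrete outside of web.py, here is a minimal sketch of collecting the (anchor text, href) pairs from an HTML string. It is written for Python 3 and uses the standard library's `html.parser` instead of the BeautifulSoup call in my code, so it is a stand-in rather than the same implementation; the names `AnchorCollector` and `extract_links` are made up for this illustration.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (text, href) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (text, href) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:          # mirrors findAll('a', href=True)
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Return a new list of (text, href) pairs for the given HTML string."""
    parser = AnchorCollector()            # new parser, so no state is reused
    parser.feed(html)
    parser.close()
    return parser.links
```

Each call to `extract_links` builds its own parser, so the pairs returned for one page cannot contain links from a page parsed earlier.
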
1. For example, when we submit `http://www.google.co.in` and click `export
the links`, all the anchor URLs and anchor text are saved to a CSV file,
which downloads successfully.
2. But when we immediately afterwards enter another URL, such as
`http://stackoveflow.com`, and click the `export the links` button,
the CSV file (named after the URL's domain, as shown in the code above)
downloads with that page's links, but it also contains the data
(anchor text and links) of the previous URL,
`http://www.google.co.in`.
That is, data from different URLs ends up in the same CSV file.
Can anyone please tell me what is wrong in the above code (the `export` class)
that generates the CSV file? Why does the data carry over instead of
a new CSV file, with a different dynamically created name, being produced?
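
As a point of comparison, here is a minimal sketch (Python 3, standard library only, not the web.py code above) of how this kind of carry-over can arise. It contrasts a buffer created inside the function with a buffer shared at module level. Whether this is what happens in my app is only an assumption; `build_csv`, `build_csv_shared` and `_shared_buffer` are hypothetical names used for the illustration.

```python
import csv
import io


def build_csv(rows):
    """Return CSV text for one export, using a buffer created for this call."""
    buffer = io.StringIO()                      # fresh buffer on every call
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Name", "Link"])
    writer.writerows(rows)
    return buffer.getvalue()


# Counter-example: a single buffer that lives as long as the process does.
_shared_buffer = io.StringIO()


def build_csv_shared(rows):
    """Return CSV text, but keep writing into the same long-lived buffer."""
    writer = csv.writer(_shared_buffer, lineterminator="\n")
    writer.writerow(["Name", "Link"])
    writer.writerows(rows)
    return _shared_buffer.getvalue()            # includes rows from earlier calls
```

Calling `build_csv` twice gives two independent results, while the second call to `build_csv_shared` returns the first call's header and rows followed by the new ones, which matches the symptom described above.
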
Finally, my intention is to download/export a new CSV file each time a new
URL is given, named after the URL's domain (sliced as in my code above) and
containing only the data (anchor tag text and URL) from that URL.
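
The file naming I have in mind can be sketched on its own as follows. This is a Python 3 version of the slicing in my code (which uses the Python 2 `urlparse` module); the function name `csv_name_for` and the `"export"` fallback for URLs without a hostname are assumptions added for the example.

```python
from urllib.parse import urlparse


def csv_name_for(url):
    """Derive a CSV file name from the first label of the URL's hostname."""
    # Fallback name is an assumption; my original code does not handle this case.
    host = urlparse(url).hostname or "export"
    if host.startswith("www."):
        host = host[len("www."):]               # drop the leading "www."
    return host.split(".")[0] + ".csv"
```

With this, `http://www.google.co.in` maps to `google.csv`, and a URL on a different domain maps to a different file name.
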
Can anyone please extend or change my code above so that it downloads
an individual CSV file for each individual URL?