I think it may have to do with the fact that you never close the stream
you're writing to. Have you tried printing the contents of csv_file before
writing to it? That would let you confirm it is empty before you throw more
lines into it. I added two comments near the top of the code snippet -
feel free to uncomment the second one to see that information.
Anyway, I modified the bottom section of your export class. I haven't
worked much with either csv or cStringIO, so no guarantees this
fixes your problem, but it seems like leaving the stream open could be
causing issues.
    csv_file = StringIO()
    # debugging info
    # print csv_file.getvalue()
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(['Name', 'Link'])

    soup = BeautifulSoup(html)
    for anchor_tag in soup.findAll('a', href=True):
        csv_writer.writerow([anchor_tag.text, anchor_tag['href']])

    web.header('Content-Type', 'text/csv')
    web.header('Content-disposition', 'attachment; filename=%s.csv' % file_name)

    returnval = csv_file.getvalue()
    csv_file.close()
    return returnval

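In case it's useful, here is the same section written with contextlib.closing,
which guarantees the buffer gets closed even if something raises partway
through. This is only a sketch reusing the names from your snippet (html,
file_name, and the existing imports), and I haven't tested it against web.py:

    from contextlib import closing

    with closing(StringIO()) as csv_file:
        csv_writer = csv.writer(csv_file)
        csv_writer.writerow(['Name', 'Link'])

        soup = BeautifulSoup(html)
        for anchor_tag in soup.findAll('a', href=True):
            csv_writer.writerow([anchor_tag.text, anchor_tag['href']])

        web.header('Content-Type', 'text/csv')
        web.header('Content-disposition', 'attachment; filename=%s.csv' % file_name)
        # read the value while the buffer is still open; closing() frees it on exit
        return csv_file.getvalue()
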
On Fri, Feb 22, 2013 at 5:03 AM, shiva krishna <[email protected]> wrote:
> I am using the python `web.py` framework to build a small web app.
>
> It consists of:
>
> 1. A `Home page` that takes a url as input
> 2. Code that reads the `anchor text` and `anchor tags` from it
> 3. Code that writes them to a csv file and downloads it
>
> Steps 2 and 3 happen when the `export the links` button is clicked; my
> code is below.
>
> **code.py**
>
>
>     import web
>     from web import form
>     import urlparse
>     from urlparse import urlparse as ue
>     import urllib2
>     from BeautifulSoup import BeautifulSoup
>     import csv
>     from cStringIO import StringIO
>
>     urls = (
>         '/', 'index',
>         '/export', 'export',
>     )
>
>     app = web.application(urls, globals())
>     render = web.template.render('templates/')
>
>     class index:
>         def GET(self):
>             return render.home()
>
>
>     class export:
>
>         def GET(self):
>             i = web.input()
>             if i.has_key('url') and i['url'] != '':
>                 url = i['url']
>                 page = urllib2.urlopen(url)
>                 html = page.read()
>                 page.close()
>
>                 decoded = ue(url).hostname
>                 if decoded.startswith('www.'):
>                     decoded = ".".join(decoded.split('.')[1:])
>                 file_name = str(decoded.split('.')[0])
>
>                 csv_file = StringIO()
>                 csv_writer = csv.writer(csv_file)
>                 csv_writer.writerow(['Name', 'Link'])
>
>                 soup = BeautifulSoup(html)
>                 for anchor_tag in soup.findAll('a', href=True):
>                     csv_writer.writerow([anchor_tag.text, anchor_tag['href']])
>                 web.header('Content-Type', 'text/csv')
>                 web.header('Content-disposition', 'attachment; filename=%s.csv' % file_name)
>                 return csv_file.getvalue()
>
>
>     if __name__ == "__main__":
>         app.run()
>
>
> **home.html**:
>
>     $def with()
>     <html>
>         <head>
>             <title>Home Page</title>
>         </head>
>         <body>
>             <form method="GET" action='/export'>
>                 <input type="text" name="url" maxlength="500" />
>                 <input class="button" type="submit" name="export the links" value="export the links" />
>             </form>
>         </body>
>     </html>
>
>
> The above html code displays a form with a text box that takes a url and
> an `export the links` button that downloads/exports a csv file containing
> the anchor tag links and text.
>
> 1. For example, when we submit `http://www.google.co.in` and click `export
> the links`, all the anchor urls and anchor text are saved into the csv file
> and it downloads successfully.
>
> 2. But when we then give another url such as `http://stackoveflow.com` and
> click the `export the links` button, the csv file (named with the domain of
> the url, as shown in the above code) downloads with the anchor tag links,
> but it also contains the data (anchor text and links) of the previous url,
> `http://www.google.co.in`.
>
> That is, data from different urls is ending up in the same csv file. Can
> anyone please tell me what is wrong in the above code (the `export` class)
> that generates the csv file, and why the old data is carried over instead
> of a new csv file being created with a different, dynamically generated
> name?
>
> Finally, my intention is to download/export a new csv file each time a new
> url is given, named with the domain of that url (sliced as above in my
> code) and containing only the data (anchor tag text and url) from that url.
>
> Can anyone please extend/make the necessary changes to my above code to
> download an individual csv file for each individual url?
>