Re: How to loop over a text file (to remove tags and normalize) using Python

Peter Otten Wed, 10 Mar 2021 01:49:20 -0800

On 10/03/2021 04:35, S Monzur wrote:

> Thanks! I ended up using Beautiful Soup to remove the HTML tags and create
> three lists (article titles, publication dates, main body) but am still
> facing a problem where the list is not properly storing the main body.
> There is something wrong with my code for that section, and any comment
> would be really helpful!
>
> List File Text
> <https://drive.google.com/file/d/1V3s8w8a3NQvex91EdOhdC9rQtCAOElpm/view?usp=sharing>


How did you create that file?

> BeautifulSoup code for removing tags <https://pastebin.com/qvbVMUGD>

> print(bodytext[0]) # so here, I'm only getting the first paragraph of the body
> of the first article, not all of the first article
>
> print(bodytext[1]) # here, I'm getting the second paragraph of the first
> article, and not the second article

It may help if you process the individual articles with beautiful soup, not the whole list at once.
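
Something along these lines, as a rough sketch only. I don't know what your markup looks like, so the sample data, the assumption that body paragraphs are in <p> tags, and the helper name extract_body() are all made up for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical sample data: one HTML string per article.
# The real file's markup is unknown, so <h1> and <p> are assumptions.
articles = [
    "<h1>First title</h1><p>First paragraph.</p><p>Second paragraph.</p>",
    "<h1>Second title</h1><p>Only paragraph.</p>",
]


def extract_body(article_html):
    """Return the text of all <p> tags of one article as a single string."""
    # Parse just this one article, not the whole list.
    soup = BeautifulSoup(article_html, "html.parser")
    paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
    # Join the paragraphs so each article yields exactly one list entry.
    return "\n".join(paragraphs)


# One entry per article instead of one entry per paragraph.
bodytext = [extract_body(article_html) for article_html in articles]

print(bodytext[0])
print(bodytext[1])
```

Because the paragraphs are joined per article, bodytext[0] should hold the whole body of the first article and bodytext[1] the body of the second.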


--
https://mail.python.org/mailman/listinfo/python-list
