On 10/03/2021 04:35, S Monzur wrote:
> Thanks! I ended up using Beautiful Soup to remove the HTML tags and create
> three lists (article titles, publication dates, main body), but I am still
> facing a problem where the list is not properly storing the main body.
> There is something wrong with my code for that section, and any comment
> would be really helpful!

> ListFile Text
> <https://drive.google.com/file/d/1V3s8w8a3NQvex91EdOhdC9rQtCAOElpm/view?usp=sharing>

How did you create that file?

> BeautifulSoup code for removing tags <https://pastebin.com/qvbVMUGD>

> print(bodytext[0])  # so here, I'm only getting the first paragraph of the
>                     # body of the first article, not all of the first article
>
> print(bodytext[1])  # here, I'm getting the second paragraph of the first
>                     # article, and not the second article

It may help to process each article individually with Beautiful Soup, rather than the whole list at once.
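
A minimal sketch of that idea, under some assumptions: I have not seen how your file is laid out, so the `articles` list, the `<h1>` title tag and the `<p>` body tags below are made up for illustration. The point is only that each article gets its own soup, and all of its paragraphs are joined into a single entry of `bodytext`.

```python
from bs4 import BeautifulSoup

# Hypothetical input: one HTML string per article. The real file layout is
# not shown in this thread, so these tags are assumptions.
articles = [
    "<h1>First title</h1><p>First paragraph.</p><p>Second paragraph.</p>",
    "<h1>Second title</h1><p>Only paragraph.</p>",
]

titles = []
bodytext = []
for html in articles:
    # Parse one article at a time, not the whole list at once.
    soup = BeautifulSoup(html, "html.parser")

    heading = soup.find("h1")
    titles.append(heading.get_text(strip=True) if heading else "")

    # Collect every paragraph of this article and store them as ONE entry,
    # so that bodytext[i] is the full body of article i.
    paragraphs = [p.get_text(strip=True) for p in soup.find_all("p")]
    bodytext.append("\n".join(paragraphs))

print(bodytext[0])  # all paragraphs of the first article
print(bodytext[1])  # the body of the second article
```

With this layout, `titles[i]` and `bodytext[i]` always refer to the same article, which is probably what you want for the three parallel lists.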

--
https://mail.python.org/mailman/listinfo/python-list
