Re: [Tutor] (regular expression)

isaac tetteh Sat, 10 Dec 2016 17:55:05 -0800

This is the real code:


import urllib.request
from bs4 import BeautifulSoup

with urllib.request.urlopen("https://www.sdstate.edu/electrical-engineering-and-computer-science") as cs:
    cs_page = cs.read()
    soup = BeautifulSoup(cs_page, "html.parser")
    print(len(soup.body.find_all(string=["Engineering", "engineering"])))

I used Ctrl+F on the page at the link in the code. Ctrl+F finds 11 matches,
but the code reports 3.

Thanks
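
A likely cause of the gap, offered as a sketch rather than a diagnosis of this
particular page: when find_all(string=...) is given plain strings, it matches
only text nodes whose whole text equals one of them, while Ctrl+F counts every
occurrence of the word inside longer text. The example below uses only the
standard library so it runs without bs4 or network access. The names
TextCollector, count_exact_nodes, count_occurrences and the sample HTML are
made up for illustration; they are not part of the original code.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def collect_text_nodes(html):
    """Return the list of text nodes found in the HTML string."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.chunks


def count_exact_nodes(html, words):
    """Count text nodes whose entire text equals one of the given words."""
    return sum(1 for chunk in collect_text_nodes(html) if chunk in words)


def count_occurrences(html, word):
    """Count every case-insensitive occurrence of word in the page text."""
    text = " ".join(collect_text_nodes(html))
    return len(re.findall(re.escape(word), text, flags=re.IGNORECASE))


# Hypothetical sample page, not the real sdstate.edu markup.
sample = (
    "<html><body><h1>Engineering</h1>"
    "<p>Electrical Engineering and Computer Science</p>"
    "<p>Our engineering students study engineering.</p>"
    "<script>var s = 'engineering';</script>"
    "</body></html>"
)

print(count_exact_nodes(sample, ["Engineering", "engineering"]))
print(count_occurrences(sample, "engineering"))
```

With bs4 itself, a comparable substring count could probably be had by running
the same regular expression over soup.get_text(). The number may still differ
from the browser's, since Ctrl+F searches the rendered page, which can include
text added by scripts after loading.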




________________________________
From: Tutor <[email protected]> on behalf of Bob 
Gailer <[email protected]>
Sent: Saturday, December 10, 2016 7:54 PM
To: Tetteh, Isaac - SDSU Student
Cc: Python Tutor
Subject: Re: [Tutor] (no subject)

On Dec 10, 2016 12:15 PM, "Tetteh, Isaac - SDSU Student" <
[email protected]> wrote:
>
> Hello,
>
> I am trying to find the number of times a word occurs on a webpage, so I
> used the bs4 code below.
>
> Let's assume html contains the HTML code:
> soup = BeautifulSoup(html, "html.parser")
> print(len(soup.find_all(string=["Engineering","engineering"])))
> But the result is different from what I get when I use Ctrl+F on my
> keyboard to search the page.
>
> Please help me understand why the results are different. Thanks
> I am using Python 3.5
>
What is the URL of the web page?
To what are you applying control-f?
What are the two different counts you're getting?
Is it possible that the page is being dynamically altered after it's loaded?
_______________________________________________
Tutor maillist  -  [email protected]
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor



