lookup xpath (other?) to value in html
I have about 255 data fields that I am trying to verify on thousands of web pages. For example:

value: 255,000
sqft: 1800

Since I have the correct answers for several pages, I would like to look up the location (XPath?) of each field's value in the page so that I can check the other pages. Any suggestions?

Vincent Davis
720-301-3003
-- https://mail.python.org/mailman/listinfo/python-list
Re: lookup xpath (other?) to value in html
> I have about 255 data fields that I am trying to verify on thousands of web pages. For example:
> value: 255,000
> sqft: 1800
> Since I have the correct answers for several pages, I would like to look up the location (XPath?) of each field's value in the page so that I can check the other pages.

I'm not sure what you are looking for. Do you have a sample web page, and can you show us the output you'd like to see from that web page? Have you looked at http://www.crummy.com/software/BeautifulSoup/?
Re: lookup xpath (other?) to value in html
On Tue, Dec 31, 2013 at 6:45 PM, Jason Friedman jsf80...@gmail.com wrote:
> I'm not sure what you are looking for. Do you have a sample web page, and can you show us the output you'd like to see from that web page? Have you looked at http://www.crummy.com/software/BeautifulSoup/?

For example, take this URL: http://jeffco.us/ats/displaygeneral.do?sch=001690

The land sqft is 11082. Google Chrome gives me the XPath to that data as:

//*[@id="content"]/p[1]/table[4]/tbody/tr[2]/td[8]

What I would like to do (using Python) is the reverse: given 11082, at what XPath can it be found? (There may be more than one.) The examples I can find using Google go the other way: given an XPath, what is the value?

Vincent Davis
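
A minimal sketch of the reverse lookup asked about above (value in, XPath out), assuming the page has already been parsed into an element tree. The function name `find_xpaths` and the `SAMPLE` markup are made up for illustration and are not the real county page. It uses the standard library's `xml.etree.ElementTree`, which only accepts well-formed markup; real-world HTML would need a forgiving parser such as lxml or Beautiful Soup, though the same walk should carry over to any tree whose elements expose `.tag`, `.text` and child iteration.

```python
import xml.etree.ElementTree as ET


def find_xpaths(root, target):
    """Return positional XPaths of every element whose own text equals target."""
    matches = []

    def walk(element, path):
        # Compare only the element's direct text, ignoring surrounding whitespace.
        if (element.text or "").strip() == target:
            matches.append(path)
        seen = {}  # tag name -> number of siblings with that tag seen so far
        for child in element:
            if not isinstance(child.tag, str):
                continue  # skip comments and processing instructions
            seen[child.tag] = seen.get(child.tag, 0) + 1
            walk(child, f"{path}/{child.tag}[{seen[child.tag]}]")

    walk(root, f"/{root.tag}")
    return matches


# Hypothetical, well-formed stand-in for a page; not the real county markup.
SAMPLE = """
<html><body><div id="content">
  <table>
    <tr><td>Land Sqft</td><td>Value</td></tr>
    <tr><td>11082</td><td>255,000</td></tr>
  </table>
  <p>Total: <b>11082</b></p>
</div></body></html>
"""

root = ET.fromstring(SAMPLE)
for xpath in find_xpaths(root, "11082"):
    print(xpath)
```

On the sample this reports two locations, one in the table and one in the paragraph, which is the "may be more than one" case. Paths computed this way come from the parsed source, so they can differ from what Chrome reports; browsers insert `tbody` elements into tables, for instance.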
Re: lookup xpath (other?) to value in html
> For example, take this URL: http://jeffco.us/ats/displaygeneral.do?sch=001690
> The land sqft is 11082. Google Chrome gives me the XPath to that data as:
> //*[@id="content"]/p[1]/table[4]/tbody/tr[2]/td[8]
> What I would like to do (using Python) is the reverse: given 11082, at what XPath can it be found? (There may be more than one.) The examples I can find using Google go the other way: given an XPath, what is the value?

Which Chrome extension are you using to get that path? Are you always interested in the square footage? Here is a solution using Beautiful Soup:

$ cat square-feet.py
#!/usr/bin/env python
import sys

import bs4
import requests

url = sys.argv[1]
response = requests.get(url)
soup = bs4.BeautifulSoup(response.text)

# Scan the page text for the "Land Sqft" heading, then for the "Total"
# line that follows it; the line after "Total" is printed as the result.
is_sqft_mark_found, is_total_mark_found = False, False
for line in soup.get_text().splitlines():
    if line.startswith("Land Sqft"):
        is_sqft_mark_found = True
        continue
    elif is_sqft_mark_found and line.startswith("Total"):
        is_total_mark_found = True
        continue
    elif is_total_mark_found:
        print(line.strip() + " total square feet.")
        break

$ python3 square-feet.py http://jeffco.us/ats/displaygeneral.do?sch=001690
11082 total square feet.
Re: lookup xpath (other?) to value in html
On Tue, Dec 31, 2013 at 10:30 PM, Jason Friedman jsf80...@gmail.com wrote:
> Which Chrome extension are you using to get that path?

It is built in: right-click on the source and copy the XPath.

Yes, that gets the square footage, and I like how you did it. Are you interested in doing that for all the information on the page, and also the historical pages? ;-)

Since I have the data for some of the pages (I got it from the county on a CD), I thought defining the XPath would be easier, using bs4 or http://lxml.de/

Vincent Davis
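
The idea in this message (learn where each field lives from pages with known answers, then check the remaining pages at those locations) could be sketched roughly as follows. Everything here is hypothetical: the function names, the two sample pages and the field values are invented for illustration. It also assumes that all pages share one layout and are well-formed enough for the standard library's `xml.etree.ElementTree`; real pages would likely need lxml or bs4 for parsing.

```python
import xml.etree.ElementTree as ET


def text_paths(root):
    """Map each element's stripped text to the positional paths where it occurs."""
    found = {}

    def walk(element, path):
        text = (element.text or "").strip()
        if text:
            found.setdefault(text, []).append(path)
        seen = {}  # tag name -> number of siblings with that tag seen so far
        for child in element:
            if not isinstance(child.tag, str):
                continue  # skip comments and processing instructions
            seen[child.tag] = seen.get(child.tag, 0) + 1
            walk(child, f"{path}/{child.tag}[{seen[child.tag]}]")

    walk(root, ".")  # relative paths, usable with Element.findtext()
    return found


def learn_locations(root, known):
    """For each field with a known value, record the paths holding that value."""
    paths = text_paths(root)
    return {field: paths.get(value, []) for field, value in known.items()}


def check_page(root, locations, expected):
    """Return fields whose expected value is not at any learned path."""
    mismatches = {}
    for field, value in expected.items():
        actual = [(root.findtext(p) or "").strip() for p in locations.get(field, [])]
        if value not in actual:
            mismatches[field] = actual  # what the page shows instead
    return mismatches


# Hypothetical pages with the same layout; not the real county markup.
PAGE_A = """<html><body><table>
<tr><td>Land Sqft</td><td>Value</td></tr>
<tr><td>11082</td><td>255,000</td></tr>
</table></body></html>"""

PAGE_B = """<html><body><table>
<tr><td>Land Sqft</td><td>Value</td></tr>
<tr><td>1800</td><td>199,000</td></tr>
</table></body></html>"""

# Learn the field locations from a page whose answers are known ...
locations = learn_locations(ET.fromstring(PAGE_A), {"sqft": "11082", "value": "255,000"})
print(locations)

# ... then verify another page against the values on file.
print(check_page(ET.fromstring(PAGE_B), locations, {"sqft": "1800", "value": "200,000"}))
```

A value that appears in several places on the known page yields several candidate paths; comparing the candidates across a few known pages would be one way to narrow them down to the path that is right every time.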