lookup xpath (other?) to value in html

2013-12-31 Thread Vincent Davis
I have about 255 data fields that I am trying to verify on thousands of
webpages.
For example:
value: 255,000
sqft: 1800

Since I have the correct answers for several pages, I would like to look up
the location (xpath?) of each data/field value in the page so that I can
check other pages.

Any suggestions?

Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: lookup xpath (other?) to value in html

2013-12-31 Thread Jason Friedman
> I have about 255 data fields that I am trying to verify on thousands of
> webpages.
> For example:
> value: 255,000
> sqft: 1800
>
> Since I have the correct answers for several pages, I would like to look up
> the location (xpath?) of each data/field value in the page so that I can
> check other pages.

I'm not sure what you are looking for.  Do you have a sample web page,
and can you show us the output you'd like to see from that webpage?
Have you looked at http://www.crummy.com/software/BeautifulSoup/?


Re: lookup xpath (other?) to value in html

2013-12-31 Thread Vincent Davis

> I'm not sure what you are looking for.  Do you have a sample web page,
> and can you show us the output you'd like to see from that webpage?
> Have you looked at http://www.crummy.com/software/BeautifulSoup/?


For example, take this URL:
http://jeffco.us/ats/displaygeneral.do?sch=001690
The land sqft is 11082.
Google Chrome gives me the xpath to that data as:
//*[@id="content"]/p[1]/table[4]/tbody/tr[2]/td[8]

What I would like to do (using Python) is: given 11082, at what xpath can
that be found? (There may be more than one.)
The examples I can find using Google go the other way: given an xpath, what
is the value (the opposite of what I want).
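
One way to go from a value back to its location is sketched below with lxml, whose ElementTree objects provide getpath(). This is only a minimal sketch under assumptions: the SAMPLE markup is a made-up stand-in for the county page, the function name xpaths_for_value is hypothetical, and only an element's own leading text is compared, so values split across child tags or sitting in tail text would be missed.

```python
from lxml import html


def xpaths_for_value(page_source, value):
    """Return the XPath of every element whose own text equals `value`."""
    root = html.fromstring(page_source)
    tree = root.getroottree()
    matches = []
    for element in root.iter():
        # Comments and processing instructions have a non-string .tag; skip them.
        if not isinstance(element.tag, str):
            continue
        # .text is only the text before the first child element.
        text = (element.text or "").strip()
        if text == value:
            matches.append(tree.getpath(element))
    return matches


# Hypothetical stand-in for the real page; the actual markup will differ.
SAMPLE = """
<html><body><div id="content">
<table>
<tr><th>Land Sqft</th><th>Value</th></tr>
<tr><td>11082</td><td>255,000</td></tr>
</table>
</div></body></html>
"""

for path in xpaths_for_value(SAMPLE, "11082"):
    print(path)
```

The paths printed this way are absolute (they start at /html) and are computed from the raw page source, so they will probably not match what Chrome copies: the browser adds tbody elements to the DOM and prefers an id-based prefix.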

Vincent Davis




Re: lookup xpath (other?) to value in html

2013-12-31 Thread Jason Friedman
> For example, take this URL:
> http://jeffco.us/ats/displaygeneral.do?sch=001690
> The land sqft is 11082.
> Google Chrome gives me the xpath to that data as:
> //*[@id="content"]/p[1]/table[4]/tbody/tr[2]/td[8]
>
> What I would like to do (using Python) is: given 11082, at what xpath can
> that be found? (There may be more than one.)
> The examples I can find using Google go the other way: given an xpath, what
> is the value (the opposite of what I want).

Which Chrome extension are you using to get that path?

Are you always interested in the square footage?  Here is a solution
using Beautiful Soup:

$ cat square-feet.py
#!/usr/bin/env python3
import sys

import bs4
import requests

url = sys.argv[1]
response = requests.get(url)
soup = bs4.BeautifulSoup(response.text, "html.parser")
is_sqft_mark_found, is_total_mark_found = False, False
# Scan the page text: find the "Land Sqft" header, then the "Total" row,
# and print the line that follows it.
for line in soup.get_text().splitlines():
    if line.startswith("Land Sqft"):
        is_sqft_mark_found = True
        continue
    elif is_sqft_mark_found and line.startswith("Total"):
        is_total_mark_found = True
        continue
    elif is_total_mark_found:
        print(line.strip() + " total square feet.")
        break

$ python3 square-feet.py http://jeffco.us/ats/displaygeneral.do?sch=001690
11082 total square feet.


Re: lookup xpath (other?) to value in html

2013-12-31 Thread Vincent Davis

> Which Chrome extension are you using to get that path?

Built in: right click on source > copy xpath

Yes, that gets the square footage, and I like how you did it. Are you
interested in doing that for all the information on the page and also the
historical pages? ;-)
Since I have the data for some of the pages (I got it from the county on
a CD), I thought defining the xpath would be easier using bs4 or
http://lxml.de/
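
The idea of learning the locations from a page with known data and then checking other pages could look roughly like the sketch below. It assumes lxml, assumes every page shares the same layout, and uses a made-up PAGE_TEMPLATE in place of the real markup; learn_paths and check_page are hypothetical helper names, not part of any library.

```python
from lxml import html


def learn_paths(page_source, known_fields):
    """Map each field name to the first XPath whose text equals its known value."""
    root = html.fromstring(page_source)
    tree = root.getroottree()
    paths = {}
    for element in root.iter():
        # Comments and processing instructions have a non-string .tag; skip them.
        if not isinstance(element.tag, str):
            continue
        text = (element.text or "").strip()
        for name, value in known_fields.items():
            if text == value and name not in paths:
                paths[name] = tree.getpath(element)
    return paths


def check_page(page_source, paths, expected_fields):
    """Return {field: (expected, found)} for each field that does not match."""
    root = html.fromstring(page_source)
    mismatches = {}
    for name, path in paths.items():
        nodes = root.xpath(path)
        found = (nodes[0].text or "").strip() if nodes else None
        if found != expected_fields[name]:
            mismatches[name] = (expected_fields[name], found)
    return mismatches


# Hypothetical layout shared by all pages; the real pages will differ.
PAGE_TEMPLATE = (
    "<html><body><div id='content'><table>"
    "<tr><th>Land Sqft</th><th>Value</th></tr>"
    "<tr><td>{sqft}</td><td>{value}</td></tr>"
    "</table></div></body></html>"
)

known_page = PAGE_TEMPLATE.format(sqft="11082", value="255,000")
other_page = PAGE_TEMPLATE.format(sqft="1800", value="310,500")

paths = learn_paths(known_page, {"sqft": "11082", "value": "255,000"})
print(check_page(other_page, paths, {"sqft": "1800", "value": "999"}))
```

If the same value appears in more than one place on the known page, taking the first match may pick the wrong cell, so learning from several known pages and keeping only the paths they agree on would be more robust.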




Vincent Davis
720-301-3003

