I'd like to suggest an enhancement to the Scrape taglib:
 
- The current functionality allows one to specify the 'begin' and 'end'
strings between which to perform the scrape.
 
- Unfortunately, some sites have complex HTML, so trying to find the Nth
'<table>' tag is cumbersome.  (Perhaps the feature is already there, but I
didn't see it in the docs.)
 
- Here's some code from the example page shipped with the scrape taglib:
<scrp:scrape id="weather1" begin="<PRE>" end="</PRE>" anchors="true"/>
 
- My suggestion is to include optional parameters that allow developers to
specify the Nth occurrence of the begin/end strings.  Perhaps something like
the following:
<scrp:scrape id="weather1" begin="<PRE>" beginOccurrence="9" end="</PRE>"
endOccurrence="3" anchors="true"/>
 
In the above line, the scrape would begin at the ninth occurrence of
<PRE> and end at the third occurrence of </PRE> after that point.  The idea is
that a developer is not necessarily scraping between HTML tags.  The begin/end
strings could be any text in the HTML of a given page.  The ability to specify
the Nth occurrence of the begin/end strings would make the Scrape taglib even
more powerful.
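
To make the intended behaviour concrete, here is a minimal sketch of the
lookup in plain Java.  It is not taglib code: the class and method names are
hypothetical, and I am assuming that occurrences are counted from 1, that they
do not overlap, and that the begin/end strings themselves are left out of the
result.

```java
final class NthOccurrenceScrape {

    private NthOccurrenceScrape() {}

    // Index of the n-th (1-based) non-overlapping occurrence of needle in text,
    // searching from index 'from'. Returns -1 if there are fewer than n.
    public static int indexOfNth(String text, String needle, int n, int from) {
        if (n < 1 || needle.isEmpty()) {
            return -1;
        }
        int idx = -1;
        int start = from;
        for (int i = 0; i < n; i++) {
            idx = text.indexOf(needle, start);
            if (idx < 0) {
                return -1;
            }
            start = idx + needle.length();
        }
        return idx;
    }

    // Text between the beginOccurrence-th 'begin' string and the
    // endOccurrence-th 'end' string that follows it, or null if either
    // string does not occur often enough. (Hypothetical helper.)
    public static String scrape(String page, String begin, int beginOccurrence,
                                String end, int endOccurrence) {
        int b = indexOfNth(page, begin, beginOccurrence, 0);
        if (b < 0) {
            return null;
        }
        int contentStart = b + begin.length();
        int e = indexOfNth(page, end, endOccurrence, contentStart);
        if (e < 0) {
            return null;
        }
        return page.substring(contentStart, e);
    }

    public static void main(String[] args) {
        String page = "<PRE>a</PRE><PRE>b</PRE><PRE>c</PRE>";
        System.out.println(scrape(page, "<PRE>", 2, "</PRE>", 1)); // prints b
        System.out.println(scrape(page, "<PRE>", 2, "</PRE>", 2)); // prints b</PRE><PRE>c
    }
}
```

Counting the end string from the begin position, rather than from the top of
the page, is what lets the two counters be chosen independently.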
 
I would do this myself, but I don't know enough about the taglib
architecture to modify it or to write sample code that demonstrates the
above suggestion.
 

Jay Patel 
Core Energy Solutions
http://www.core-energy.com/
713-215-7690 

 
