Hi Jevon,
 
Yes, I was hoping to simply check every link of an application whilst
running integration tests with Maven.  The aim is to quickly check that
no exceptions are thrown.
 
I have come up with the following code, which may be useful to others:
 
 // only visit a page once, i.e. not for different parameters
 private boolean useDiffPagesOnly = true;
 
 public void testSpider() {
   final Set<String> identifiedLinks = Sets.newHashSet();
   final Stack<String> toVisit = new Stack<String>();
   final String base = "http://localhost:" + getPort();
 
   final String startPage = "/index.html";
   identifiedLinks.add(startPage);
   toVisit.push(startPage);
 
   int count = 0;
   // do not check links that CONTAIN the following strings
   final HashSet<String> forbidden = Sets.newHashSet("delete", "inventorySummaries.html");
   while (!toVisit.isEmpty()) {
     count++;
     gotoPage(toVisit.pop());
     grabLinksInPage(identifiedLinks, toVisit, base, forbidden);
   }
 }
 
 private void grabLinksInPage(final Set<String> identifiedLinks, final Stack<String> toVisit,
     final String base, final Set<String> avoid) {
   final List<IElement> elementsByXPath = getElementsByXPath("//a");
 
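   for (final IElement ie : elementsByXPath) {
   final String href = ie.getAttribute("href");

   // skip anchors without a usable href before calling String methods on it
   if (StringUtils.isBlank(href)) {
     continue;
   }

   // key used for duplicate detection: the link without its query string
   final String linkForDup = useDiffPagesOnly && href.indexOf("?") > 0 ?
       href.substring(0, href.indexOf("?")) : href;

   // only follow links inside the application that have not been seen yet
   if ((href.startsWith(base) || !href.startsWith("http://"))
       && !identifiedLinks.contains(linkForDup)) {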
     boolean shouldInclude = true;
 
     for (final String toAvoid : avoid) {
       if (href.contains(toAvoid)) {
       shouldInclude = false;
       break;
       }
     }
 
     identifiedLinks.add(linkForDup);
     if (shouldInclude) {
       toVisit.push(href);
     }
   }
   }
 }
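
In case it helps anyone who wants to try the filtering rules without a
running application, here is a rough standalone sketch of the same idea.
The class LinkFilter and its methods dedupKey and collect are names made
up for this illustration (they are not part of JWebUnit), and it takes
plain href strings instead of IElement objects:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of JWebUnit: the link filtering from
// grabLinksInPage, pulled out so it can run without a browser.
class LinkFilter {

  // Key used to detect duplicates: the href without its query string when
  // pagesOnly is true, otherwise the href unchanged.
  static String dedupKey(final String href, final boolean pagesOnly) {
    final int q = href.indexOf('?');
    return pagesOnly && q > 0 ? href.substring(0, q) : href;
  }

  // Pushes each new, local, non-forbidden href onto toVisit. Every new
  // local href is recorded in seen, forbidden or not, so it is examined
  // only once. This sketch always ignores query strings when deduplicating.
  static void collect(final List<String> hrefs, final Set<String> seen,
      final Deque<String> toVisit, final String base, final Set<String> avoid) {
    for (final String href : hrefs) {
      if (href == null || href.trim().isEmpty()) {
        continue; // anchor without a usable href
      }
      final boolean local = href.startsWith(base) || !href.startsWith("http://");
      // Set.add returns false when the key was already present
      if (!local || !seen.add(dedupKey(href, true))) {
        continue;
      }
      boolean allowed = true;
      for (final String toAvoid : avoid) {
        if (href.contains(toAvoid)) {
          allowed = false;
          break;
        }
      }
      if (allowed) {
        toVisit.push(href);
      }
    }
  }
}
```

With this, "/a.html?x=1" and "/a.html?x=2" count as the same page, so only
the first one is queued, and any link containing "delete" is remembered
but never visited.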

Benoit

 
________________________________

From: Jevon Wright [mailto:je...@jevon.org] 
Sent: 21 April 2010 00:16
To: Usage problems for JWebUnit
Subject: Re: [JWebUnit-users] How to get all links in current page


Hi Benoit,

Interesting question - I am sure you could do something like that with
JWebUnit (through xpath, etc), but I imagine using a piece of software
developed specifically for dumping a site would be better, unless you
want to verify some unit tests across an entire site?

Software like wget comes to mind.

Cheers
Jevon


On Tue, Apr 20, 2010 at 10:44 PM, Xhenseval, Benoit
<benoit.xhense...@credit-suisse.com> wrote:


        Hi All
        
        I'm new to JWebUnit.
        
        I'm trying to develop the most obvious check... A spider that would
        follow all links within a given domain.
        
        Is there support for such a thing?
        
        If not, how could I get all links from a given page?
        
        I'm not using Selenium.
        
        Thanks a lot
        
        Benoit
        
        
        ===============================================================================
        Please access the attached hyperlink for an important electronic
        communications disclaimer:
        http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
        ===============================================================================
        
        
        
        _______________________________________________
        JWebUnit-users mailing list
        JWebUnit-users@lists.sourceforge.net
        https://lists.sourceforge.net/lists/listinfo/jwebunit-users
        


