Given that everything is in memory for that test (including the DB ->
HSQLDB), it is pretty fast.
We have < 100 different pages and that takes ~2 min.
If I try every possibility (including the parameters), it is ~50,000
pages and ~15 min, which is no problem for a regular CI build.
 
By the way... how can I stop the debug output from logging everything?
I am sure it is something I should do in log4j.xml, but so far I have
not managed to.
 
Things like this:
 
2010-04-20 14:40:14,591 DEBUG [wire.header#?] (main:) >> "GET
/escape/indexData.html?id=.dMIBE00000PEU HTTP/1.1[\r][\n]"
...
2010-04-20 14:40:14,607 DEBUG [wire.content#?] (main:) << "<?xml
version="1.0" encoding="UTF-8" ?>[\r][\n]"
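 
Presumably something along these lines in log4j.xml would quieten
those categories (assuming the loggers are named
httpclient.wire.header and httpclient.wire.content, as the output
suggests; the exact names may differ with your HttpClient version):
 
    <!-- raise the level of Commons HttpClient wire logging -->
    <logger name="httpclient.wire">
        <level value="info"/>
    </logger>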

Thanks,
 
Benoit.

________________________________

From: Jevon Wright [mailto:je...@jevon.org] 
Sent: 21 April 2010 10:04
To: Usage problems for JWebUnit
Subject: Re: [JWebUnit-users] How to get all links in current page


Ah I see what you mean - yes, that's quite a useful check :-) Of
course, this only checks pages reachable through explicit <a> links,
not content reached through forms, etc. Thanks for sharing your code.

How long does it take to run across your site?

Cheers
Jevon


On Wed, Apr 21, 2010 at 9:00 PM, Xhenseval, Benoit
<benoit.xhense...@credit-suisse.com> wrote:


        Hi Jevon,
         
        Yes, I was hoping to simply check every link of an application
whilst running integration tests with Maven.  The aim is to quickly
check that no exceptions are thrown.
         
        I have come up with the following code, which may be useful to
others:
         
        import java.util.List;
        import java.util.Set;
        import java.util.Stack;

        import net.sourceforge.jwebunit.api.IElement; // package may differ by version

        import org.apache.commons.lang.StringUtils;
        import com.google.common.collect.Sets;

        // only visit a page once, i.e. not once per distinct query string
        private boolean useDiffPagesOnly = true;

        public void testSpider() {
            final Set<String> identifiedLinks = Sets.newHashSet();
            final Stack<String> toVisit = new Stack<String>();
            final String base = "http://localhost:" + getPort();

            final String startPage = "/index.html";
            identifiedLinks.add(startPage);
            toVisit.push(startPage);

            int count = 0;
            // do not visit links that CONTAIN the following strings
            final Set<String> forbidden = Sets.newHashSet("delete",
                    "inventorySummaries.html");

            while (!toVisit.isEmpty()) {
                count++;
                gotoPage(toVisit.pop());
                grabLinksInPage(identifiedLinks, toVisit, base, forbidden);
            }
        }

        private void grabLinksInPage(final Set<String> identifiedLinks,
                final Stack<String> toVisit, final String base,
                final Set<String> avoid) {
            // collect every anchor element on the current page
            final List<IElement> anchors = getElementsByXPath("//a");

            for (final IElement ie : anchors) {
                final String href = ie.getAttribute("href");
                if (StringUtils.isBlank(href)) {
                    continue; // skip anchors with no href at all
                }

                // for de-duplication, optionally strip the query string
                final String linkForDup = useDiffPagesOnly && href.indexOf('?') > 0
                        ? href.substring(0, href.indexOf('?')) : href;

                // follow only links that are relative or absolute within our
                // base, and that we have not seen before
                if ((href.startsWith(base) || !href.startsWith("http://"))
                        && !identifiedLinks.contains(linkForDup)) {
                    boolean shouldInclude = true;
                    for (final String toAvoid : avoid) {
                        if (href.contains(toAvoid)) {
                            shouldInclude = false;
                            break;
                        }
                    }

                    // mark as seen even when excluded, so it is never queued again
                    identifiedLinks.add(linkForDup);
                    if (shouldInclude) {
                        toVisit.push(href);
                    }
                }
            }
        }
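         
        One thing to note: the Stack makes the crawl depth-first; using
a LinkedList as a queue instead would make it breadth-first, if the
order matters for your site.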
        
        Benoit
        
         
________________________________

        From: Jevon Wright [mailto:je...@jevon.org] 
        Sent: 21 April 2010 00:16
        To: Usage problems for JWebUnit
        Subject: Re: [JWebUnit-users] How to get all links in current
page
        
        
        Hi Benoit,
        
        Interesting question - I am sure you could do something like
that with JWebUnit (through XPath, etc.), but I imagine a piece of
software developed specifically for crawling a site would be better,
unless you want to run some unit-test checks across an entire site?
        
        Software like wget comes to mind.
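        
        For example, something like this (wget's spider mode; the host
and port here are placeholders, substitute your own) would follow every
link recursively and log what it finds:
        
            wget --spider --recursive --no-verbose http://localhost:8080/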
        
        Cheers
        Jevon
        
        
        On Tue, Apr 20, 2010 at 10:44 PM, Xhenseval, Benoit
<benoit.xhense...@credit-suisse.com> wrote:
        

                Hi All
                
                I'm new to JWebUnit.
                
                I'm trying to develop the most obvious check... A spider
that would
                follow all links within a given domain.
                
                Is there support for such a thing?
                
                If not, how could I get all links from a given page?
                
                I'm not using Selenium.
                
                Thanks a lot
                
                Benoit
                
        



        

