wget www.theuniquepear.com saves the welcome.do page, so it seems to
work.

Btw, I would suggest you change your mapping from .do to .html, or
switch from extension mapping to path mapping: /unique/do/welcome
instead of /unique/welcome.do. For better indexing, replace 'do' with
something meaningful: "decor", "clocks", "lamps" - whatever is most
important on your page and what you want to be found under. Maybe a
combination of them; up to 5 'subfolders' are evaluated by Google
(rumors).
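As for the redirect itself, here is a minimal Python sketch of why plain
curl prints nothing while curl -L (and wget) end up on welcome.do. It
assumes the JSP redirect is an ordinary HTTP 302 with an empty body,
which I have not checked against your server. StubSite, NoRedirect,
fetch and demo are names made up for this illustration, and the stub
runs on localhost rather than touching the real site.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubSite(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the site: / redirects to /unique/welcome.do."""

    def do_GET(self):
        if self.path == "/":
            # Assumption: the JSP redirect is a plain 302 with no body.
            self.send_response(302)
            self.send_header("Location", "/unique/welcome.do")
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.path == "/unique/welcome.do":
            body = b"<html><body>welcome</body></html>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects, like curl without -L."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch(url, follow):
    """Return (status, final_url, body), with or without following redirects."""
    handlers = [urllib.request.ProxyHandler({})]  # ignore any proxy settings
    if not follow:
        handlers.append(NoRedirect)
    opener = urllib.request.build_opener(*handlers)
    try:
        with opener.open(url, timeout=5) as resp:
            return resp.status, resp.geturl(), resp.read()
    except urllib.error.HTTPError as err:
        # With redirects disabled, the 302 itself surfaces as an HTTPError.
        return err.code, url, err.read()


def demo():
    """Fetch / from the local stub twice: first like curl, then like curl -L."""
    server = HTTPServer(("127.0.0.1", 0), StubSite)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d/" % server.server_address[1]
    try:
        return fetch(base, follow=False), fetch(base, follow=True)
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() gives a 302 with an empty body for the first fetch and a
200 at /unique/welcome.do for the second. Whether a given robot follows
the redirect is up to the robot; the sketch only shows the two client
behaviours.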
regards
leon

P.S. You may simply want to add n different mappings, one for each
keyword, but beware of delivering identical content under different
URLs. This would be considered spam and you'll be thrown out of the
index.

On 2/17/06, Scott Purcell <[EMAIL PROTECTED]> wrote:
> I started the below thread last weekend, and upon suggestions, I have
> changed some javascript redirects to get to my site, into some JSP
> redirects, based upon user input earlier this week.
>
> In a nutshell, I am trying to make sure that robots can index my web
> site.
> My web site is a struts application, and is the default app. The way the
> site is configured, it is the root app, and I configured the root app to
> use welcome-file as a .jsp. So when the user hits the url
> <www.theuniquepear.com> it goes to a jsp page, which then does a jsp
> redirect to the www.theuniquepear.com/unique/welcome.do the way struts
> is set up then finally to the jsp via the action.
>
> Due to my lack of robot understanding, if I use curl now, and just issue
>
> curl www.theuniquepear.com it shows nothing, and does not do the
> redirect.
> But if I hit curl -L www.theuniquepear.com all is good and it is what I
> want the robots to read.
>
> I made the change last Monday or so, and each day I check my access log
> and the only entry I see is the robots come in and get a 500 and they
> are gone. When I google for my site, nothing shows up.
>
> Does anyone know if the robots follow the links like the curl -L or does
> it just use something like curl and never indexes my site? Also, what is
> really silly is that even this email will probably be found when I type
> in my url. Currently if one types in 'the unique pear' in google, I see
> all the threads I start for this subject, but the site is never to be
> found ... not good for business.
>
> Any input would be appreciated.
>
> Thanks,
>
> -----Original Message-----
> From: Mike Sabroff [mailto:[EMAIL PROTECTED]
> Sent: Saturday, February 11, 2006 11:09 AM
> To: Tomcat Users List
> Subject: Re: Robots cannot read JSP?
>
> Scott,
> Your assessment is incorrect! First off, curl doesn't read html pages,
> it does a get or post to a url just as though you clicked it in your
> browser (and a lot of other things you can do with curl). Second off, it
> is not the jsp that is the problem, it is the javascript as Tim said,
> and the lack of links.
>
> Mike
>
> David Smith wrote:
> > I doubt the problem is with curl not being able to read files other
> > than .htm or .html. The problem is only browsers execute javascript.
> > Think of curl or the search engines as a browser without javascript
> > enabled. What would you get in IE or Firefox if you disabled
> > javascript?
> >
> > -- David
> >
> > Scott Purcell wrote:
> >> Tim,
> >> Thanks a lot for the info. I got to thinking, and tried invoking curl
> >> from my box on the url, and see exactly what you saw. The js screwing
> >> things up.
> >>
> >> So I decided to run curl on different pages, and I came to the
> >> conclusion that only htm, or html pages show up via curl?
> >>
> >> Does anyone think that the robots are just like curl, and that they
> >> can only read HTML files?
> >>
> >> Thanks for all, I know this is a bit off topic ...and I hope I don't
> >> hack anyone off.
> >>
> >> Thanks
> >> Scott
> >>
> >> -----Original Message-----
> >> From: Tim Funk [mailto:[EMAIL PROTECTED]
> >> Sent: Friday, February 10, 2006 8:50 PM
> >> To: Tomcat Users List
> >> Subject: Re: Access log to see where robots go.
> >>
> >> The problem is your home page, not robots.txt.
> >> When / is requested - the following is served back, notice the
> >> javascript redirect: (the full file is below)
> >>
> >> ----
> >> function invokeWebApp() {
> >>     top.location.href =
> >>         "http://www.theuniquepear.com/unique/index.jsp";
> >> }
> >> ----
> >> Search engines do not execute javascript and there are no links on
> >> the page, so search engines have nowhere to go. (Except someone
> >> else's site).
> >>
> >> As much as I detest SEO companies, you might find it helpful to
> >> search for one for some assistance.
> >>
> >> <html>
> >> <head>
> >> <head>
> >> <title>The Unique Pear | Unique Home Decor & Accessories</title>
> >> <meta name="description" content="The Unique Pear is an online
> >>       boutique specializing in home decor & accessories. Products
> >>       include clocks, candles, wall decor, garden, lighting, bath
> >>       and more.">
> >> <meta name="keywords" content="The Unique Pear Timework clocks,
> >>       lamps, lamp shades, candles, aroma, aroma difuser, wall
> >>       decor, wall scounces, wrought iron, pitchers, bookstands,
> >>       jaqua bath products, candleholders">
> >> <meta name="description" content="">
> >> <meta name="keywords" content="">
> >> </head>
> >> <body bgcolor="#FFFFFF">
> >>
> >> <script language = "javascript">
> >> //<!--
> >> function invokeWebApp() {
> >>     top.location.href =
> >>         "http://www.theuniquepear.com/unique/index.jsp";
> >> }
> >> invokeWebApp();
> >> // -->
> >> </script>
> >>
> >> hello
> >> </body>
> >> </html>
> >>
> >> -Tim
> >>
> >> Scott Purcell wrote:
> >>> I have had trouble getting search engines to see my site. I built it
> >>> with struts, and use some tags from the index.html page to get
> >>> business logic, to finally get to my page. The url is
> >>> http://www.theuniquepear.com
> >>>
> >>> Anyway, upon talking to some co-workers, they suggested I watch my
> >>> access log, so I can see what files they are indexing.
> >>> I thought I had the access log turned on for the site, and see when
> >>> someone hits my web site, but as far as the searchbots go, I only
> >>> see this in my logs daily.
> >>>
> >>> $ cat localhost_access_log.2006-02-07.txt | less
> >>> 67.15.16.30 - - [07/Feb/2006:03:44:55 -0600] "GET /robots.txt HTTP/1.0" 404 985
> >>> 67.15.16.30 - - [07/Feb/2006:03:46:21 -0600] "GET / HTTP/1.0" 200 844
> >>> 67.15.16.30 - - [07/Feb/2006:03:51:57 -0600] "GET /robots.txt HTTP/1.0" 404 985
> >>> 62.114.208.233 - - [07/Feb/2006:03:52:42 -0600] "GET /unique/welcome.do?OVRAW=home%20decorating%20ideas&OVKEY=home
> >>> 62.114.208.233 - - [07/Feb/2006:03:52:44 -0600] "GET /unique/includes/siteWide.css HTTP/1.1" 200 15402
> >>> 62.114.208.233 - - [07/Feb/2006:03:52:44 -0600] "GET /unique/images/header_pear.jpg HTTP/1.1" 200 11227
> >>>
> >>> I see the entry for robots.txt, but I have no idea where they are
> >>> going, or what they are doing.
> >>>
> >>> I turned on access log like this in the server.xml like so:
> >>> <Valve className="org.apache.catalina.valves.AccessLogValve"
> >>>        directory="logs" prefix="localhost_access_log."
> >>>        suffix=".txt"
> >>>        pattern="common" resolveHosts="false"/>
> >>>
> >>> And that is a snippet of the log from above.
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [EMAIL PROTECTED]
> >> For additional commands, e-mail: [EMAIL PROTECTED]
>
> --
> Mike Sabroff
> Web Services Developer
> [EMAIL PROTECTED]
> 920-568-8379
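
Tim's point in the quoted thread (no javascript, no links, nowhere to
go) can be sketched roughly like this in Python. It is only an
illustration of a client that parses HTML without running scripts, not
how any particular search engine works. LinkCollector, find_links and
the LINKED_PAGE markup (including its link text) are made up for the
example; JS_ONLY_PAGE is a trimmed copy of the page Tim quoted.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href targets of <a> tags; script bodies are never executed."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_links(html):
    """Return every <a href> target found in the given HTML text."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Trimmed version of the home page quoted above: a javascript redirect only.
JS_ONLY_PAGE = """<html><head><title>The Unique Pear</title></head>
<body>
<script language="javascript">
function invokeWebApp() {
    top.location.href = "http://www.theuniquepear.com/unique/index.jsp";
}
invokeWebApp();
</script>
hello
</body></html>"""

# Hypothetical alternative: the same page with one ordinary link on it.
LINKED_PAGE = """<html><head><title>The Unique Pear</title></head>
<body>
<a href="/unique/welcome.do">Enter the shop</a>
</body></html>"""
```

find_links(JS_ONLY_PAGE) comes back empty, while find_links(LINKED_PAGE)
returns the one welcome.do target, which is the difference a plain link
on the home page would make to a client that ignores scripts.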