The problem is that Google can't tell what is important and what isn't, because the older content has just as many links pointing to it as the new stuff, if not more. Just getting rid of the old content is going to break links on the web, which may or may not be a bad thing. Here are a few things that might help:
1. Put a noindex tag on the pages that you don't want to see in the search results any more:
   <meta name="robots" content="noindex" />
2. Specify a canonical URL for the old pages. This would suggest that Google use the new page instead of the old ones in the results.
3. More links to current documentation. If you have a website that is pointing at older documentation, updating the links (where appropriate) would help.
4. A few deeper links from the front page. For example, if tapestry.apache.org had a link to http://tapestry.apache.org/current/tapestry-core/ref/ it would help boost the current component reference in the search results.

Mark

On Sun, May 29, 2011 at 1:11 AM, Howard Lewis Ship <[email protected]> wrote:
>
> Maybe we should just replace the old web sites with .htaccess files
> that redirect back to the main page, http://tapestry.apache.org/
>
> On Sat, May 28, 2011 at 8:44 PM, Kalle Korhonen
> <[email protected]> wrote:
> > Ah, yes I see. Agree on all you said, and we definitely want to get
> > that custom search into the template as well. At least to me it seems
> > that the Maven based documentation for 5.x get most of the hits. I
> > don't think we need to worry about 3.x documentation too much. I'm
> > afraid that bulk edit with specific content links may not work very
> > well as the documentation structure has changed. Simply adding the
> > same link to the (root of) latest documentation on every existing page
> > might increase the visibility of the wiki-based documentation in the
> > search rankings though. I'm not a PMC member, but personally, I'd just
> > give you commit rights to make this simpler. Any PMC member want to
> > propose Bob as committer?
> >
> > Kalle
> >
> > On Sat, May 28, 2011 at 4:25 AM, Bob Harner <[email protected]> wrote:
> >> Well, I did create http://tapestry.apache.org/search.html a few months
> >> back, and it works much better than the general Google search. We
> >> still need to figure out how to integrate it or something like it into
> >> the site. That involves working with the template that I don't have
> >> write-access to.
> >>
> >> Anyway, most people will still tend to use the standard Google search.
> >>
> >> On Fri, May 27, 2011 at 10:43 PM, Kalle Korhonen
> >> <[email protected]> wrote:
> >>> Agree, I'll help. I think one decent solution is a Google Custom
> >>> Search. There was a previous effort underway, but I don't know what
> >>> happened to it. If we could just properly search our own
> >>> documentation, that would already be a huge improvement.
> >>>
> >>> Kalle
> >>>
> >>> On Fri, May 27, 2011 at 6:52 PM, Bob Harner <[email protected]> wrote:
> >>>> Most of the time when I use Google to search for Tapestry topics, the
> >>>> results are truly bad, because they are obscured by outdated
> >>>> documentation for Tapestry 4 and older versions of Tapestry 5. This
> >>>> makes Tapestry documentation seem much worse than it really is. (I
> >>>> happen to think the newer stuff is pretty good.)
> >>>>
> >>>> The root problem is that Tapestry's long history of documentation
> >>>> versions makes it hard for Google to tell which version is the best.
> >>>> For example, searching for "tapestry component parameters" (without
> >>>> quotes) results in:
> >>>>
> >>>> 1) http://tapestry.apache.org/tapestry5/guide/parameters.html
> >>>> 2) http://tapestry.apache.org/tapestry4/UsersGuide/components.html
> >>>> 3) http://tapestry.apache.org/tapestry5.1/guide/coercion.html
> >>>> 4) http://tapestry.formos.com/nightly/tapestry5/tapestry-component-report/
> >>>>
> >>>> ...and hundreds of other links that are relevant but sub-optimal.
> >>>>
> >>>> The true best page is really
> >>>> http://tapestry.apache.org/component-parameters.html -- but I couldn't
> >>>> find that page in any of the top 200 results. And other search terms
> >>>> are similarly disappointing.
> >>>>
> >>>> What's the solution? I propose doing the following:
> >>>>
> >>>> 1) Bulk edit or republish old 3.x and 4.x documentation pages to add a
> >>>> prominent banner added at the top pointing to the corresponding page
> >>>> in the newest documentation. The old content would remain in the
> >>>> pages.
> >>>>
> >>>> 2) Bulk edit or republish old 5.x documentation with all text REMOVED
> >>>> and a prominent banner added at the top pointing to the corresponding
> >>>> page in the newest documentation.
> >>>>
> >>>> 3) Finding a way to tell Google what older pages are "archived" and
> >>>> "low priority" and what new ones are "high priority". I guess a
> >>>> Sitemap
> >>>> (http://www.google.com/support/webmasters/bin/answer.py?answer=183668)
> >>>> can do that.
> >>>>
> >>>> I'm willing to work on these, though ultimately I'll need a
> >>>> committer's assistance for #1 and #2.
> >>>>
> >>>> What do you all think? Any other ideas?
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: [email protected]
> >>>> For additional commands, e-mail: [email protected]
>
> --
> Howard M. Lewis Ship
>
> Creator of Apache Tapestry
>
> The source for Tapestry training, mentoring and support. Contact me to
> learn how I can get you up and productive in Tapestry fast!
>
> (971) 678-5210
> http://howardlewisship.com
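
To make suggestions 1 and 2 at the top of this message a bit more concrete, here is a minimal Python sketch of how the tag for each old page might be chosen. It is only an illustration: the CURRENT_PAGE mapping and the head_tag_for function are hypothetical names, the pairing of old and new URLs is an assumption (only the URLs themselves come from the thread), and the rule "canonical if a replacement is known, otherwise noindex" is one possible policy, not something the project has agreed on.

```python
from html import escape

# Hypothetical mapping from an outdated page to its current equivalent.
# The pairing is illustrative; only the URLs themselves come from the thread.
CURRENT_PAGE = {
    "http://tapestry.apache.org/tapestry5/guide/parameters.html":
        "http://tapestry.apache.org/component-parameters.html",
}


def head_tag_for(old_url: str) -> str:
    """Return the tag to add inside <head> of an outdated page."""
    new_url = CURRENT_PAGE.get(old_url)
    if new_url is None:
        # No known replacement: ask crawlers to drop the page (suggestion 1).
        return '<meta name="robots" content="noindex" />'
    # Known replacement: suggest it to search engines (suggestion 2).
    return f'<link rel="canonical" href="{escape(new_url, quote=True)}" />'
```

Emitting one tag or the other, rather than both, is deliberate in this sketch, since the two suggestions ask the crawler for different things.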
