RE: [WSG] [OT] Google search/index/webmaster help
-Original Message- From: li...@webstandardsgroup.org [mailto:li...@webstandardsgroup.org] On Behalf Of Philippe Wittenbergh Sent: 01 November 2009 23:05 To: wsg@webstandardsgroup.org Subject: Re: [WSG] [OT] Google search/index/webmaster help Because that file is being served as 'text/html' instead of 'text/xml' as it should. That is server misconfiguration. I agree that this is technically incorrect, but hardly unusual. I'm not surprised Googlebot doesn't pick it up. I would be absolutely flabbergasted if Google ignored it purely because of that. Google has a strong history of being pragmatic; they _want_ to use this file; why would you expect them to ignore a file with the right name, the right kind of content, in the right place? As an aside, how many robots.txt do you think get served up as text/html ? Regards, Mike Mike Brockington Web Development Specialist www.calcResult.com www.stephanieBlakey.me.uk www.edinburgh.gov.uk This message does not reflect the opinions of any entity other than the author alone. *** List Guidelines: http://webstandardsgroup.org/mail/guidelines.cfm Unsubscribe: http://webstandardsgroup.org/join/unsubscribe.cfm Help: memberh...@webstandardsgroup.org ***
Re: [WSG] [OT] Google search/index/webmaster help
On Nov 2, 2009, at 6:58 AM, Adam Smith wrote: Swami; I'll hazard a guess here and assume you're using Firefox; and you've done what I did and gone to http://maps.unimelb.edu.au/sitemap.xml, seen a mass of text on screen Because that file is being served as 'text/html' instead of 'text/xml' as it should. That is server misconfiguration. I'm not surprised Googlebot doesn't pick it up. Safari shows the same issue. Philippe --- Philippe Wittenbergh http://l-c-n.com/
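One way to test for the misconfiguration Philippe describes is to look at the Content-Type header the server sends with the file. What follows is only a minimal Python sketch, assuming the header value has already been fetched; the function names and the set of accepted XML types are illustrative and do not come from the thread.

```python
# Media types treated as XML here. This set is an assumption made for the
# sketch, not a statement of what any particular crawler accepts.
XML_TYPES = {"text/xml", "application/xml"}


def media_type(content_type_header: str) -> str:
    """Return the bare media type, dropping parameters such as charset."""
    return content_type_header.split(";", 1)[0].strip().lower()


def served_as_xml(content_type_header: str) -> bool:
    """True if the header value names one of the XML media types above."""
    return media_type(content_type_header) in XML_TYPES


if __name__ == "__main__":
    for header in ("text/html", "text/xml; charset=utf-8"):
        print(header, "->", served_as_xml(header))
```

The header value itself can be read with the standard library, for example via `urllib.request.urlopen(url).headers.get("Content-Type")`, or with any tool that prints response headers.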
Re: [WSG] [OT] Google search/index/webmaster help
Because that file is being served as 'text/html' instead of 'text/xml' as it should. That is server misconfiguration. I'm not surprised Googlebot doesn't pick it up. Yes, quite right. Unfortunately, I don't think I can get the CMS to serve it correctly as XML; however, Google digested it happily enough - it just failed to spider all the URLs, something which I now know is normal. The only puzzle I still have is with the search results in our Custom Search Engine (still off topic!): why would the public search return a different number of results from the custom search? I have to admit, after 5 months of no change, this week it's gone from 1 result to 21 - go figure! Thanks again to all who replied. It's just reinforced to me that if you want an internal search engine that really works and is controllable, leaving it up to the magic donkeys at Google is really not an option. Still trying to convince our fine institution of that ;-) -- Andrew Harris and...@woowoowoo.com http://www.woowoowoo.com ~~~ * ~~~
Re: [WSG] [OT] Google search/index/webmaster help
How I love this community! I haven't solved my problems yet, but based on the comments and ideas I've gathered in the past few days, the site has improved substantially. This latest comment from Philippe... On Mon, Nov 2, 2009 at 10:04 AM, Philippe Wittenbergh e...@l-c-n.com wrote: Because that file is being served as 'text/html' instead of 'text/xml' as it should. That is server misconfiguration. I dismissed at first, thinking our CMS wouldn't allow me to tweak such fundamental settings, but it led me into the bowels of the support forums where I dredged up the little slice of code I needed. Now, the sitemap.xml as well as the kml and gpx feeds are all served correctly as text/xml - did I say how I love this community? - and I've a grudging respect for MySource Matrix too! -- Andrew Harris and...@woowoowoo.com http://www.woowoowoo.com ~~~ * ~~~
Re: [WSG] [OT] Google search/index/webmaster help
Sorry Hassan! It would seem it's been changed. Andrew's been beavering away, as one does. His original XML file I downloaded from the same URI as you did: http://maps.unimelb.edu.au/sitemap.xml, right onto my desktop (it's still there), and I believe I probably did that a number of hours before you looked. Sorry for the confusion. Keep breathing. Swami :) www.blueskyzen.com/design On Sat, Oct 31, 2009 at 4:35 AM, Hassan Schroeder has...@webtuitive.com wrote: Swami Neelamber wrote: I'm not totally sure about that <html><head> you've used top and bottom of your *sitemap.xml* file? Don't know what you're looking at but there are no such tags in the document at http://maps.unimelb.edu.au/sitemap.xml -- Hassan Schroeder - has...@webtuitive.com webtuitive design === (+1) 408-621-3445 === http://webtuitive.com twitter: @hassan dream. code.
Re: [WSG] [OT] Google search/index/webmaster help
ahh - no. I did change some stuff on the site, but not the xml file - I suspect whatever you were looking at it with the first time had to put the html tags around it just to make sense of it. On Sat, Oct 31, 2009 at 10:00 AM, Swami Neelamber neelam...@gmail.com wrote: Sorry Hassan! It would seem it's been changed. Andrew's been beavering away, as one does. His original XML file I downloaded from the same URI as you did: http://maps.unimelb.edu.au/sitemap.xml, right onto my desktop (it's still there), and I believe I probably did that a number of hours before you looked. Sorry for the confusion. Keep breathing. Swami :) www.blueskyzen.com/design On Sat, Oct 31, 2009 at 4:35 AM, Hassan Schroeder has...@webtuitive.com wrote: Swami Neelamber wrote: I'm not totally sure about that <html><head> you've used top and bottom of your *sitemap.xml* file? Don't know what you're looking at but there are no such tags in the document at http://maps.unimelb.edu.au/sitemap.xml -- Hassan Schroeder - has...@webtuitive.com webtuitive design === (+1) 408-621-3445 === http://webtuitive.com twitter: @hassan dream. code. -- Andrew Harris and...@woowoowoo.com http://www.woowoowoo.com ~~~ * ~~~
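The question of whether a saved copy is the bare sitemap or something a browser has wrapped in HTML can be settled by parsing the file and looking at its root element. A rough Python sketch follows, assuming the saved file has been read into a string; the helper name `looks_like_sitemap` is made up for illustration, and the namespace is the one given on sitemaps.org.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def looks_like_sitemap(document: str) -> bool:
    """True if the text parses as XML and its root is a sitemap <urlset>."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        # Not well-formed XML at all, e.g. an HTML page with unclosed tags.
        return False
    # ElementTree reports namespaced tags as "{namespace}localname".
    return root.tag == f"{{{SITEMAP_NS}}}urlset"
```

A copy that a browser has wrapped in its own markup would report `html` as the root and fail this check, while the file as served should pass it.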
Re: [WSG] [OT] Google search/index/webmaster help
Craig, OK - that's a really interesting comment. I had, as far as I knew, used the right formatting, the sitemap validates as XML and Google's webmaster tools accepted it as a valid feed (after a few tweaks!) I followed this document, which I understand is the definitive source. http://www.sitemaps.org/protocol.php and my sitemap looks pretty much like that - apart from a couple of whitespace discrepancies. The fact that it worked for some of the URLs makes me think it's not a problem with the sitemap, but it's all interesting stuff. Thanks for taking the time to reply. On Fri, Oct 30, 2009 at 3:24 PM, Craig Jones cr...@designawebnoosa.com.au wrote: Hi Andrew, This is my first time trying to help... It doesn't appear that your sitemap is written in xml The sitemap should look like this <urlset xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"> <url> <loc>www.unimelb.edu.au/campuses/maps.html</loc> </url> <url> <loc>www.unimelb.edu.au/campuses/maps2.html</loc> </url> </urlset> then submit your new sitemap in google webmaster tools Good luck Craig Andrew Harris wrote: Yes, I know it's off topic, but I really need a hand with a mystifying problem. I've tried the google forums, but have received no replies. If there are any listers who understand the free Google Custom Search Engine, webmaster tools, sitemaps and indexing problems, then I'd really appreciate you contacting me directly in regards to some problems I'm having. Just to give an idea of my quandary... http://maps.unimelb.edu.au/ http://maps.unimelb.edu.au/sitemap.xml (100+ URLs submitted 5 months ago) http://www.google.com/search?q=site:maps.unimelb.edu.au (34 results = pathetic!) http://go.unimelb.edu.au/6t6 (1 result = totally pathetic!)
Hopefully, it's nothing completely bleeding obvious that will humiliate me in front of my peers ;-) -- Andrew Harris and...@woowoowoo.com http://www.woowoowoo.com ~~~ * ~~~
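For comparison with the structure Craig outlines, here is a small Python sketch that builds a sitemap in the sitemaps.org namespace and reads the locations back out. It is an illustration only: the function names and the example.com URLs are placeholders, not anything used on the site under discussion. Note that the protocol document asks for each loc value to be a full URL beginning with the protocol (such as http).

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap document (as a string) with one <url><loc> per URL."""
    # Serialise the sitemap namespace as the default one (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc_el = ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc")
        loc_el.text = url
    return ET.tostring(urlset, encoding="unicode")


def list_locations(sitemap_xml):
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text for loc in root.iter(f"{{{SITEMAP_NS}}}loc")]


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    pages = ["http://www.example.com/maps.html", "http://www.example.com/maps2.html"]
    print(build_sitemap(pages))
```

Counting the result of `list_locations` on a real sitemap gives the number of URLs submitted, which is the figure to compare against what a site: search reports as indexed.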
Re: [WSG] [OT] Google search/index/webmaster help
Bother! That last reply was supposed to be off list! Oh well, the discussion had got around to web standards by that point, so it's fair game. ...and it's a Friday afternoon, cut me some slack!! Thanks to all those who have replied off list. By way of reporting back to the list, I'll say... 1) Sitemaps are not the magic fix I thought they were. 2) Inbound links and organic indexing are vital. 3) My map pages are pretty short on text - Google likes text. One thing that no-one picked up on was that I still haven't inserted some common metadata tags - I know they say Google doesn't look at the metadata tags, but it makes me wonder. Funny how asking your peers to check your work suddenly makes you aware of basic things you'd missed... yes, my pages weren't valid - but they are now!!! ;-p -- Andrew Harris and...@woowoowoo.com http://www.woowoowoo.com ~~~ * ~~~