Re: [WSG] Strange character encoding issue

2008-11-19 Thread Nikita The Spider The Spider
On Wed, Nov 19, 2008 at 11:05 AM, James Jeffery
[EMAIL PROTECTED] wrote:
 Never had a problem with character encodings on web pages, but since I
 reinstalled the OS on my iMac I have had an issue.

 Some of my characters, especially when using ' seem to mess up. This is the
 page, content and layout are simple as it's for a uni assignment:
 http://mi-linux.wlv.ac.uk/~0802390/overview.html

 Check out the overview.html page, and notice the issues. There is one
 noticeable on the overview page: ‘SOAP’

Your HTTP header declares the encoding to be ISO-8859-1 while the HTML
(and presumably your text editor) think it is UTF-8. The HTTP header
trumps all other sources of encoding information. If you can't change
or silence this header on your server, you'll need to save your pages
as ISO-8859-1.
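If your host runs Apache and allows per-directory overrides (an assumption; check with your ISP), one line in an .htaccess file is usually enough to change or silence that header:

```apache
# Hypothetical .htaccess fragment: make Apache declare UTF-8 in the
# Content-Type header instead of ISO-8859-1
AddDefaultCharset UTF-8

# ...or stop Apache from declaring any default charset at all:
# AddDefaultCharset Off
```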

You might find this article about encoding sources interesting:
http://NikitaTheSpider.com/articles/EncodingDivination.html

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more


***
List Guidelines: http://webstandardsgroup.org/mail/guidelines.cfm
Unsubscribe: http://webstandardsgroup.org/join/unsubscribe.cfm
Help: [EMAIL PROTECTED]
***



Re: [WSG] Strange character encoding issue

2008-11-19 Thread Nikita The Spider The Spider
On Wed, Nov 19, 2008 at 11:32 AM, James Jeffery
[EMAIL PROTECTED] wrote:
 I don't own the server.

You don't need to own the server to be able to alter its behavior.
Many (most?) ISPs allow you to customize aspects of your site.

 Anyway. I saved as ISO-8859-1, and it works on windows now but not on Mac.
 Pulling my hair out at this issue.

We might be able to help if you're more specific. Windows is an OS;
what *browser* are you using? What does that browser tell you it
thinks the page's encoding is?

Some of the characters you used (like the curly quotes) might not be
in the ISO-8859-1 character set. Browser makers are aware of this, so
even if the encoding is declared as ISO-8859-1, the browser will
treat the encoding as Win-1252, which is an ISO-8859-1 superset.
Wikipedia can tell you the difference.
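A quick way to see the difference (an illustrative Python sketch, not from the original thread): the curly right quote has a slot in Win-1252 but none in ISO-8859-1:

```python
# U+2019 (right single quotation mark) is in Windows-1252 but not ISO-8859-1
ch = "\u2019"
print(ch.encode("cp1252"))  # b'\x92'
try:
    ch.encode("latin-1")  # Python's name for ISO-8859-1
except UnicodeEncodeError:
    print("not representable in ISO-8859-1")
```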


 On Wed, Nov 19, 2008 at 4:23 PM, David Dorward [EMAIL PROTECTED] wrote:

 James Jeffery wrote:
  Never had a problem with character encodings on web pages, but since I
  reinstalled the OS on my iMac I have had an issue.

 Your server says:

  Content-Type: text/html; charset=ISO-8859-1

 But the data is UTF-8.

 --
 David Dorward   http://dorward.me.uk/





-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] Question about presenting numeric percentages and accessibility.

2008-10-15 Thread Nikita The Spider The Spider
On Wed, Oct 15, 2008 at 10:01 PM, John Unsworth [EMAIL PROTECTED] wrote:
 Hi all,
 Just a quick question. I'm writing up a website for a simple brochure
 site, and the copy I'm provided with refers to something "1/3 of
 total" or "colour 2/3 of natural" and so on. It just occurred to
 me: would Number Slash Number (i.e. 1/2) cause any issue in regard
 to accessibility, be it screen readers or poor reading or math skills
 (the correct term for this eludes me for the moment; I'm thinking
 dyslexia, but not sure that correctly accounts for all potential
 users). As such I wondered if the abbr tag might be appropriate, or
 if anyone has a better, more suitable suggestion?

Why not enter the Unicode fraction characters directly? It's not easy
to enter them from the keyboard, so what I usually do is enter them
as references (e.g. &frac34; or &#8532; as Todd B. suggested) and then
copy & paste them from the browser window into my text editor. It
makes the source HTML much easier to read.
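As a sketch of what those references expand to (using Python's stdlib here purely for illustration):

```python
import html

# &frac34; and &#8532; are character references; unescape() expands
# them to the actual Unicode characters
print(html.unescape("&frac34;"))  # the single character ¾ (U+00BE)
print(html.unescape("&#8532;"))   # the single character ⅔ (U+2154)
```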

Here's an example. The section called "Encodings" has a 2/3 in it, and
there are some other fractions floating around in there too:
http://NikitaTheSpider.com/articles/ByTheNumbers/fall2008.html

HTH

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] Looking to source a JAWS version

2008-09-24 Thread Nikita The Spider The Spider
On Wed, Sep 24, 2008 at 4:13 AM, David Dorward [EMAIL PROTECTED] wrote:
 Thierry Koblentz wrote:
 http://www.freedomscientific.com/fs_downloads/jaws.asp

 I don't know about the demo version on that page, but they used to offer a
 full version that would work for 30 minutes at a time (you needed to reboot
 the computer after 30 minutes if you wanted to use it again).

 With, I believe, a license that explicitly forbids using it for the
 purposes of testing websites for screen reader compatibility.

It's true, or at least it was when I looked at the license about a
year ago. We switched to using Window-Eyes for testing for just this
reason.


-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] Encoding oddities

2008-07-13 Thread Nikita The Spider The Spider
On Sun, Jul 13, 2008 at 2:49 PM, Mordechai Peller [EMAIL PROTECTED] wrote:
 David Hucklesby wrote:

 FWIW - The META content-type is only relevant to pages read from
 a local file-- for example, when someone saves your page to disk.

 Not true. I recently had some non-local UTF-8 files where some special
 characters weren't displaying properly in IE6. When I added the missing meta
 tag, the problem was solved.

Mordechai, you're correct. The encoding declared in the META tag *can*
be relevant, although it is trumped by what (if anything) is specified
in the HTTP header. When loading pages directly from disk, obviously
there are no HTTP headers involved.

You might find this article interesting:
http://NikitaTheSpider.com/articles/EncodingDivination.html


-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] Encoding oddities

2008-07-10 Thread Nikita The Spider The Spider
On Thu, Jul 10, 2008 at 8:27 AM, Barney Carroll
[EMAIL PROTECTED] wrote:
 Hello all,

 I've got a problem with character set encoding I'd like to rectify. I use
 UTF-8 as a matter of convenience and ideology, and don't believe it should
 be that much of a problem. My editor (Notepad++) is set to create new files
 in UTF-8 without a byte order mark, but when I retrieve files from my server
 it tells me that they're ANSI.

Does "ANSI" mean US-ASCII? The most popular single-byte encodings
(ISO-8859-X, Win-1252) and UTF-8 are supersets of US-ASCII, so a
US-ASCII file is also valid UTF-8 (and ISO-8859-X and Win-1252) all at
the same time. It's pretty easy to write English-language pages that
are 100% pure US-ASCII, so this might be your situation. Notepad++ has
saved the file as UTF-8, but in this situation that doesn't look any
different from US-ASCII (i.e. "ANSI").
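That overlap is easy to demonstrate (a small Python sketch, assuming a pure-ASCII string):

```python
text = "Plain English text, no accents"

# A pure-ASCII string produces byte-for-byte identical output under all
# of these encodings, so an editor has no way to tell them apart
encodings = ["ascii", "utf-8", "latin-1", "cp1252"]
encoded = {enc: text.encode(enc) for enc in encodings}
assert len(set(encoded.values())) == 1
print("identical under", ", ".join(encodings))
```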

Here's an ASCII chart. There are a lot of things floating around out
there that claim to be "extended ASCII", "Microsoft ASCII", "Updated
ASCII", etc. None of them are official. ASCII ends at character 127,
end of story.
http://www.jimprice.com/jim-asc.shtml

Here's a list of valid charset names:
http://www.iana.org/assignments/character-sets


 I ran an automatic W3C validation of my markup just a second ago after
 making some edits and it warns me that no character set encoding was
 specified (even though the first tag in my heads is <meta
 name="content-type" content="text/html; charset=UTF-8">).

For us to figure out why that is, you'll need to share a URL with us.


-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] XHTML 1.1 CSS3 - Is it worth using right now?

2008-05-13 Thread Nikita The Spider The Spider
On Mon, May 12, 2008 at 10:57 PM, XStandard Vlad Alexander
[EMAIL PROTECTED] wrote:
 HTH wrote:
  ...server has to do content negotiation in order to send

 text/html with one doctype (HTML or XHTML 1.0) to IE users and
  application/xhtml+xml/XHTML 1.1 to everyone else. That means
  you're generating two copies of all of your content
  Assuming you are not writing static pages, you only need to generate one
 copy of content in XHTML 1.1 format and then serve it as any version of HTML
 as you like.

I'm not sure what you mean -- I understand the XHTML 1.1 part, but
what do you mean then by "serve it as any version of HTML"? Are you
talking about putting an HTML doctype on XHTML 1.1-formatted code, or
serving XHTML 1.1 with the text/html media type, or something else?


  HTH wrote:
   Furthermore, content negotiation itself is some work to
   get done correctly
  At most, maybe 10 lines of code. Please see:
  http://xhtml.com/en/content-negotiation/

My point exactly -- that code is not correct. It produces the wrong
result when presented with an Accept header of */* which is valid (see
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.1) and
indicates that the client can accept application/xhtml+xml.

The code is also wrong in that the Accept header can contain
preference indicators (q=...). It's valid for a client to indicate
that it accepts both text/html and application/xhtml+xml but prefers
the former. A straightforward substring search won't get the job done
correctly.
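For illustration, here is a minimal sketch (mine, not the code under discussion) of an Accept-header check that honors both */* and q-values; the function and variable names are hypothetical:

```python
def prefers_xhtml(accept_header):
    """Return True if the client will take application/xhtml+xml at
    least as willingly as text/html, honoring */* and q-values."""
    prefs = {}
    for part in accept_header.split(","):
        fields = part.strip().split(";")
        media = fields[0].strip()
        q = 1.0                      # q defaults to 1 per RFC 2616
        for param in fields[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs[media] = q

    xhtml_q = prefs.get("application/xhtml+xml", prefs.get("*/*", 0.0))
    html_q = prefs.get("text/html", prefs.get("*/*", 0.0))
    return xhtml_q > 0 and xhtml_q >= html_q

print(prefers_xhtml("*/*"))                                    # True
print(prefers_xhtml("text/html,application/xhtml+xml;q=0.9"))  # False
```

Note how a substring test for "application/xhtml+xml" gets both of the printed cases wrong: it misses */* entirely and ignores the q=0.9 preference.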

It's true that these are unusual cases and the consequences of getting
it wrong are minor (text/html sent instead of application/xhtml+xml).
But my point was that it is easy to make mistakes, even if you're
getting it right most of the time.

There was a recent discussion (pretty vocal, if I remember correctly)
on the W3 Validator list about the subject of content negotiation
involving people with a deeper understanding and appreciation of the
standards than me. You might find it interesting reading.

Cheers

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] XHTML 1.1 CSS3 - Is it worth using right now?

2008-05-13 Thread Nikita The Spider The Spider
On Tue, May 13, 2008 at 3:17 PM, XStandard Vlad Alexander
[EMAIL PROTECTED] wrote:
 Hi Nikita,


   Are you talking about putting an HTML doctype on
   XHTML 1.1-formatted code
  Yes, but normally you would put XHTML 1.1 markup into a template written
 for a different DOCTYPE as shown in this screen shot:

  
 http://xstandard.com/94E7EECB-E7CF-4122-A6AF-8F817AA53C78/html-layout-xhtml-content.gif

Hi Vlad,
OK, I see what you're trying to do, but the example you provided isn't
valid XHTML. If it were, the META tag would have to end in a / and
then it wouldn't be valid HTML anymore. In other words, it's a good
example of why you can't just change the doctype in order to switch
between HTML and XHTML. (In addition, the tags would have to be
lowercase if it were XHTML, but that's easy to remedy and also works
in HTML.)
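A hypothetical fragment illustrates the conflict: the two grammars require different forms of the same empty element, so no single spelling satisfies both:

```html
<!-- Valid HTML 4.01: no trailing slash on the empty element -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

<!-- Valid XHTML: the empty element must be self-closed -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
```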

The (X)HTML in the example and content negotiation code you've
suggested is probably adequate (from a practical standpoint) for many
Webmasters, but it isn't standards compliant. Given the name of this
list, that seems pretty significant.

Cheers




-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] XHTML 1.1 CSS3 - Is it worth using right now?

2008-05-13 Thread Nikita The Spider The Spider
On Tue, May 13, 2008 at 10:02 PM, XStandard Vlad Alexander
[EMAIL PROTECTED] wrote:

  Nikita wrote:
   the META tag would have to end in a / and then it
   wouldn't be valid HTML anymore.
  Sure it would. It may not be in the spec but it's a de facto standard.
 Even the W3C validator will accept it as valid HTML.

I encourage you to try that with the W3C validator. You will not get
the result you expect.



-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] XHTML 1.1 CSS3 - Is it worth using right now?

2008-05-12 Thread Nikita The Spider The Spider
On Mon, May 12, 2008 at 4:42 PM, Simon [EMAIL PROTECTED] wrote:
 Hi,

  Does anyone use XHTML 1.1

Of the doctypes that my validator Nikita saw in one sample period,
just slightly over 2% were XHTML 1.1. It's worth noting that most, if
not all, were sent with the wrong media type.

http://NikitaTheSpider.com/articles/ByTheNumbers/#doctypes

 and does it provide any benefits?

Well, compared to what? HTML 4.01 Strict, XHTML 1.0 Transitional or
XHTML 1.0 Strict?

  Is there a reason why not many sites adopt this Doctype and is there any
  point using it right now if your site is 1.0 Strict?

One big impediment to using XHTML 1.1 is that it must be sent with the
application/xhtml+xml media type which makes IE6 choke. That implies
that the server has to do content negotiation in order to send
text/html with one doctype (HTML or XHTML 1.0) to IE users and
application/xhtml+xml/XHTML 1.1 to everyone else. That means you're
generating two copies of all of your content unless you're willing to
refuse IE users. Does this sound appealing yet?

Furthermore, content negotiation itself is some work to get done
correctly, even ignoring the cost of generating two versions of
one's content.

Given the extra work required to support XHTML 1.1, there would have
to be some pretty darn compelling reasons to use it, and those reasons
just aren't there for most people. There are quite enough people who
question the use of XHTML 1.0 over HTML (I'm one of them), let alone
XHTML 1.1.

About XHTML and media types:
http://www.w3.org/TR/xhtml-media-types/#summary

HTH


-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] transitional vs. strict

2008-04-30 Thread Nikita The Spider The Spider
On Tue, Apr 29, 2008 at 11:06 PM, Hassan Schroeder
[EMAIL PROTECTED] wrote:


  One argument against the use of transitional doctypes is that they're
  now more than eight years old which makes them about half as old as
  the Web itself. Do you want to base your site on what was status quo
  half a Web lifetime ago?
 

  Uh, aren't the transitional doctypes pretty much, er, well, exactly,
  as old as their corresponding strict doctypes? :-)


True enough! I said that was a potential argument; I didn't say it was
a *good* argument. =)

In all seriousness, it sounds like the OP's boss is unconvinced by
rational arguments, so why not try some irrational ones?


-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] transitional vs. strict

2008-04-29 Thread Nikita The Spider The Spider
On Tue, Apr 29, 2008 at 2:48 PM, Andrew Maben [EMAIL PROTECTED] wrote:

 I'm finding myself having to justify my work methods to a boss who has
 almost zero interest in usability, accessibility or standards. (Though I
 have managed to get into the long-term plan: ...website that is compliant
 with W3C standards and Section 508...)

 One question that has been raised is "if site X has pages that validate as
 transitional, why do you have to produce pages that validate as strict?"

One argument against the use of transitional doctypes is that they're
now more than eight years old which makes them about half as old as
the Web itself. Do you want to base your site on what was status quo
half a Web lifetime ago?

Good luck

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] PNG file sizes

2008-04-16 Thread Nikita The Spider The Spider
On Wed, Apr 16, 2008 at 1:56 AM, Ca Phun Ung [EMAIL PROTECTED] wrote:
 Mike Brown wrote:

 
  Rachel May wrote:
 
   I created the PNGs in Photoshop (CS3) and just wondering if there are
 any better tools or ways of saving the PNGs for smaller file size, while
 still retaining their high quality??
  
 
  http://www.ignite-it.co.uk/
 
  Best. Graphics. Optimiser. PlugIn.
 
 
  I use PNGGauntlet as an after process to optimize those PNGs.

  http://brh.numbera.com/software/pnggauntlet/

  Unfortunately it only supports Windows.

The Gimp (graphics editor) has PngCrush built into its save routine.
You could also run PngCrush separately, I guess, but I never have
because I always use The Gimp.

Also, PngQuant runs on *nix boxes (and Windows?) and allows you to
reduce the palette size of PNGs. Save a PNG as indexed and then tell
PngQuant to rerender the PNG with a fixed palette size (up to 256
entries, I think) and it will dither any colors that don't fit in the
palette. You can try with different palette sizes to see what tradeoff
of size/quality works for you.

Enjoy



-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] Character Encoding Mismatch

2008-04-06 Thread Nikita The Spider The Spider
On Sun, Apr 6, 2008 at 12:11 AM, David Hucklesby [EMAIL PROTECTED] wrote:
  On Fri, Apr 4, 2008 at 4:16 PM, Kristine Cummins
   [EMAIL PROTECTED] wrote:
  
   Can someone tell me how to fix this W3C warning – I'm new to 
 understanding this part.
   http://validator.w3.org/check?uri=http%3A%2F%2Fwww.beverlywilson.com%2F
  

  On Fri, 4 Apr 2008 20:15:19 -0400, Nikita The Spider replied:
   Kristine,
   If your server is already specifying the character set (a.k.a. encoding) 
 then you don't
   need to do so in your HTML. In fact, I'd recommend against doing so, ...

  The META tag is needed when serving the page from the hard drive -
  for example, when the page is saved for viewing later. (The hard drive
  does not send HTTP headers.)

That's a good point that I should have mentioned, and I'm glad you
brought it up. However, IMO this need is often overstated. Browsers
are pretty good at guessing the encoding when they need to. I wouldn't
rely on browsers guessing correctly for public pages, but I think the
clutter of having duplicate encoding declarations usually outweighs
the benefit.

Of course, ideally one looks at one's pages using a local Web server.
I think Windows & Linux come with one preinstalled, and I know that OS
X does, so this should be within the reach of most folks.

Cheers

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] Character Encoding Mismatch

2008-04-04 Thread Nikita The Spider The Spider
On Fri, Apr 4, 2008 at 4:16 PM, Kristine Cummins
[EMAIL PROTECTED] wrote:

 Can someone tell me how to fix this W3C warning – I'm new to understanding
 this part.
  http://validator.w3.org/check?uri=http%3A%2F%2Fwww.beverlywilson.com%2F

Kristine,
If your server is already specifying the character set (a.k.a.
encoding) then you don't need to do so in your HTML. In fact, I'd
recommend against doing so, and the problem you've experienced is
exactly why. If you specify the encoding in two (or more) places, they
can get out of synch. You might *think* you're specifying ISO-8859-1
because that's what your HTML META tag says, but if the server says
something else, that's what takes priority.

It's important to understand that the encoding tells browsers (and
other user agents, like Googlebot) how to interpret non-ASCII
characters in your page. It's a common mistake to think that these are
restricted to accented characters that we generally don't use in
English, but content pasted in from Microsoft Word (for instance) is
likely to contain non-ASCII as well. In other words, you might be
using them without realizing it. If you are, and you get the encoding
wrong, then what you see as quote marks (for instance) might look like
this to others: â€
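That garbling can be reproduced exactly (an illustrative Python sketch): take the UTF-8 bytes for a curly apostrophe and misread them as Windows-1252, byte by byte:

```python
text = "It\u2019s"                  # contains a curly apostrophe, U+2019
utf8_bytes = text.encode("utf-8")   # the apostrophe becomes 0xE2 0x80 0x99
# A browser told these bytes are Windows-1252 decodes each byte separately:
print(utf8_bytes.decode("cp1252"))  # Itâ€™s
```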

Whatever tool you're using to save files should give you a choice of
which encoding/character set to use. You can use ISO-8859-1 to write
in English and most Western European languages. Since your Web server
is already identifying your pages as such, it might be a good choice.
Others have suggested UTF-8 which can represent anything under the
sun. That's great, but you'll have to find some way to cajole your
server into telling the world that your pages are UTF-8, not
ISO-8859-1. If you can't, you'll have to stick to the latter.

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] META content-lang. declared but showing up different (for one person reporting)

2008-03-20 Thread Nikita The Spider The Spider
On Thu, Mar 20, 2008 at 3:17 PM, Kristine Cummins
[EMAIL PROTECTED] wrote:
 I launched a new site a few days ago and received a report that the site is
 showing in another language and/or foreign characters even though <meta
 http-equiv="content-language" content="en" /> is declared in the HEAD.

That might be useful to search engines for deciding what language your
content is in, but I don't think any browsers pay attention to
language hints.


 The person reporting is using Safari (v. unknown) and I think on a Mac (not
 sure if this info is applicable). is an XHTML site and validated as such.

On my Mac using Safari 3.0.4 the site displays just fine.


  Thoughts appreciated - Thank you,

Without any more information, I'd guess the problem is specific to
that person's computer. It could be that the font specification of
"Palatino Linotype" is causing Safari to make some weird substitution
that's not working out, but that'd be a bug. Well, there's a bug
somewhere...

Good luck

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] Colour Blindness Statistics

2007-11-11 Thread Nikita The Spider The Spider
On Nov 11, 2007 8:33 AM, Rahul Gonsalves [EMAIL PROTECTED] wrote:
 On 10-Nov-07, at 6:33 PM, Gunlaug Sørtun wrote:

  Rahul Gonsalves wrote:
  I'm searching for first-hand, authoritative statistics on colour
  blindness, for use in a formal, academic document. Would anyone be
  able to point me in the right direction?

Hi Rahul,
I've also seen the 8% figure frequently cited but, like you, have
never been able to find a reference for it. I'd like to know the
source(s) for this figure, because I'd like to know what group was
assessed. Is that U.S. males? North American males? North American
caucasian males? (If the study was done in the USA pre-1950, it
wouldn't be unusual to find that it ignored certain ethnic groups.) I
would be surprised to learn that the 8% figure is applicable worldwide
in all populations. Furthermore, how did the author(s) of the study(s)
define color vision deficiency? As I understand it, color vision
ability is on a spectrum ranging from tetrachromacy on the high end to
complete color blindness on the low end. Where's the cutoff point on
that line that defines deficiency?

Maybe these are not the right questions to ask. I'm not a color vision
expert, or a color expert or even a vision expert. My point is that
the 8% figure is too neat to seem real, and I'll bet the reality of
the situation is more nuanced and interesting.

For anyone interested in the subject, Oliver Sacks' _Island of the
Colorblind_ is an enjoyable read.

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] Encoded mailto links

2007-10-19 Thread Nikita The Spider The Spider
On 10/19/07, Chris Knowles [EMAIL PROTECTED] wrote:
 I noticed this page also uses entity encoding. This is a solution I have
 used myself but the more I think about it the more I realise how
 ineffective it really is.

 take the following PHP code:

 // some page fetching function
 $html = fetchPage($url);

  // convert any entities in the page to plain text
  $html = html_entity_decode($html);

 now $html contains plain email addresses - with one line of code

 surely any harvester performs this operation first?

Hi Chris,
I often see the same argument about Javascript. That is, it is trivial
to embed a Javascript interpreter in an email address harvester, so
Javascript-protected email addresses are (or soon will be) vulnerable.
IMHO, arguments based on the cost of programming the harvester are
misguided.  Far more important is the CPU and memory cost of running
the harvester over the long term. For the harvester, both increased
throughput and increased intelligence imply increased addresses
harvested. But the former buys them much more simply in sheer numbers
-- most addresses are presented unprotected. And the latter (clever
harvesting, like running html_entity_decode) is only likely to harvest
the addresses of Net-savvy individuals who are the least likely (we
hope!) to respond to spam and phishing. One could even argue that a
clever harvester is counterproductive in that it will pollute its
database with the addresses of these Net-savvy individuals.

In short, I think harvesters download HTML pages and run a regex that
looks for '@' with text on either side. They care about getting as
many email addresses as possible as quickly as possible.
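To make the cost argument concrete, here's a minimal sketch in Python (rather than the PHP used above) of the two harvesting strategies. The page snippet, regex, and variable names are illustrative assumptions of mine, not taken from any real harvester:

```python
import html
import re

# Naive harvester pattern: anything shaped like user@host in the text.
ADDRESS_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Hypothetical page with an entity-encoded mailto link (&#106;&#111;&#101;
# decodes to "joe"; &#64; is "@").
page = 'Contact: <a href="mailto:&#106;&#111;&#101;&#64;example.com">mail me</a>'

# Cheap pass: scan the raw HTML. The encoded address is invisible.
raw_hits = ADDRESS_RE.findall(page)          # []

# Clever pass: decode entities first (the Python equivalent of PHP's
# html_entity_decode), then scan. One extra line of code, but it is paid
# for in CPU on every page the harvester downloads.
decoded_hits = ADDRESS_RE.findall(html.unescape(page))   # ['joe@example.com']
```

The decoding pass really is one line; the question is whether it's worth running over millions of pages for a handful of extra addresses.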

Just my $.02,

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] Encoded mailto links

2007-10-18 Thread Nikita The Spider The Spider
On 10/18/07, Anders Nawroth [EMAIL PROTECTED] wrote:
 Hi!

 Nikita The Spider The Spider skrev:
  You might be interested in an experiment I ran that compared a few
  techniques for protecting one's email address from harvesting bots.
  The short answer: entity references worked very well

 I think the time span of your study is too short.

 I have used the method you used for äcklig, with mixed decimal and
 hexadecimal numeric entities. For about a year there was no spam, but
 at around 1.5 years a little started arriving, and after 2 years there
 were 100+ spam/day.

Hej Anders,
That's very interesting, thanks for letting me know!

 So I think you just push the problem forward, which could be fine in
 some cases. But when an entity-decoding spam harvester finds the
 email address, it will get listed in the same databases as all other
 email addresses. The more traffic your site has, the less difference the
 encoding will make.

I agree. I assumed (wrongly) that the 200+ days of the study were long
enough for the address to be found by any harvesters that bothered to
decode entities. I'm not surprised to learn, however, that once the address
was exposed that it received an ever-increasing amount of spam. This
is consistent with my intuition and also what I observed.

 I think the htaccess-trick linked to by Dejan Kozina looks more
 promising. I have used this method, but abandoned it because some
 browsers wouldn't send the mailto: address to the email client. But
 that was a few years ago, so this may have changed.

This method looks promising to me too but I haven't had a chance to
test it yet.

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] Encoded mailto links

2007-10-18 Thread Nikita The Spider The Spider
On 10/18/07, Anders Nawroth [EMAIL PROTECTED] wrote:
 Ray Leventhal skrev:
  As a matter of preference, I generally try to eliminate all mailto:
  links on any site I've been asked to work on. In its place, I use a
  contact form,

 Me too :-)

 But then you get form-post spam after a while ...

 I have begun to add a random token as a hidden field to forms (which
 is then checked on the server side, ensuring that a form can only be
 posted once), stopping bots that don't actually read the form every
 time. (I actually think most of them don't ... when I change the field
 names in forms the spam form posts continue to use the old ones for a
 long time.)

Mail forms (badly coded ones at least) are also vulnerable to SMTP
header injection. Once spammers find a vulnerable form, they can use
tools like curl to POST data to your mail processing script without
even loading the form. Challenge tokens like yours are a good solution
to this problem.
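A single-use token scheme like the one Anders describes might be sketched as follows. This is a hypothetical Python illustration (the in-memory store, field names, and helper functions are my own assumptions, not Anders' code):

```python
import secrets

# Hypothetical in-memory store of issued-but-unused tokens; a real site
# would keep these in a session or a database.
issued_tokens = set()

def render_form():
    """Issue a fresh random token and embed it as a hidden field."""
    token = secrets.token_hex(16)
    issued_tokens.add(token)
    return ('<form method="post" action="contact">'
            f'<input type="hidden" name="token" value="{token}">'
            '<input type="text" name="message">'
            '</form>')

def accept_post(token):
    """Accept a POST only if its token was issued and not yet used."""
    if token in issued_tokens:
        issued_tokens.discard(token)   # consume: each token works once
        return True
    return False                       # forged, replayed, or missing
```

A bot that POSTs straight to the handler with curl, or that replays a cached copy of the form, never presents a live token and is rejected.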

FYI, I ran into a nasty problem with Safari related to a challenge
string embedded in a logon form. Funnily enough, the solution is to
change the field name, as you're doing, Anders. Here's the problem and
a solution just to save someone else some headaches. Forgive me, all,
if this is getting too far afield for this list.

I generated the challenge string server-side and stuffed it into a
hidden input field with a name and id of "challenge". When I submitted
the form I was sent to ProcessLogin.xxx. If I then hit the back button
(say because my password was rejected), Safari respected my cache
control directives and reloaded the page from the server, thus getting
a new challenge string. BUT, once the page was loaded, Safari
helpfully repopulated the input fields with the values they held prior
to form submission. This was nice in the case of the userid because it
saved me some typing, but it caused a problem with the challenge field
because Safari *overwrote it with the old challenge string*. The order
of events was as follows:
1) I hit back.
2) Safari gets a new copy of the page from the server. This creates a
new challenge string that is embedded in the "challenge" field of the
page.
3) Safari (re)populates form fields with the values they held when the
form was submitted.
4) Safari executes the page's Javascript.

What was very confusing was that even View Source showed that the
page had the new challenge string but it was the old string that was
sent over the wire when I resubmitted the form. (Thank you Wireshark.)
Because of the order of steps 3 and 4, I could have solved this problem by
(re)setting the challenge field in the onLoad event. But that might
not fix the problem if there's another browser that has the same quirk
but executes those steps in a different order.

Instead I came up with the solution of using the challenge string as
the field id/name instead of its value. (The value is simply
"challenge".) Safari's helpfulness leaves this field untouched since
it has a new name every time the page is loaded. The challenge string
still gets transmitted to ProcessLogin.xxx; I just have to find it by
looking through the form field names rather than their values.

Note that since the challenge string appears as a form control name
and form control names must begin with a letter per the HTML spec,
challenge strings have to start with a letter.
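As an illustration of that workaround, here is a Python sketch under my own naming, not the actual ProcessLogin.xxx code:

```python
import secrets
import string

def make_challenge():
    """Random challenge string that starts with a letter, because it
    will serve as a form control name and the HTML spec requires
    control names to begin with a letter."""
    return secrets.choice(string.ascii_letters) + secrets.token_hex(8)

def hidden_field(challenge):
    # The challenge is the field *name*; the value is a fixed marker.
    # A browser that restores old field values by name can't clobber a
    # field whose name changes on every page load.
    return f'<input type="hidden" name="{challenge}" value="challenge">'

def find_challenge(posted_fields):
    """Server side: recover the challenge by scanning submitted field
    names for the marker value instead of looking up a fixed name."""
    for name, value in posted_fields.items():
        if value == "challenge":
            return name
    return None
```

On submission the server scans the posted names for the one whose value is "challenge", which survives Safari's repopulation because no two page loads share a field name.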

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





Re: [WSG] Encoded mailto links

2007-10-17 Thread Nikita The Spider The Spider
On 10/17/07, Rick Lecoat [EMAIL PROTECTED] wrote:
 Hi,

 can anyone tell me what is the best accessible way (if any) of encoding
 a mailto: link? I want to make the email addresses on a site usable to
 screen reader users, but don't want them harvested by spambots.

Hi Rick,
You might be interested in an experiment I ran that compared a few
techniques for protecting one's email address from harvesting bots.
The short answer: entity references worked very well and do not need
Javascript to work. I am not an accessibility guru so I can't say how
friendly that is to screen readers and the like, but I reckon you have
plenty of people on this list who can inform you on that score.

http://NikitaTheSpider.com/articles/IngenReklamTack.html
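As a rough illustration of the entity-reference approach, a hypothetical encoder that mixes decimal and hexadecimal numeric character references might look like this (a sketch of mine, not the code used in the experiment):

```python
import random

def entity_encode(text, seed=None):
    """Encode every character as a numeric character reference,
    randomly mixing decimal (&#NNN;) and hexadecimal (&#xNN;) forms."""
    rng = random.Random(seed)
    return "".join(
        f"&#{ord(ch)};" if rng.random() < 0.5 else f"&#x{ord(ch):x};"
        for ch in text
    )

# Hypothetical address; browsers decode the references for the user,
# but a naive regex scanning the raw source never sees a literal "@".
address = "joe@example.com"
link = (f'<a href="mailto:{entity_encode(address)}">'
        f'{entity_encode(address)}</a>')
```

Because the markup is decoded by the browser itself, the link works with Javascript disabled, which is part of why the technique did well in the experiment.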


Hope this helps

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more

