>> What happened to:
>> <span class="money"><abbr class="currency" title="USD">$</abbr><span 
>> class="amount">5.99</span></span> 

I brought up the issue of the markup being large and complex to implement, and 
so we were discussing suggestions about how to potentially streamline the markup.

-Mike

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Stephen Paul 
Weber
Sent: Wednesday, October 18, 2006 7:55 PM
To: Microformats Discuss
Subject: Re: title attribute and abbreviated class names (Was: [uf-discuss] 
Currency Quickpoll: Preliminary results)

On 10/18/06, Mike Schinkel <[EMAIL PROTECTED]> wrote:
> >> <span class="money" title="USD">$5.99</span>
> >>
> >> I still think this is bad semantics.  I don't think "USD" is really a
> >> title for "$5.99".
>
> I'll accept that.
>
> >> I'd propose this as an alternative:
> >> <abbr class="currency" title="USD">$</abbr>5.99

What happened to:

<span class="money"><abbr class="currency" title="USD">$</abbr><span 
class="amount">5.99</span></span>

Does that solve the whole problem and give us some extra usefulness at the 
same time? (Sorry for leaving the discussion and then just jumping back in 
again.  Ignore me if I make no sense.)
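
To illustrate what the fully nested form buys a consumer, here is a minimal JavaScript sketch that reads the currency code and the amount straight out of the markup, with no inference about which characters are the symbol. The function name parseExplicitMoney is hypothetical, and matching the HTML string with regular expressions is only an assumption to keep the example self-contained; a real parser would walk the DOM.

```javascript
// Hypothetical helper: extracts the currency code and amount from the
// explicit money/currency/amount markup shown above.
// Assumption: the attributes appear in the order class, then title.
function parseExplicitMoney(html) {
  // The title of the currency <abbr> carries the currency code (e.g. "USD").
  const currency = /<abbr\s+class="currency"\s+title="([^"]+)"/.exec(html);
  // The text inside the amount <span> is the number as published.
  const amount = /<span\s+class="amount">([^<]+)<\/span>/.exec(html);
  if (!currency || !amount) return null; // not the explicit form
  return { currency: currency[1], amount: amount[1].trim() };
}

const sample =
  '<span class="money"><abbr class="currency" title="USD">$</abbr><span \n' +
  'class="amount">5.99</span></span>';
console.log(parseExplicitMoney(sample));
```

Both values come from explicitly marked elements, so nothing depends on the visible symbol or on where the number sits relative to it.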


>
> Okay... But is it a good idea to have a microformat as a prefix/suffix 
> instead of as a container? (general question - I hope it hasn't been 
> answered before...)
>
> If so, you'll also need (note the space after 35.66):
>
>         35.66 <abbr class="currency" title="DKK">kr</abbr>
>
> However, at the risk of being shot for heresy, has anyone considered allowing 
> this?
>
>         <abbr class="currency usd">$5.99</abbr>
>         <abbr class="currency dkk">35.66 kr</abbr>
>
> OR (something tells me this is even worse, but...):
>
>         <abbr class="money currency-usd">$5.99</abbr>
>         <abbr class="money currency-dkk">35.66 kr</abbr>
>
> I'm sure there is something just so wrong about this, but part of the reason 
> I'm on this list is to learn. So why not?
> Additionally, that would allow:
>
>         <abbr class="currency usd" title="5.99">Five Dollars and 99 cents</abbr>
>         <abbr class="currency dkk" title="35.66">Thirty Five point 66 Kroners</abbr>
>
> OR (for orthogonality):
>
>         <abbr class="money currency-usd" title="5.99">Five Dollars and 99 cents</abbr>
>         <abbr class="money currency-dkk" title="35.66">Thirty Five point 66 Kroners</abbr>
>
> Just a thought...?
>
> -Mike
> P.S. *** I wish HTML had allowed "rel" for all tags including <span> 
> and <abbr>.  Or that we could just use it anyway and not get shot for 
> heresy. :)
>
>
> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Scott Reynen
> Sent: Tuesday, October 17, 2006 10:30 AM
> To: Microformats Discuss
> Subject: Re: title attribute and abbreviated class names
> (Was: [uf-discuss] Currency Quickpoll: Preliminary results)
>
> I've started replying to this a few times and become stuck in trying to fit 
> what I'm trying to say in the existing thread, so I'm just going to make some 
> points completely detached from the thread.
>
> First, I think Mike is right that the vast majority of published money 
> formats allow parsers to infer the distinction between the currency symbol 
> and the amount.  But this inference is already possible without a 
> microformat.  What's missing currently is:
>
> 1) an indication of which specific currency the symbol refers to.
> 2) the ability to markup money that doesn't fit this pattern
>
> I think it's best to either cover #1 or both, but I think it's too 
> complicated for publishers to provide what amounts to two distinct 
> microformats depending on a relatively complex pattern definition.
> That is, if we're going simple (only #1), I think we should go only simple, 
> and add the complex form to cover #2 later.
>
> So to cover #1, Mike has suggested:
>
> <span class="money" title="USD">$5.99</span>
>
> I still think this is bad semantics.  I don't think "USD" is really a title 
> for "$5.99".  I'd propose this as an alternative:
>
> <abbr class="currency" title="USD">$</abbr>5.99
>
> That is, markup the currency as currency, and treat any adjacent numbers as 
> the amount.
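
One possible reading of the rule above ("treat any adjacent numbers as the amount"), as a minimal JavaScript sketch. The function name parseAdjacentAmount and the exact number pattern are assumptions for illustration only, not anything agreed on this list, and a real parser would walk the DOM rather than pattern-match the HTML string.

```javascript
// Assumed number pattern: digits, optionally grouped by "." or ",".
const NUMBER = "[0-9]+(?:[.,][0-9]+)*";

// An <abbr class="currency" title="..."> with an optional number
// immediately before or after it (whitespace allowed in between).
const ADJACENT = new RegExp(
  "(" + NUMBER + ")?\\s*" +
  '<abbr\\s+class="currency"\\s+title="([^"]+)">[^<]*</abbr>' +
  "\\s*(" + NUMBER + ")?"
);

// Hypothetical helper: returns the currency code and the adjacent amount,
// or null if no currency element with an adjacent number is found.
function parseAdjacentAmount(html) {
  const m = ADJACENT.exec(html);
  if (m === null) return null;
  const amount = m[3] || m[1]; // prefer the number that follows the symbol
  return amount ? { currency: m[2], amount: amount } : null;
}

console.log(parseAdjacentAmount('<abbr class="currency" title="USD">$</abbr>5.99'));
console.log(parseAdjacentAmount('35.66 <abbr class="currency" title="DKK">kr</abbr>'));
```

The same function covers the symbol as a prefix and as a suffix, which is the case raised elsewhere in this thread for amounts such as "35.66 kr".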
>
> To cover #2, I think we need an additional class="money" container, and a 
> class="amount" markup for the amount, and this could be added without 
> changing the parsing rules for the simple form I've suggested above.  I think 
> it would be best to start with either simple or complex and look at adding 
> the alternative after the microformat has gained some adoption.
>
> I don't think regular expressions should be included in the spec at all.  If 
> we're going to define amounts based on character ranges, we should describe 
> those character ranges in plain English because most people, even most tech 
> geeks, don't understand regular expressions at all.
>
> Peace,
> Scott
>
> On Oct 15, 2006, at 4:40 PM, Mike Schinkel wrote:
>
> > Scott:
> >
> > Thanks for the reply. It probably got confusing on my part; I will 
> > try to resolve that here if possible.
> >
> >>> I thought what you suggested was to allow for explicit 
> >>> differentiation between the currency identifier and the amount, 
> >>> but in certain cases where such differentiation can be made by 
> >>> matching a regular expression, allow for markup without explicit 
> >>> differentiation, leaving the differentiation implicitly to the 
> >>> parser to figure out.  For example, this would be valid:...
> >>> because it does follow the pattern, where everything that's not 
> >>> within a certain character group is considered a currency symbol 
> >>> (i.e. "$").  If this isn't what you're suggesting, then I'm not 
> >>> clear on what you're suggesting.
> >
> > You got it 100%.  But I did make a mistake in my example as I didn't 
> > mean to include alpha [A-Za-z]. It should just have been digits, 
> > periods, and commas [0-9\.\,]; everything else would be the currency 
> > symbol. I wasn't explicit about the following, but I will be now; no 
> > spaces (or &nbsp;) and the currency figure must be contiguous and 
> > either prefix or suffix a collection of digits.
> > Anything else, and you need the complexity.
> >
> > Although I am not good with regex, I opened my regex book and my 
> > regex test and crafted this regex which I think identifies 100% of 
> > the special case to which I referred:
> >
> > ^([^0-9,\. ]*)([0-9]+[\.,]?[0-9]*)([^0-9,\. ]*)$
> >
> > In that regex, if $2 has a value, that's the amount.  If $1 OR $3 
> > has a value, then it's the symbol.  If it doesn't match, you *must* 
> > use the complex form.  (BTW, it would also be really easy to write 
> > a recursive descent and/or a looping parser in JavaScript or other 
> > languages to parse this, and we could publish those as reference
> > implementations.)  We publish the regex (or a better written one) 
> > and the recursive descent parsers and say: if it matches, you can use 
> > the simple form; otherwise, use the complex form.
> >
> > So the following could use the simple form:
> >
> >       The book is <span class="money" title="USD">$5.99</span>.
> >       In Brazil, the book would be <span class="money" title="BRL">R$12.84</span>.
> >       In Denmark, the price would be <span class="money" title="DKK">35.66kr</span>.
> >
> > BTW, it wouldn't be hard to include spaces in the regex and it might 
> > be a good idea to go ahead and do that. If so, you can use the 
> > JavaScript replace() string function (or similar in other
> > languages) to first normalize the string so that it contains only 
> > real spaces and no &nbsp;, like so (note the "g" flag, without 
> > which only the first &nbsp; is replaced):
> >
> >       s.replace(/&nbsp;/g, " ")
> >
> > where "s" is the innertext for the <span> and then use this regex on 
> > the result:
> >
> >       ^([^0-9,\. ]*)[ ]?([0-9]+[\.,]?[0-9]*)[ ]?([^0-9,\. ]*)$
> >
> > Where again $1 OR $3 will be the symbol and $2 will be the amount.
> > That would make these possible.
> >
> >       The book is <span class="money" title="USD">$&nbsp;5.99</span>.
> >       In Brazil, the book would be <span class="money" 
> > title="BRL">R$ 12.84</span>.
> >       In Denmark, the price would be <span class="money"
> > title="DKK">35.66 kr</span>.
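
A minimal JavaScript sketch of the two steps just described: normalize the non-breaking spaces, then apply the regex that allows an optional space. The name parseSimpleMoney and the returned object shape are hypothetical, and treating the U+00A0 character the same as the &nbsp; entity is an added assumption (an element's inner text normally holds the character, not the entity).

```javascript
// The space-tolerant regex quoted above, unchanged.
const SIMPLE_MONEY =
  /^([^0-9,\. ]*)[ ]?([0-9]+[\.,]?[0-9]*)[ ]?([^0-9,\. ]*)$/;

// Hypothetical helper: splits simple-form money text into symbol and amount.
// Returns null when the text does not fit the simple form.
function parseSimpleMoney(text) {
  // Replace every &nbsp; entity and U+00A0 character with a plain space.
  const s = text.replace(/&nbsp;|\u00a0/g, " ").trim();
  const m = SIMPLE_MONEY.exec(s);
  if (m === null) return null; // the complex form is required
  // $1 OR $3 is the symbol, $2 is the amount.
  return { symbol: m[1] || m[3], amount: m[2] };
}

console.log(parseSimpleMoney("$&nbsp;5.99"));
console.log(parseSimpleMoney("R$ 12.84"));
console.log(parseSimpleMoney("35.66 kr"));
console.log(parseSimpleMoney("fifty quid"));
```

As written, the regex accepts at most one separator in the amount, so a value such as "1,234.56" does not match and would fall back to the complex form.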
> >
> > Yes, it is a little more difficult for the person writing the parser, 
> > but there will be orders of magnitude more people writing the HTML 
> > than writing parsers, and besides, we can provide a working regex 
> > and reference implementation functions that will be good for 99% of 
> > cases and just say "Here; use it!"
> >
> >>> http://regexlib.com/Search.aspx?k=currency
> >
> > I reviewed that, and it appears that most of the regexes submitted 
> > there do essentially the same thing, each correcting for something 
> > the others didn't do (like handling leading zeros); did I misread?
> >
> >>> and I think it's only helping a slight majority that is quickly 
> >>> becoming a minority.  English language web pages only comprise 
> >>> about 55% of the web today, and that percent is quickly shrinking.  
> >>> So I'm publishing my currency in English, and you're trying to 
> >>> ease my implementation burden, so I don't have to explicitly 
> >>> define my currency symbol and parsers will just figure it out for me.
> >
> > I respectfully think it won't be in the minority; I think it will be 
> > the vast majority.  And it will work in other languages besides 
> > English, such as German, Spanish, French, Portuguese, Russian, Arabic, 
> > and so on; any that use digits + periods/commas for representing 
> > numbers.  It seems the only languages in any significant use that it 
> > doesn't work for are those using multibyte characters, which will 
> > require the complexity because, frankly, they are complex.
> >
> >>> I think this is already more confusing than just always 
> >>> identifying the individual parts, I think it's still likely to 
> >>> cause problems, ..
> >
> > Requiring identification of individual parts is less confusing in an 
> > abstract manner because you don't assume anything, but it is more 
> > difficult to learn because it requires everyone that implements it 
> > grok the entire spec to be able to use it.  By offering a simpler 
> > version, (I assert that) most people won't have to learn all of the 
> > details because they will just use the simple version.  So it 
> > could be described as such:
> >
> >       The Money microformat has a simple version that applies in most
> >       cases, and a complex version for when you really need control or
> >       if you are using multibyte character sets. You can use the simple
> >       version if the markup to which you want to add this microformat
> >       is limited to:
> >               1.) currency symbols (i.e. $, £, etc.),
> >               2.) spaces,
> >               3.) digits (i.e. 0-9), and
> >               4.) decimal separators (comma "," or period ".")
> >
> >       For example:
> >
> >               The book is <span class="money" 
> > title="USD">$&nbsp;5.99</span>.
> >               In Brazil, the book would be <span class="money" 
> > title="BRL">R$ 12.84</span>.
> >               In Denmark, the price would be <span class="money"
> > title="DKK">35.66 kr</span>.
> >
> >       If however you want to mark up money represented in much more
> >       complex ways, you'll need to use the more complex version, for
> >       example:
> >
> >               <p class="money">It'll cost you <abbr class="amount"
> >               title="50.00">fifty</abbr> <abbr class="currency"
> >               title="GBP">quid</abbr>, mate!</p>
> >
> >               <span class="money">Can you spare <abbr class="amount"
> >               title="10">ten</abbr> <abbr class="currency"
> >               title="USD"><span class="unit">dollars</span></abbr>?</span>
> >
> > By describing it this way, people who can use the simple version are 
> > never even required to drill down and learn the complex way.
> > This seems infinitely easier for the vast majority of people than 
> > for them to have to grok the entire spec right off the bat.
> > Frankly, when I first saw it I thought, "It isn't really going to be 
> > this complex, is it?"  I thought the theme behind microformats was 
> > "Make the simplest addition to HTML markup required." That's one of 
> > the reasons I was so drawn to the initiative.
> >
> > I actually think you'll end up with more invalid microformats if 
> > people are required to implement the current proposal, because it is 
> > complex enough that it would be relatively easy for someone to get 
> > wrong. By having a simpler format, you'll minimize the chance that 
> > people get it wrong; those who do go to the more complex form are 
> > more likely to really study it and get it right, and fewer people 
> > will overload the experts with questions about it (IMO).
> >
> > Question: Maybe we should vet this with typical web developers who 
> > are NOT involved with the microformats initiative?  We could go out 
> > and ask workaday web site developers and web site maintainers their 
> > opinion on which is easier to comprehend.
> > Honestly, I'm giving my opinion but I could find out my opinion is 
> > in a tiny minority. Or vice versa.
> >
> > BTW, is there a plan to create a series of microformat validator 
> > pages where someone could go and enter a URL and have it extract all 
> > the data it found for a given microformat?  Without this, I think 
> > people will end up creating lots of pages with invalid microformats.  
> > And it would need to be done for *each* microformat.
> >
> >>> There are people from Yahoo! on this list, and Technorati's pretty 
> >>> big too, so they'd be good people to say whether or not they 
> >>> really care how long the class names are.
> > Yeah, I already said "Okay, concern addressed" in an earlier reply.
> >
> > Anyway, I'm hoping that my earlier mistake of including [A-Za-z] was 
> > the main reason you objected and that you'll agree with a small 
> > scope minimum form like I'm proposing.
> >
> > -Mike Schinkel
> > http://www.mikeschinkel.com/blog
> > http://www.welldesignedurls.org/
> >
> > P.S. On another note, another question just occurred to me: why are 
> > you using "money" and not "hMoney?"
> >
> >
> >
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of 
> > Scott Reynen
> > Sent: Saturday, October 14, 2006 10:39 PM
> > To: Microformats Discuss
> > Subject: Re: title attribute and abbreviated class names
> > (Was: [uf-discuss] Currency Quickpoll: Preliminary results)
> >
> > On Oct 14, 2006, at 3:27 PM, Mike Schinkel wrote:
> >
> >>>> Your examples seem to leave a lot of ambiguity about what things 
> >>>> mean,
> >>
> >> I'm new to proposing microformats, so I clearly have a lot to 
> >> learn, but that said I don't see where what I was proposing was ambiguous.
> >> Can you give me explicit examples where allowing default 
> >> assumptions for the most common use cases will by necessity lead to 
> >> ambiguity?  It seems to me that either something will be specified 
> >> or if not it will default?  That seems non ambiguous to me. Am I 
> >> wrong?
> >
> > I'm not entirely sure we're talking about the same thing anymore, 
> > after reading this exchange:
> >
> > On Oct 14, 2006, at 3:55 PM, Mike Schinkel wrote:
> >
> >>>> That said, why not make the "symbol" markup optional?
> >>
> >> That, IMO, is an additional good idea.
> >
> > I thought that was basically what you were advocating, but you 
> > called it an /additional/ good idea, so I'm not sure what it's an 
> > addition to.  I thought what you suggested was to allow for explicit 
> > differentiation between the currency identifier and the amount, but 
> > in certain cases where such differentiation can be made by matching 
> > a regular expression, allow for markup without explicit 
> > differentiation, leaving the differentiation implicitly to the 
> > parser to figure out.  For example, this would be valid:
> >
> > 本が<span class="money"><abbr class="amount" title="1000">一千</abbr><abbr
> > class="currency" title="JPY">円</abbr></span>
> >
> > because it doesn't fit the pattern you suggested, but this would 
> > also be valid:
> >
> > The book is <span class="money">$5.99</span>.
> >
> > because it does follow the pattern, where everything that's not 
> > within a certain character group is considered a currency symbol 
> > (i.e. "$").  If this isn't what you're suggesting, then I'm not 
> > clear on what you're suggesting.
> >
> > But if this is what you're suggesting, I think you're 
> > underestimating the complexity involved in defining which characters 
> > might be part of an amount and which characters might be part of a 
> > currency symbol.  I do a lot of parsing via regular expressions and 
> > a large part of my interest in microformats comes from witnessing 
> > the failure rate in such parsing.  There's always another unexpected 
> > format popping up and before you know it, the regular expression is 
> > a page long.  See this page for a list of regular expressions for 
> > identifying the information that needs to be parsed from currency 
> > values for a quick taste:
> >
> > http://regexlib.com/Search.aspx?k=currency
> >
> > And those are all defining legitimate input much more strictly than 
> > would be appropriate for the web at large.
> >
> > To specifically answer your question of what doesn't work with 
> > [A-Za-z0-9], there's the decimal point, which is part of the amount 
> > rather than the currency symbol, and there's any commas, which are 
> > also part of the amount rather than the currency symbol, and any 
> > whitespace characters (of which there are many) shouldn't be 
> > considered part of the amount nor the currency symbol.  That's all I 
> > can think of right now, but I have no doubt there's much more I 
> > haven't thought of, and it's that much more I'm worried about.  So 
> > if we come up with a definition that includes all of that, now we're 
> > talking about explaining to authors that they can only leave out the 
> > currency markup if their class="money" tag is only containing 
> > letters, numbers, decimal points, commas, and whitespace.  Otherwise 
> > they have to explicitly identify the individual parts.
> >
> > I think this is already more confusing than just always identifying 
> > the individual parts, I think it's still likely to cause problems, 
> > and I think it's only helping a slight majority that is quickly 
> > becoming a minority.  English language web pages only comprise about 
> > 55% of the web today, and that percent is quickly shrinking.
> > So I'm publishing my currency in English, and you're trying to ease 
> > my implementation burden, so I don't have to explicitly define my 
> > currency symbol and parsers will just figure it out for me.  What if 
> > I want my whitespace to be marked up with HTML entities? E.g.:
> >
> > The book costs <span class="money">$&nbsp;5.99</span>
> >
> > That's not an unlikely scenario.  I actually publish currency values 
> > like that, when someone wants a space to separate the $ from the 
> > amount, but they don't want the two getting  split onto separate 
> > lines.  Are we going to include that in the regular expression too 
> > or do I need to explicitly identify my symbol?  If it's not allowed, 
> > how will that be explained clearly enough that I won't make this 
> > mistake and wind up with my currency symbol wrongly interpreted as 
> > "$&nbsp;", which doesn't map to any known currency, and will lose my 
> > space if it's replaced by another currency symbol?  This is the kind 
> > of ambiguity that doesn't really help publishers.  And if it is in 
> > the regular expression, how are we going to explain to publishers 
> > that it's okay?  Looks like unnecessary complication to me.
> >
> >> But one final point on this; has this been discussed with those 
> >> who make the decisions for markup used at the largest sites: 
> >> Google, eBay, Amazon, etc.?  Just curious. (And I don't mean to 
> >> push this, it's just that being pedantic is in my nature, 
> >> unfortunately. :)
> >
> > There are people from Yahoo! on this list, and Technorati's pretty 
> > big too, so they'd be good people to say whether or not they really 
> > care how long the class names are.
> >
> > Peace,
> > Scott
>
>
> _______________________________________________
> microformats-discuss mailing list
> microformats-discuss@microformats.org
> http://microformats.org/mailman/listinfo/microformats-discuss
>


--
- Stephen Paul Weber, Amateur Writer
<http://www.awriterz.org>

MSN/GTalk/Jabber: [EMAIL PROTECTED]
ICQ/AIM: 103332966
NSA: [EMAIL PROTECTED]
BLOG: http://singpolyma-tech.blogspot.com/
