Re: pluralization idea that keeps bugging me

2008-02-10 Thread Richard Hainsworth

Brandon S. Allbery KF8NH wrote:


On Feb 9, 2008, at 11:43 , Richard Hainsworth wrote:

I posted an idea about how pluralisation could be handled in a way that 
would not be English-centric (Subject: interpolation 
contextualisation). There were no responses to the idea. Was it so 
bad? Did no one see it? Was it too un-perlish? Was the title too 
horrible?


I saw one response, noting that you can define in a module your own 
quote characters with their own interpolation rules; no core changes 
needed.


Why then did Larry ask the original question? Why also did others with 
far better knowledge than I indicate that hooks should be present in 
interpolation to make language-dependent modules possible, thus 
indicating the hooks might not be there?


How - in sketch form - would I go about creating a module to do what I 
suggest? I am not suggesting someone writes a module I have suggested, 
but the barebones steps to creating a new metacharacter.


I have written infix multidispatch functions in pugs with Unicode 
characters to investigate that part of the language. But I don't quite 
see how to go about creating a new interpolatable metacharacter.




Re: pluralization idea that keeps bugging me

2008-02-10 Thread Ryan Richter
On Sun, Feb 10, 2008 at 12:56:14PM +0300, Richard Hainsworth wrote:
 How - in sketch form - would I go about creating a module to do what I 
 suggest? I am not suggesting someone writes a module I have suggested, 
 but the barebones steps to creating a new metacharacter.
 
 I have written infix multidispatch functions in pugs with Unicode 
 characters to investigate that part of the language. But I don't quite 
 see how to go about creating a new interpolatable metacharacter.

The category is called qq_backslash, so you'd define a new
qq_backslash:s or whatever.  I don't think pugs allows new ones to be
defined, though.

-ryan


Re: pluralization idea that keeps bugging me

2008-02-09 Thread Richard Hainsworth

Warnocked!

I posted an idea about how pluralisation could be handled in a way that 
would not be English-centric (Subject: interpolation contextualisation). 
There were no responses to the idea. Was it so bad? Did no one see it? 
Was it too un-perlish? Was the title too horrible?


The basic idea would be to add hooks into interpolation to allow for 
context suppliers and context sensors. The context sensors change words 
depending on data supplied through context suppliers.


Note that even in English, if you change a noun from singular to plural, 
you need to change the verb from singular form to plural form.


Larry Wall wrote:

Last night I got a message entitled: yum: 1 Updates Available.
Of course, that's probably just a Python programmer giving up on doing
the right thing, but we see this sort of bletcherousness all the time.

After a recent exchange on PerlMonks about join, I've been thinking
about the problem of pluralization in interpolated strings, where we
get things like:

say "Received $m message{ 1==$m ?? '' !! 's' }."

My first thought is that this is such a common idiom that we ought
to have some syntactic sugar for it:

say "Received $m message\s."

which reads nicely enough since the usual case is plural.
Basically, \s would be smart enough to magically know somehow whether
the last interpolation was 1 or not.  It would be particularly nice when
the interpolation is a closure:

say "Received {calculate_number_of_messages()} message\s."

That would cover most of the cases for English speakers using regular
nouns, but I wonder whether there's some kind of generalization that
would help for cases like:

say "There was/were $o ox/oxen"

But that doesn't work since / isn't a metacharacter.  Using an adverb
seems like overkill, if we can piggyback on an existing metachar.

Maybe something like

say "There was\swere $o ox\soxen"

where if anything alphabetic follows the \s it is the alternative
plural.  But note that the first \s there would have to be looking
forward rather than backward to do the verb, which constrains the
possible mechanisms, and makes it problematic to use \s multiple times:

say "There was\swere $o ox\soxen and $g goat\s."

though that could be made clearer with explicit concatenation:

say "There was\swere $o ox\soxen " ~ "and $g goat\s."
say "There was\swere $o ox\soxen ", "and $g goat\s."

Or maybe instead of using \ we should use a sigil:

say There $was|were $o $ox|oxen

except, of course, that $ is already taken.  Seems tacky to
use up a real variable name like:

say There $Xwas|were $o $Xox|oxen

I suppose one could make a case for Num vars having a . method though:

say There $owas|were $o $oox|oxen

That nicely resolves the ambiguity of

say There $owas|were $o $oox|oxen and $g goat$gs

but doesn't really help when you really need it, which is when you
interpolate something hairy:

say There $j.k.l.m.owas|were $j.k.l.m.o $j.k.l.m.oox|oxen and $j.k.l.m.g goat$j.k.l.m.gs

It's even less helpful when you interpolate a closure since there's
no variable name to refer to (unless you assign one, but then we're
losing much of our syntactic sugary wonderfulness).  So maybe we should
just make \s dwim and leave it at that.  Two dwimminesses, really.
The first dwim finds the associated interpolation, either the first
interpolation of a variable or closure before the \s, or if there
is none, the first one after.  Call that interpolated value $X for
the moment.  (It doesn't really have to have a real variable name,
but the important thing is not to evaluate the expression multiple
times since it might have side effects (including the side effect of
being inefficient to compute).)

The second dwim looks at the alphabeticality of the next character
(defined Unicodically, of course) to decide if there is one argument or two:

"foo\s"      means   $X == 1 ?? 'foo' !! 'foos'
"foo\sbar"   means   $X == 1 ?? 'foo' !! 'bar'

Internally, you end up multiply dispatching to something like
pluralize($X,'foo') or pluralize($X,'foo','bar').  (Arguably we
could make pluralize interpolate the $X as well, but that only
works for noun agreement, not verb agreement.)

I think that probably handles most of the Indo-European cases, and
anything more complicated can revert to explicit code.  (Or go through
a localization dictionary...)
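
The two call shapes mentioned above, pluralize($X,'foo') and pluralize($X,'foo','bar'), can be sketched outside Perl 6. This is only an illustration in Python, assuming a hypothetical `pluralize` helper; the append-"s" default and the helper names `received` and `oxen` are assumptions, not part of any proposal in this thread.

```python
from typing import Optional


def pluralize(count: int, singular: str, plural: Optional[str] = None) -> str:
    """Return `singular` when count is exactly 1, otherwise the plural form."""
    if plural is None:
        # Assumption: regular nouns form the plural by appending "s".
        plural = singular + "s"
    return singular if count == 1 else plural


def received(message_count: int) -> str:
    # The count is computed once by the caller and reused here, so an
    # expensive or side-effecting expression is not evaluated twice.
    return f"Received {message_count} {pluralize(message_count, 'message')}."


def oxen(ox_count: int) -> str:
    # The same count drives both the verb and the noun agreement.
    verb = pluralize(ox_count, "was", "were")
    noun = pluralize(ox_count, "ox", "oxen")
    return f"There {verb} {ox_count} {noun}"
```

Passing the count explicitly sidesteps the "which interpolation does \s refer to" question, at the cost of the syntactic sugar the thread is after.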

Any other cute ideas?  


Larry
  


Re: pluralization idea that keeps bugging me

2008-02-09 Thread Fagyal Csongor

Hi,

Warnocked!

Indeed :)
I posted an idea about how pluralisation could be handled in a way that 
would not be English-centric (Subject: interpolation 
contextualisation). There were no responses to the idea. Was it so 
bad? Did no one see it? Was it too un-perlish? Was the title too 
horrible?


The basic idea would be to add hooks into interpolation to allow for 
context suppliers and context sensors. The context sensors change 
words depending on data supplied through context suppliers.


Note that even in English, if you change a noun from singular to 
plural, you need to change the verb from singular form to plural form.

First of all, I think a module like this should be either perfect or not 
exist at all: you won't use it after it makes the first mistake, or when 
you cannot use it everywhere.
Now, to have a perfect module you need some pretty smart people to 
create the base lib (dealing with natural languages is not a piece of 
cake). Then you need a bunch of other people who understand what's going 
on to create and test the different language versions. I fear that at 
the end you end up with a huge codebase, created by various people, 
parts of which get out-of-sync or become unmaintained, and which 
generally consumes a lot of memory (think about e.g. dictionaries for 
irregular words - take a look at Lingua::EN::Inflect, for example) when 
used, and also slows down execution. All this to save one 
not-very-often-used if... else block. If we really want to help people 
type less, why not just rename else to e ? :)


It also seems to me that I will need a module like this when my computer 
does not only *ask* where I want to go today, but also *cares*. ;)


So IMHO while it's a nice idea, it's just overkill. (And it's 
definitely not about Perl6 as a language.)


- Fagzal


Re: pluralization idea that keeps bugging me

2008-02-09 Thread Brandon S. Allbery KF8NH


On Feb 9, 2008, at 11:43 , Richard Hainsworth wrote:

I posted an idea about how pluralisation could be handled in a way that  
would not be English-centric (Subject: interpolation  
contextualisation). There were no responses to the idea. Was it so  
bad? Did no one see it? Was it too un-perlish? Was the title too  
horrible?


I saw one response, noting that you can define in a module your own  
quote characters with their own interpolation rules; no core changes  
needed.


--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED]
system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED]
electrical and computer engineering, carnegie mellon university    KF8NH




Re: pluralization idea that keeps bugging me

2008-01-31 Thread David Green

On 2008-Jan-26, at 9:58 am, Larry Wall wrote:

My first thought is that this is such a common idiom that we ought
to have some syntactic sugar for it:
say Received $m message\s.


I've always wanted a magic-S (and I don't think the anglocentrism  
matters, because Perl is already pretty anglocentric -- more so than  
plural S's, which apply to some other languages anyway).


Rather than extra syntax to specify alternatives, I wonder about  
having \s work with arrays (which also provides a way to deal with  
duals, for example):


say Received $_ {ox oxen oxes}\s for 1, 2, 77;
Received 1 ox
Received 2 oxen
Received 77 oxes

It might even be sophisticated enough to guess whether it should add  
es or just s, but anything beyond that probably belongs in a module.
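
The array-of-forms idea can be sketched as a lookup by count. This is a Python illustration only; `choose_form` is a hypothetical name, and the rule that a three-element list means singular, dual, plural is an assumption read off the ox example above.

```python
from typing import Sequence


def choose_form(count: int, forms: Sequence[str]) -> str:
    """Pick a word form for `count` from an ordered list of forms.

    Assumed convention: the first form is the singular, the last is the
    general plural, and a middle form (if present) is the dual.
    """
    if not forms:
        raise ValueError("at least one form is required")
    if count == 1:
        return forms[0]
    if count == 2 and len(forms) >= 3:
        return forms[1]
    return forms[-1]


def received(count: int) -> str:
    return f"Received {count} {choose_form(count, ['ox', 'oxen', 'oxes'])}"
```

With a two-element list the dual slot is simply absent, so 2 falls through to the plural.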


use Locale::Lingua::EN;
say There was\s {3} ox\s;# There were 3 oxen

use Locale::Lingua::Romana::Perligata;
say {3} bos\s erat\s;# 3 boves erant

Although calling it \s loses its impact in other languages.  But  
I think the underlying idea to seize on is a way to grab interpolated  
values so that there's a nice way to do tricks like that.  Preferably  
in a way that doesn't look symmetrical so you can point it before or  
behind.


say I've got $bid dollar\s, do I hear {$ + 1}?

Or using $ instead of \s:
say I'm bid $d dollar$ for @this[]$ $o @ox[]$

...except that I'm not crazy about calling it $.  (If that would  
even work.)  But something like that.  Perhaps strings should build an  
array of their interpolations?


say $a $b $c, this string contains [EMAIL PROTECTED] interpolations

(Then again, maybe there's a time to break down and use (s)printf.)


-David



Re: pluralization idea that keeps bugging me

2008-01-31 Thread Mark Overmeer
* David Green ([EMAIL PROTECTED]) [080131 08:48]:
 I've always wanted a magic-S (and I don't think the anglocentrism  
 matters, because Perl is already pretty anglocentric -- more so than  
 plural S's, which apply to some other languages anyway).

In the good old days all computer OSes were anglo-centric.  They are
not like that anymore.  But Perl still is.

 use Locale::Lingua::EN;
 say There was\s {3} ox\s;# There were 3 oxen
 
 use Locale::Lingua::Romana::Perligata;
 say {3} bos\s erat\s;# 3 boves erant
 
 Although calling it \s loses its impact in other languages  But  
 I think the underlying idea to seize on is a way to grab interpolated  
 values so that there's a nice way to do tricks like that.  Preferably  
 in a way that doesn't look symmetrical so you can point it before or  
 behind.

As I suggested in a previous mail, we can do it by making say/print
a bit smarter.  Instead of interpolating before they are called,
we let them interpolate themselves, and optionally translate.

 say I've got $bid dollar\s, do I hear {$ + 1}?

pre-parse standard call to (s)print(f)/say from

  say "I've got $bid dollar, do I hear ", $bid+1, "?"

into

  print "I've got \Ibid\E dollar, do I hear \I__ANON1__\E?\n",
  bid => $bid, __ANON1 => $bid+1, __LINE => __LINE__;

(introducing \I \E as interpolation indicators)
Isn't the (usual, existing) translation syntax a lot simpler than
you suggest?
(the rewrite will take place as first step within the print/say.
Translations must be implemented in the output layers, because only there
we know enough about character-set and end-user.  Within the program,
you do not want to be bothered with translated strings)

The default interpolation implementation for print() can be very simple.
However, now we can also make translation modules which use external
tables or databases to do the optional intelligent work.
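
The rewrite described above, where the string keeps named placeholders and the values travel alongside it until output time, can be sketched as follows. This is a Python illustration under stated assumptions: the `Message` class, its `render` method and the Dutch table entry are all hypothetical, and `{name}` placeholders stand in for the proposed \I...\E markers.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Message:
    """A template plus its named values; nothing is interpolated yet."""
    template: str
    values: Dict[str, object] = field(default_factory=dict)

    def render(self, table: Optional[Dict[str, str]] = None) -> str:
        # Look the untranslated template up in the table (if any), then
        # fill in the values. Without a table the template is used as-is.
        template = (table or {}).get(self.template, self.template)
        return template.format(**self.values)


bid = 5
msg = Message("I've got {bid} dollars, do I hear {next_bid}?",
              {"bid": bid, "next_bid": bid + 1})

# Illustrative translation table; the wording is only an example.
dutch = {"I've got {bid} dollars, do I hear {next_bid}?":
         "Ik heb {bid} dollar, hoor ik {next_bid}?"}
```

Because the lookup key is the template rather than the finished string, the table needs one entry per message, not one per possible value.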

I do not think that your
   use Locale::Lingua::Romana::Perligata;
is usable, because the translation (in general) adapts to the language
of the user of the module, not the likings of the author.  A more
general use is:
   setlocale('lat')

   open OUT, :language('lat'):encoding('latin1'), $f

-- 
   MarkOv


   Mark Overmeer MSc                MARKOV Solutions
   [EMAIL PROTECTED]  [EMAIL PROTECTED]
http://Mark.Overmeer.net   http://solutions.overmeer.net



Re: pluralization idea that keeps bugging me

2008-01-31 Thread David Green

On 2008-Jan-31, at 2:38 am, Mark Overmeer wrote:

* David Green ([EMAIL PROTECTED]) [080131 08:48]:
I've always wanted a magic-S (and I don't think the anglocentrism  
matters



In the good old days all computer OSes were anglo-centric.  They are
not like that anymore.  But Perl still is.


Well, they provide ways to localise text, which is good; and a lot of  
applications take advantage of it, which is better.  Most programming  
languages themselves are still English though, or at least their  
vocabulary is based on English and English-like words.  (Except for  
the ones that aren't.)  Fortunately, since Perl6 is ultimately  
mutable, it should be reasonably straightforward to translate it all  
so that you could start programs with use Dutch or use Japanese.


use Lingua::FR;
mes @valeurs=(1,2,3);
dis $_ pour @valeurs;#pardon my French

Of course, while some languages might need only to translate all the  
function and variable names, others would arguably want to rearrange  
the grammar and syntax too, so "reasonably straightforward" is a  
relative term.


The brute force way would be to redefine every function with a new  
name for the new language; but perhaps what we really want is a more  
elegant way to do that:

sub foo :trans(fr=>fue, de=>fu) {...}
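
The effect of such a trait, one function reachable under a different name per language, can be imitated with a registry. This is a Python sketch only; the `trans` decorator, the `REGISTRY` dictionary and the `lookup` helper are hypothetical and say nothing about how Perl 6 traits would implement it.

```python
from typing import Callable, Dict, Tuple

# Maps (language, name) to the function registered under that name.
REGISTRY: Dict[Tuple[str, str], Callable] = {}


def trans(**names: str) -> Callable[[Callable], Callable]:
    """Register a function under one alias per language code."""
    def decorate(func: Callable) -> Callable:
        for language, alias in names.items():
            REGISTRY[(language, alias)] = func
        return func
    return decorate


@trans(fr="fue", de="fu")
def foo() -> str:
    return "foo called"


def lookup(language: str, name: str) -> Callable:
    return REGISTRY[(language, name)]
```

The function body is written once; only the names differ per language, which is the "more elegant way" being asked for.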


I do not think that your  use Locale::Lingua::Romana::Perligata;
is usable, because the translation (in general) adapts to the language
of the user of the module, not the likings of the author.


That depends on the circumstances; the author(s) still have to provide  
translations in the first place.  I wasn't thinking of ways to handle  
multiple languages, just ways to let the author more easily use his  
own language (which is what the majority of perl programs do, since  
most of them are written for private use).



As I suggested in a previous mail, we can do it by making say/print
a bit smarter.  Instead of interpolating before they are called,
we let them interpolate themselves, and optionally translate.


I really like the idea of having text lazily interpolate/translate.   
And in P6, it will be possible to override quoting and concatenating  
so that the code doesn't even have to look any different.  Being able  
to refer to the interpolated values is a bit of a different matter, I  
think; although I guess the main reason for wanting to do so is  
translating (including singular-to-plural translations).


Flexible interpolation is good because it makes text look more  
natural, as opposed to a printf-like separation of parts.  On the  
other hand, the more natural the text looks in one language, the more  
work it can be to translate it automatically.  The magic-S is more of  
a shortcut when you're not doing real translations at all.



-David



Re: pluralization idea that keeps bugging me

2008-01-28 Thread Ron
On 26 Jan., 17:58, [EMAIL PROTECTED] (Larry Wall) wrote:
 Last night I got a message entitled: yum: 1 Updates Available.
 Of course, that's probably just a Python programmer giving up on doing
 the right thing, but we see this sort of bletcherousness all the time.

 After a recent exchange on PerlMonks about join, I've been thinking
 about the problem of pluralization in interpolated strings, where we
 get things like:

 say Received $m message{ 1==$m ?? '' !! 's' }.

 Any other cute ideas?

When you are a MUD[*]-developer you have to deal with things like this
all the time.
Where I was we did it like this (German sentence converted to perlish
syntax)

   say {der($player)} nimmt {den($item)} aus {dem($container)}.;

which means:

$player takes $item out of $container

where $player, $item and $container are objects or hashes and der(),
den(), dem() are functions which convert the given object into the
definite nominative, accusative and dative.
There are more functions to implement the indefinite forms and other
grammatical things.

Objects/hashes contain the number, adjectives etc.

To make that more English, that could look like:

   say {nominative($player)} takes {accusative($item)} out of
{dative($container)}.;

With $player={name=>'Paul', adjective=>'great', gender=>'male'},
$item={name=>'ball', count=>3, gender=>'male'},
$container={name=>'box', gender=>'male'}, it would interpolate into

   The great Paul takes the 3 balls out of the box.

Maybe... btw: in German the gender of the objects also changes
things...
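
A rough Python sketch of the English variant above, assuming objects are plain dictionaries with `name`, `adjective` and `count` keys. The helper `definite` and the naive append-"s" plural are assumptions; English nouns do not inflect for case, so the three case functions collapse into one here, which would not be true for German.

```python
from typing import Dict


def definite(obj: Dict[str, object]) -> str:
    """Build a definite noun phrase such as 'the 3 balls'."""
    count = obj.get("count", 1)
    words = ["the"]
    if "adjective" in obj:
        words.append(str(obj["adjective"]))
    if count != 1:
        words.append(str(count))
    name = str(obj["name"])
    # Assumption: regular plural formed by appending "s".
    words.append(name if count == 1 else name + "s")
    return " ".join(words)


# English has no case endings on nouns, so all three share one implementation.
nominative = accusative = dative = definite


def takes(player: dict, item: dict, container: dict) -> str:
    text = (f"{nominative(player)} takes {accusative(item)} "
            f"out of {dative(container)}.")
    return text[0].upper() + text[1:]


player = {"name": "Paul", "adjective": "great", "gender": "male"}
item = {"name": "ball", "count": 3, "gender": "male"}
container = {"name": "box", "gender": "male"}
```

The verb "takes" is hard-coded; agreeing it with the subject's count would need the same data passed to a verb helper as well.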

The original example
 say "Received $m message{ 1==$m ?? '' !! 's' }."
could then look like:

say "Received {nominative({name=>'message',count=>$m})}."

Maybe someone could find a more concise form if huffmanly desirable.

Regards,
Ron



Re: pluralization idea that keeps bugging me

2008-01-27 Thread Moritz Lenz
Larry Wall wrote:
 Last night I got a message entitled: yum: 1 Updates Available.
 Of course, that's probably just a Python programmer giving up on doing
 the right thing, but we see this sort of bletcherousness all the time.
 
 After a recent exchange on PerlMonks about join, I've been thinking
 about the problem of pluralization in interpolated strings, where we
 get things like:
 
 say Received $m message{ 1==$m ?? '' !! 's' }.
 
 My first thought is that this is such a common idiom that we ought
 to have some syntactic sugar for it:
 
 say Received $m message\s.
 
 which reads nicely enough since the usual case is plural.
 Basically, \s would be smart enough to magically know somehow whether
 the last interpolation was 1 or not.  It would be particularly nice when
 the interpolation is a closure:
 
 say Received {calculate_number_of_messages()} message\s.

I think the most general solution is a nice quoting construct.

So if you say

say qq:l10n(en)Received $m message\s;
the quote handler in l10n:en (or whatever) receives a list of pairs of
strings and variables to interpolate, ['$m' => $m, '\s' => undef].

It can then decide what to do with it.
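
What such a handler might do with the pieces it receives can be sketched in Python. The segment representation (a list of kind/payload tuples) and the names `find_count` and `render_en` are assumptions for illustration; the lookup order, nearest interpolation before the marker or else the first one after, follows the dwim rule quoted earlier in the thread.

```python
from typing import List, Optional, Tuple

# Each segment is (kind, payload): "text" carries a literal string,
# "var" carries an interpolated value, "plural" carries a suffix to add.
Segment = Tuple[str, object]


def find_count(segments: List[Segment], index: int) -> Optional[object]:
    """Nearest interpolated value before `index`, else the first one after."""
    for kind, payload in reversed(segments[:index]):
        if kind == "var":
            return payload
    for kind, payload in segments[index + 1:]:
        if kind == "var":
            return payload
    return None


def render_en(segments: List[Segment]) -> str:
    """English-only handler: emit the suffix unless the count is exactly 1."""
    out = []
    for index, (kind, payload) in enumerate(segments):
        if kind == "plural":
            count = find_count(segments, index)
            out.append("" if count == 1 else str(payload))
        else:
            out.append(str(payload))
    return "".join(out)


def received(m: int) -> str:
    return render_en([("text", "Received "), ("var", m),
                      ("text", " message"), ("plural", "s"), ("text", ".")])
```

A handler for another language would receive the same segments and could apply entirely different rules, which is the point of moving the decision out of the core.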

Wait, that smells like macros, which are already specced - so never mind ;-)

Moritz

-- 
Moritz Lenz
http://moritz.faui2k3.org/ |  http://perl-6.de/





Re: pluralization idea that keeps bugging me

2008-01-27 Thread Mark Overmeer
* Larry Wall ([EMAIL PROTECTED]) [080126 16:58]:
 Last night I got a message entitled: yum: 1 Updates Available.
 After a recent exchange on PerlMonks about join, I've been thinking
 about the problem of pluralization in interpolated strings, where we
 get things like:
 
 say Received $m message{ 1==$m ?? '' !! 's' }.
 
 Any other cute ideas?  

I totally agree with many responses, that special support for the English
language is not preferred, certainly when it bothers developers for
other natural languages.  Imagine that you wrote your code this way for
a website, and then your boss (always blame your boss) decides that the
site must be ported to Chinese for expansion...

It would be nice if Perl joined nearly all other Open Source applications,
in being multi-languaged.  During the lightning talks of last YAPC::EU,
I called for localization of error messages in Perl 5.12, but Perl6
improvements are welcomed as well.

My idea: Recently, I released Log::Report, which is a new translation
framework.  It combines exception-handling with report dispatch and
translations.  What's new: some module produces a text, but that module
was found on CPAN.  Only the author of the main program knows how to
handle the text.  So, delay translations until an output layer is reached.

Locale::TextDomain and gettext translate immediately, as does $!  They
translate on the location where the report emerges.  Log::Dispatch and
Log::Log4perl cannot influence the text production process.

What my new Log::Report does, is delaying translations to the moment
it reaches the dispatcher.  Like this:

   package main;
   
   dispatcher SYSLOG => 'syslog', language => 'en-US',
  charset => 'ascii', facility => 'local4';

   dispatcher STDOUT => 'website', language => 'cn',
  charset => 'utf8';

   run_some_code();  # text both to syslog and stdout

   package Someone::Elses::Package;
   use Log::Report 'translation-table-namespace';
   
   sub run_some_code()
   {   # Locale::TextDomain compatible syntax, info ~ print
   info __nxReceived {m} messages, $m, m = $m;
   }

To syslog in English (what I understand), and to the website in Chinese
(what I do not understand).  Of course, there are quite some more features
in the module.

The translation tables can use gettext syntax, be database driven, or maybe be
a module with Perl routines for complex languages.  (Only the first is
implemented at the moment, but the framework is present.)
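
The core idea, that the producing code hands over an untranslated message and each dispatcher translates it in its own language, can be sketched in Python. This does not reproduce the Log::Report API; the `Dispatcher` class, the `info` function and the `TABLES` contents are hypothetical, and Dutch stands in for the second language purely as an example.

```python
from typing import Dict, List

# Illustrative translation tables keyed by language, then by message id.
TABLES: Dict[str, Dict[str, str]] = {
    "nl": {"Received {m} messages": "{m} berichten ontvangen"},
}


class Dispatcher:
    """An output channel that translates only at the moment of output."""

    def __init__(self, name: str, language: str) -> None:
        self.name = name
        self.language = language
        self.lines: List[str] = []

    def emit(self, msgid: str, values: Dict[str, object]) -> None:
        # Fall back to the message id itself when no translation exists.
        template = TABLES.get(self.language, {}).get(msgid, msgid)
        self.lines.append(template.format(**values))


def info(dispatchers: List[Dispatcher], msgid: str, **values: object) -> None:
    # The caller never sees a translated string; it passes id and values.
    for dispatcher in dispatchers:
        dispatcher.emit(msgid, values)


syslog = Dispatcher("syslog", "en-US")
website = Dispatcher("website", "nl")
info([syslog, website], "Received {m} messages", m=3)
```

The library code calling `info` does not need to know which languages are in use; that choice belongs to whoever configures the dispatchers.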

The provided try() is also implemented as a dispatcher, which collects
the messages from the block; these have not yet been translated:

  try { error __help! };
  if($@)   # an Log::Report::Dispatcher::Try object
  {   my $exception = [EMAIL PROTECTED]wasFatal;
  $exception->throw   # re-cast
  if $exception->message !~ m/help/;  # ignore call for help
  }


When someone starts coding, it is more and more uncertain in which
languages it will be used later.  So, it would be nice to help people
to avoid mistakes which may block an easy conversion.

For instance, best if texts are produced in as large blocks as possible,
outside the program file.  We know how to do that: a template system.
Templates themselves are easily translatable.  About a zillion or two
CPAN modules implement a Locale::TextDomain-like HASH-based substitution
system in templates.

Translations are impossible for syntaxes like this:
  print "Received $m messages"
because the $m is already filled-in before print is called.  For this
reason, a lookup in the translation table is impossible.

It would be nice to not translate above string into
   print 'Received '.$m.' messages'
but
   report info => 'Received {m} messages', m => $m,
   linenr => __LINE__, ..etc..
(of course, some \Q\E like meta-syntax, not {})

Print() works internally more like printf().  No problem.  Without
translation tables defined, it just takes what it got as first argument.

In the infrastructure, we need a reason for each message, like syslog
levels.  Print, warn, and die have implied reasons (resp info, warn
and error).  Everyone is tricking trace and verbose levels, so we need
a few more useful levels.


Concluding:
 - hopefully, there is a way to simplify the work for all of us who do
   need to support many languages within one application
 - create one standard, so all CPAN modules integrate in the same way
 - let's try to get Perl to handle languages!
-- 
Regards,
   MarkOv


   Mark Overmeer MSc                MARKOV Solutions
   [EMAIL PROTECTED]  [EMAIL PROTECTED]
http://Mark.Overmeer.net   http://solutions.overmeer.net



Re: pluralization idea that keeps bugging me

2008-01-27 Thread Richard Hainsworth
Perl - when I first met it - was great because it handled text easily 
and 'naturally'. I now use perl for everything, even when another 
language would probably be better.


Perl6 has gone a long way to making things more universal by using 
Unicode. (The difficulties of non-Latin fonts and coding are horrendous.)


Mark and chromatic are right that an ability to manipulate multiple 
languages naturally and in core would be something no other 
programming language does.


Perl6 seems to handle most of the necessary things, but not all - I 
think. Hence Larry's original question.


There are - it seems to me - several different aspects to consider. My 
breakdown would be:


a) having the language constructs that make text interpolation easy - 
that is the *text* morphs itself to adjust to the context brought in by 
the interpolated data. What is necessary is not a plurals fix for 
English, but a mechanism for fixing that can be applied to other 
languages. (Here I think perl6 grammars will help, but I am not sure, 
and without proof of concept actually doubt the facility exists in perl).


b) Translating the perl core itself - the use of other languages to 
write code in. Given perl6 grammar, and given that any programming 
language is a rigidly circumscribed subset of words, I think this is 
entirely possible in most natural languages. Clearly for the compiler to 
work, a non-English coding language must uniquely map to and from an 
equivalent English coding.


c) Having the mechanisms in perl6 core not just to interpolate text 
contextually, but also for different texts to be used with the same 
interpolations (when a text is translated, different sentence structures 
result). As Mark pointed out, this can be accomplished with Templates.


d) Ensuring that different information streams can each be directed 
through templates. As Mark pointed out, more is needed than standard 
input, output, and errors. Moreover, it would be fantastic if the output 
from the perl6 compiler could be constructed so that its information 
streams (warnings, errors, etc) could be attached to translation filters.


I think item (a) is not quite there in perl6. But I really want to use 
perl6 and I hope this line of development does not derail the fantastic 
amount of momentum we have seen in recent months.



Mark Overmeer wrote:

* Larry Wall ([EMAIL PROTECTED]) [080126 16:58]:
  

Last night I got a message entitled: yum: 1 Updates Available.
After a recent exchange on PerlMonks about join, I've been thinking
about the problem of pluralization in interpolated strings, where we
get things like:

say Received $m message{ 1==$m ?? '' !! 's' }.

Any other cute ideas?  



I totally agree with many responses, that special support for the English
language is not preferred, certainly when it bothers developers for
other natural languages.  Imagine that you wrote your code this way for
snip
To syslog in English (what I understand), and to the website in Chinese
(what I do not understand) Of course, there are quite some more features
in the module.
snip
Concluding:
 - hopefully, there is a way to simplify the work for all of us who do
   need to support many languages within one application
 - create one standard, so all CPAN modules integrate in the same way
 - let's try to get Perl to handle languages!
  


Re: pluralization idea that keeps bugging me

2008-01-26 Thread Amir E. Aharoni
On 26/01/2008, Larry Wall [EMAIL PROTECTED] wrote:
 After a recent exchange on PerlMonks about join, I've been thinking
 about the problem of pluralization in interpolated strings, where we
 get things like:

 say Received $m message{ 1==$m ?? '' !! 's' }.

 ...

 Any other cute ideas?

No matter what you do it will remain too English-centric. It might
work for Catalan, too. But it will remain totally useless for Arabic
or Chinese.

In any case, I don't understand why this should be in the core language at all.

-- 
Amir Elisha Aharoni

English -  http://aharoni.wordpress.com
Hebrew  - http://haharoni.wordpress.com

We're living in pieces,
 I want to live in peace. - T. Moore


Re: pluralization idea that keeps bugging me

2008-01-26 Thread Jonathan Lang
Larry Wall wrote:
 Any other cute ideas?

If you have '\s', you'll also want '\S':

$n cat\s fight\S # 1 cat fights; 2 cats fight

I'm not fond of the 'ox\soxen' idea; but I could get behind something
like '\sox oxen' or 'ox\sen'.

'\sa b' would mean 'a is singular; b is plural'
'\sa' would be short for '\s a'
'\s' would be short for '\s s'
'\Sa b' would reverse this.
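
The \s / \S pairing can be sketched as two small helpers: one that adds its suffix in the plural (nouns) and one that adds it in the singular (verbs). This is a Python illustration; the names `plural_suffix` and `singular_suffix` are hypothetical stand-ins for the two proposed escapes.

```python
def plural_suffix(count: int, singular: str = "", plural: str = "s") -> str:
    """Like the proposed \\s: first form when count is 1, else the second."""
    return singular if count == 1 else plural


def singular_suffix(count: int, singular: str = "", plural: str = "s") -> str:
    """Like the proposed \\S: the same two forms with their roles reversed."""
    return plural_suffix(count, plural, singular)


def cats_fight(n: int) -> str:
    return f"{n} cat{plural_suffix(n)} fight{singular_suffix(n)}"


def count_oxen(n: int) -> str:
    # The 'ox\sen' shape: only the plural ending is spelled out.
    return f"{n} ox{plural_suffix(n, '', 'en')}"
```

Both helpers take the count explicitly, which corresponds to the adverb or tagging forms suggested for cases where the governing number is not obvious from the string.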

Sometimes, you won't want the pluralization variable in the string
itself, or you won't know which one to use.  You could use an adverb
for this:

:s$nthe cat\s \sis are fighting.

and/or find a way to tag a variable in the string:

$owner's \s=$count cat\s

'\s=$count' means set plurality based on $count, and display $count normally.

-- 
Jonathan Dataweaver Lang


Re: pluralization idea that keeps bugging me

2008-01-26 Thread Austin Hastings
Jonathan makes an excellent point about s and S. In fact, there's 
probably a little language out there for this.


I don't think it needs to be in the core, though. But you could put in 
some kind of hook mechanism, so that detecting the presence of \s or 
whatever caused the string to be treated specially. Perhaps it gets a 
different, possibly more sophisticated, type? A type that is only 
in-core in a limited (English-only?) implementation, but which admins 
can install at whim.


=Austin


Jonathan Lang wrote:

Larry Wall wrote:
  

Any other cute ideas?



If you have '\s', you'll also want '\S':

$n cat\s fight\S # 1 cat fights; 2 cats fight

I'm not fond of the 'ox\soxen' idea; but I could get behind something
like '\sox oxen' or 'ox\sen'.

'\sa b' would mean 'a is singular; b is plural'
'\sa' would be short for '\s a'
'\s' would be short for '\s s'
'\Sa b' would reverse this.

Sometimes, you won't want the pluralization variable in the string
itself, or you won't know which one to use.  You could use an adverb
for this:

:s$nthe cat\s \sis are fighting.

and/or find a way to tag a variable in the string:

$owner's \s=$count cat\s

'\s=$count' means set plurality based on $count, and display $count normally.

  




Re: pluralization idea that keeps bugging me

2008-01-26 Thread Dr.Ruud
Jonathan Lang schreef:

 I'm not fond of the 'ox\soxen' idea; but I could get behind something
 like '\sox oxen' or 'ox\sen'.

   $n ox\s en

   $n\sone multiple no cat\s s  fight\s s s

;)

-- 
Affijn, Ruud

Gewoon is een tijger.


Re: pluralization idea that keeps bugging me

2008-01-26 Thread Fagyal Csongor

Amir E. Aharoni wrote:

On 26/01/2008, Larry Wall [EMAIL PROTECTED] wrote:
  

After a recent exchange on PerlMonks about join, I've been thinking
about the problem of pluralization in interpolated strings, where we
get things like:

say Received $m message{ 1==$m ?? '' !! 's' }.

...

Any other cute ideas?



No matter what you do it will remain too English-centric. It might
work for Catalan, too. But it will remain totally useless for Arabic
or Chinese.

In any case, I don't understand why this should be in the core language at all.

I second that.

A few more thoughts:

1. For example in Hungarian, you don't need this at all: the noun stays 
singular after the numeral.


2. AFAIK in some languages it's not "1 or more", but "1, 2 or more".

3. It's often not "1 or more" that you need, but "none, 1 or more": "No 
new messages" - "You have 1 new message" - "You have 3 new messages". Or 
more likely <b>No new messages.</b> - <a href="/read.asp">You have 1 
new message.</a> ... etc.


4. I work a lot with multilingual websites. I have learned long ago that 
it's never {{you_have}} [% messages %] {{messages}}. You have to be 
*very* lucky just to make this work in two languages. Instead, it's 
{{number_of_new_messages}}: [% messages %]. That pretty much works 
everywhere.
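
Points 2 and 3 above amount to choosing a whole message by plural category rather than gluing a suffix onto a word. A minimal Python sketch, assuming hypothetical category functions and an English-only message table; the category names are illustrative and not taken from any particular library.

```python
from typing import Callable, Dict


def english_category(n: int) -> str:
    """English with an explicit 'none' case: zero, one, or other."""
    if n == 0:
        return "zero"
    return "one" if n == 1 else "other"


def dual_category(n: int) -> str:
    """A language that distinguishes 1, 2, and more."""
    if n == 1:
        return "one"
    if n == 2:
        return "two"
    return "other"


EN_NEW_MESSAGES: Dict[str, str] = {
    "zero": "No new messages",
    "one": "You have 1 new message",
    "other": "You have {n} new messages",
}


def new_messages(n: int, category: Callable[[int], str],
                 forms: Dict[str, str]) -> str:
    # Select the complete sentence for this count, then fill in the number.
    return forms[category(n)].format(n=n)
```

Each language supplies its own category function and its own complete sentences, so nothing in the calling code assumes English word order or a two-way singular/plural split.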



So not in the core, probably. There are too many exceptions. A module 
would be cool, though :) String::Plural::English, or whatnot.




- Fagzal




Re: pluralization idea that keeps bugging me

2008-01-26 Thread Yuval Kogman
To me this sounds like

use Lingua::EN::Pluralize::DSL;

which would overload your grammar locally to parse strings this way.

However, due to i18n reasons this should not be in the core.

It might make sense to ship a slightly modernized Locale::Maketext
with Perl 6 so that it can be used in the compiler itself, but
unless a fully open-ended system like L::MT is included I think
having anything at all might be damaging, because this will
encourage people to use the partial solution that is already built
in instead of the complete one on the CPAN (cf. many core modules).

-- 
  Yuval Kogman [EMAIL PROTECTED]
http://nothingmuch.woobling.org  0xEBD27418



Re: pluralization idea that keeps bugging me

2008-01-26 Thread chromatic
On Saturday 26 January 2008 08:58:43 Larry Wall wrote:

> That would cover most of the cases for English speakers using regular
> nouns, but I wonder whether there's some kind of generalization that
> would help for cases like:
>
>     say There was/were $o ox/oxen

That makes me wish for a subjunctive/optative mood marker.  I'm not sure why.

In-language localization and internationalization hooks do seem awfully 
useful, but English-only pluralization rules just might not cut it.

Nearly pain-free l10n and i18n *is* kind of a killer feature though.

-- c


Re: pluralization idea that keeps bugging me

2008-01-26 Thread Darren Duncan

At 8:58 AM -0800 1/26/08, Larry Wall wrote:

> My first thought is that this is such a common idiom that we ought
> to have some syntactic sugar for it:
>
> say "Received $m message\s."


I don't think that a feature like this should be in the core 
language; it is too complicated, and it is an open-ended problem.


A better use of this discussion is perhaps to determine whether any 
more basic core features would need updating in order to support a 
separate extension module to more easily provide the feature that was 
being discussed.


-- Darren Duncan


Re: pluralization idea that keeps bugging me

2008-01-26 Thread Patrick R. Michaud
On Sat, Jan 26, 2008 at 08:58:43AM -0800, Larry Wall wrote:
> After a recent exchange on PerlMonks about join, I've been thinking
> about the problem of pluralization in interpolated strings, where we
> get things like:
>
> say "Received $m message{ 1==$m ?? '' !! 's' }."
>
> My first thought is that this is such a common idiom that we ought
> to have some syntactic sugar for it:
>
> say "Received $m message\s."
>
> [...]
>
> Any other cute ideas?

FWIW, this sounds to me a lot like a special quoting operator or
adverbial form.

say qq:pluralized "Received $m message\s."

Pm


Re: pluralization idea that keeps bugging me

2008-01-26 Thread Gianni Ceccarelli
On 2008-01-26 Larry Wall [EMAIL PROTECTED] wrote:
> Last night I got a message entitled: "yum: 1 Updates Available".
> [snip a lot]
> I think that probably handles most of the Indo-European cases, and
> anything more complicated can revert to explicit code.  (Or go through
> a localization dictionary...)

Please don't put this in the language. The problem is harder than it
seems (there are European languages that pluralize differently on $X %
10, IIRC; 0 is singular or plural depending on the language, etc etc).

Look at the documentation of GNU gettext, or the translation
guidelines for KDE, to get the whole mess.

We already have Locale::Maketext. To get the whole magical
interpolation, we'd just have to define a suitable quoting construct,
right?

I know Perl is not minimal, but sometimes I feel that it will end up
being maximal... and the more you put in the core, the less
flexibility you get in the long term.

-- 
Dakkar - Mobilis in mobile
GPG public key fingerprint = A071 E618 DD2C 5901 9574
 6FE2 40EA 9883 7519 3F88
key id = 0x75193F88

printk("%s: Boo!\n", dev->name);
linux-2.6.19/drivers/net/depca.c




Re: pluralization idea that keeps bugging me

2008-01-26 Thread Richard Hainsworth
It's only English-centric if the idea is restricted to plurals, because it 
is only for plurals that English words are mutated by grammar rules.


In other languages, words are mutated by other factors, such as the 
gender of the word, the case, and the number.


The problem can be quite difficult, say in Russian. Suppose you want to 
say something like "Respected <customer name>" and interpolate the customer 
name from a database. In English, it's a doddle. But in Russian, all 
adjectives (e.g. 'respected') have both masculine and feminine forms, so the 
gender of the customer has to be determined in order to interpolate correctly.


And for plurals, some languages have different forms for one, two and 
many. In Russian, the noun after the number has one form for 1 (nominative 
singular), another form (genitive singular) for numbers 2 to 4, and then 
a third form (genitive plural) for 5 and above. So, a simple plural hook 
is insufficient.
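
To make the three-way split concrete, here is a small sketch in Python rather than Perl 6. It follows the plural formula the GNU gettext manual gives for Russian, which refines the rule above: the choice actually cycles on the last digit (21 takes the same form as 1, 22 the same as 2), with 11 to 14 as exceptions. The function names and the example noun are only illustrative.

```python
def russian_plural_index(n):
    """Return 0, 1 or 2: which of the three noun forms goes with the number n."""
    if n % 10 == 1 and n % 100 != 11:
        return 0  # 1, 21, 31, ...: nominative singular
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1  # 2-4, 22-24, ...: genitive singular
    return 2      # 0, 5-20, 25-30, ...: genitive plural

def count_messages(n):
    """Format n with the matching form of the Russian word for 'message'."""
    forms = ("сообщение", "сообщения", "сообщений")
    return f"{n} {forms[russian_plural_index(n)]}"

for n in (1, 2, 5, 11, 21, 22):
    print(count_messages(n))
```

A hook that only distinguishes "one" from "not one" cannot express this, which is why gettext lets each translation supply both the number of forms and the selecting expression.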


Then take Welsh: its words mutate with prefixes as well as suffixes, 
depending on context.


Whilst it would be nice for there to be a neat syntax for such things 
(thus avoiding English-centricity), the complexities of all languages 
might be too burdensome for core perl6.


Amir E. Aharoni wrote:
> On 26/01/2008, Larry Wall [EMAIL PROTECTED] wrote:
> > After a recent exchange on PerlMonks about join, I've been thinking
> > about the problem of pluralization in interpolated strings, where we
> > get things like:
> >
> > say "Received $m message{ 1==$m ?? '' !! 's' }."
> >
> > ...
> >
> > Any other cute ideas?
>
> No matter what you do it will remain too English-centric. It might
> work for Catalan, too. But it will remain totally useless for Arabic
> or Chinese.
>
> In any case, I don't understand why this should be in the core language at all.

  


Re: pluralization idea that keeps bugging me

2008-01-26 Thread Jonathan Lang
Gianni Ceccarelli wrote:
> Please don't put this in the language. The problem is harder than it
> seems (there are European languages that pluralize differently on $X %
> 10, IIRC; 0 is singular or plural depending on the language, etc etc).

-snip-

> I know Perl is not minimal, but sometimes I feel that it will end up
> being maximal... and the more you put in the core, the less
> flexibility you get in the long term.

This _does_ appear to be something more suitable for a Locale::
module.  I just wonder if there are enough hooks in the core to allow
for an appropriately brief syntax to be introduced in a module: can
one roll one's own string interpolations as things stand?  E.g., is
there a way to add meaning to backslashed characters in a string that
would normally lack meaning?

Do we have the tools to build "$m tool\s"?

-- 
Jonathan Dataweaver Lang


Re: pluralization idea that keeps bugging me

2008-01-26 Thread jesse



On Sat, Jan 26, 2008 at 08:58:43AM -0800, Larry Wall wrote:
> Last night I got a message entitled: "yum: 1 Updates Available".
> Of course, that's probably just a Python programmer giving up on doing
> the right thing, but we see this sort of bletcherousness all the time.
>
> Any other cute ideas?
 

It's worth reading the perldoc for Locale::Maketext and
Locale::Maketext::TPJ13.  Sean Burke did some truly excellent work
explaining a lot of the pitfalls here. Sean built us the only solution I've
yet seen that gets pluralization reasonably ok in languages with
non-English-like pluralization rules without making me want to just give
up and write "Updates found: 1" ;)
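
For readers who haven't seen it, the flavour of Maketext's bracket notation can be imitated in a few lines. The sketch below is in Python, not Perl, and is not the Locale::Maketext API: the functions quant and maketext here are stand-ins that only mimic the English-style quant behaviour described in its perldoc (number plus singular or plural, with an optional special phrase for zero).

```python
import re

def quant(n, singular, plural=None, zero=None):
    """English-style quantifier: '<n> <word>', or the zero phrase if given."""
    if n == 0 and zero is not None:
        return zero
    if plural is None:
        plural = singular + "s"  # crude default, fine for regular nouns only
    return f"{n} {singular if n == 1 else plural}"

def maketext(template, *args):
    """Expand [quant,_N,singular,plural,zero] groups using positional args."""
    def expand(match):
        fields = match.group(1).split(",")
        if fields[0] != "quant":
            raise ValueError(f"unsupported bracket method: {fields[0]}")
        n = args[int(fields[1][1:]) - 1]  # "_1" refers to the first argument
        return quant(n, *fields[2:])
    return re.sub(r"\[([^\]]*)\]", expand, template)

template = "You have [quant,_1,new message,new messages,no new messages]."
for n in (0, 1, 3):
    print(maketext(template, n))
```

The point of the real design is that each language's lexicon supplies its own whole-sentence template and its own quantifier logic, so nothing in the calling code assumes "singular versus plural".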

-j


Re: pluralization idea that keeps bugging me

2008-01-26 Thread Jonathan Lang
Yuval Kogman wrote:
> You can subclass the grammar and change everything.
>
> Theoretically that's a yes =)

Right.  One last question: is this (i.e., extending a string's
grammar) a "keep simple things simple" thing, or a "keep difficult
things doable" thing?

-- 
Jonathan Dataweaver Lang


Re: pluralization idea that keeps bugging me

2008-01-26 Thread Yuval Kogman
On Sat, Jan 26, 2008 at 18:43:50 -0800, Jonathan Lang wrote:

> Right.  One last question: is this (i.e., extending a string's
> grammar) a "keep simple things simple" thing, or a "keep difficult
> things doable" thing?

I'm going to guess somewhere in between.

It should be about the same level of complexity as Filter::Simple,
except with much finer control and more correctness.

I'm not the best person to answer this though.

-- 
  Yuval Kogman [EMAIL PROTECTED]
http://nothingmuch.woobling.org  0xEBD27418





Re: pluralization idea that keeps bugging me

2008-01-26 Thread Yuval Kogman
On Sat, Jan 26, 2008 at 18:12:17 -0800, Jonathan Lang wrote:

> This _does_ appear to be something more suitable for a Locale::
> module.  I just wonder if there are enough hooks in the core to allow
> for an appropriately brief syntax to be introduced in a module: can
> one roll one's own string interpolations as things stand?  E.g., is
> there a way to add meaning to backslashed characters in a string that
> would normally lack meaning?

You can subclass the grammar and change everything.

Theoretically that's a yes =)

-- 
  Yuval Kogman [EMAIL PROTECTED]
http://nothingmuch.woobling.org  0xEBD27418


