Re: threads?

2010-10-12 Thread Karl Brodowsky
I agree that threads are generally a difficult issue to cope with.  What
is worse, there are a lot of Java developers who tell us that it is not
difficult for them, but in the end the software fails on the production
system, for example because the load is different than on the test
system, causing different threads to be slowed down to different extents,
etc.  So people who are having difficulties with multithreading still use
it a lot and don't admit the difficulties, which might not even show up
during testing...

That said, I have seen software that uses multithreading heavily and
works well.

On the other hand I think that there are certain tasks that need some
kind of parallelism, either to make use of parallel CPU infrastructure or
to implement patterns that are more easily expressed using something like
multithreading.
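
For illustration, a minimal sketch of the CPU-parallelism case in
present-day Raku (the Promise-based start/await API arrived years after
this post was written, so this shows how things eventually turned out,
not anything that existed at the time):

    # Each start block runs on the thread pool and yields a Promise.
    my @promises = (1..8).map: -> $n {
        start { $n ** 2 }              # some CPU-bound work per task
    };
    say await @promises;               # (1 4 9 16 25 36 49 64)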

I think that running several processes instead of several threads is an
approach that can be considered in some cases, but it does come with a
performance price tag that might not be justified in all situations.

Maybe the actor model from Scala is worth looking at; at least the Scala
guys claim that it solves the issue, but I don't know whether that
concept can easily be adapted for Perl 6.
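
For reference, a minimal actor-style sketch in present-day Raku (the
names, like $inbox, are mine, and the Channel/start API did not exist
when this was written).  The actor owns its state exclusively and the
outside world talks to it only through messages:

    my $inbox = Channel.new;

    # The actor: a single task that is the only owner of $total.
    my $actor = start {
        my $total = 0;
        for $inbox.list -> $msg {   # runs until the channel is closed
            $total += $msg;
        }
        $total;
    }

    $inbox.send($_) for 1..10;      # others never touch $total directly
    $inbox.close;
    say await $actor;               # 55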

Best regards,

Karl



Re: 1.23 becomes Rat

2009-08-29 Thread Karl Brodowsky

James Cloos wrote:

If so, please use something compatible with IEEE 754 decimal floats, so
that they can be used when running on hardware which supports them.

Even w/o such hardware, gcc (at least) has support for software
emulation of _Decimal32 and _Decimal64 (and _Decimal128?).

I think there are different ways to go.  One is the traditional
fixed-length numbers like float32, float64, int16, int32, int64, etc.
that are quite common in most programming languages.  But especially for
integers, where it makes the most sense, it has become popular to have
arbitrary-length integers.

Using something like decimal32 or decimal64 fits into this fixed-length
world and is no doubt quite useful for many things.  BigDecimal in Java
and LongDecimal for Ruby go the other way: they basically combine an
arbitrary-length integer with some additional information (the decimal
scale) to make a LongDecimal or BigDecimal, so they are not compatible
with IEEE 754, but they allow the integer part to grow to arbitrary size.
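
To illustrate the idea, a toy sketch in Raku (ToyDecimal is a made-up
name for illustration, not any proposed Perl 6 type): the value is an
arbitrary-length integer plus a decimal scale, i.e.
value = unscaled / 10 ** scale:

    class ToyDecimal {
        has Int  $.unscaled;   # arbitrary-precision integer
        has UInt $.scale;      # digits after the decimal point
        method value { $!unscaled.FatRat / 10 ** $!scale }
        method mul(ToyDecimal $o) {
            ToyDecimal.new(unscaled => $!unscaled * $o.unscaled,
                           scale    => $!scale + $o.scale);
        }
    }

    my $x = ToyDecimal.new(unscaled => 123, scale => 2);  # 1.23
    say $x.value;          # 1.23
    say $x.mul($x).value;  # 1.5129, exact, no IEEE 754 involved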

Both ways are possible, and usually one would expect to find them in
add-on libraries (Perl modules in our context here).

Making such a type the default would require integrating it into the
language.  But there might be some concerns about going in this direction
instead of using good old float.  One side of the story is that float is
way better in terms of performance, because it has been implemented in
common hardware for about 20 years now (and decent hardware
implementations have been around much longer than that), so the hardware
implementation should be mature and well optimized.  On the other hand,
people are used to the behaviour of float and use decimal types where
some kind of explicit control is needed, as in finance applications.
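
The behavioural difference is easy to demonstrate with how things
eventually came out in Raku, where decimal literals are exact Rats and
scientific-notation literals are Nums (IEEE 754 binary doubles):

    say 0.1 + 0.2 == 0.3;         # True: decimal literals are exact Rats
    say 0.1e0 + 0.2e0 == 0.3e0;   # False: Nums are binary floats
    say (0.1e0 + 0.2e0).fmt('%.17f');   # 0.30000000000000004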

Best regards,

Karl




1.23 becomes Rat (Re: Synopsis 02: Range objects)

2009-08-27 Thread Karl Brodowsky

Larry Wall wrote:

Another note, it's likely that numeric literals such as 1.23 will turn
into Rats rather than Nums, at least up to some precision that is
pragmatically determined.
  
Doing these as Rat would avoid a lot of the precision issues that
floating point arithmetic has all the time.  It will actually work
perfectly well with addition, because the denominator is always a small
power of 10, so that is true for the sum as well.  Multiplying might be
an issue, because the denominator becomes a large power of 10, but I
think that can be handled pretty well, unless multiplication is really
performed to an extent that the result uses significant amounts of
memory.  But as soon as division occurs, these rational numbers tend to
develop denominators that are not powers of 10 any more.  Combining this
with some multiplications and additions may result in huge numerators
and denominators that are somewhat expensive to handle.
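
Present-day Raku makes this growth easy to watch: .nude returns the
reduced numerator and denominator of a Rat.

    say (0.1 + 0.2).nude;     # (3 10): denominator is a power of ten
    say (1.23 * 4.56).nude;   # (7011 1250): still divides a power of ten
    say (1 / 3).nude;         # (1 3): division escapes powers of ten
    say (1/3 + 1/7).nude;     # (10 21): mixed denominators start to grow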


So what would happen after such a long calculation?

- would the Rats somehow know that they are all derived from Rats that
were just used instead of floats because of being within a pragmatically
determined precision?  Then the result of * or / could just as
pragmatically become a floating point number.
- would the Rats grow really huge numerators and denominators, making it
expensive to work with them?
- would the first division have to deal with the conversion from Rat to
floating point?
- or should there be a new numeric type similar to Rat that always has a
power of 10 as its denominator (like BigDecimal in Java or LongDecimal
for Ruby or decimal in C# or so)?

Even in this last case the division is not really easy to define,
because the exact result cannot generally be expressed with a
denominator that is a power of 10.

This can be resolved by:
- requiring additional rounding information (so writing something like
a.divide(b, 10, ROUND_UP) instead of a/b; see the sketch after this list)
- implicitly finding the number of significant digits by using partial
derivatives of f(x,y)=x/y
- expressing the result as some kind of rational number
- expressing the result as some kind of floating point number.
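
A minimal Raku sketch of the first option (div-dec is a hypothetical
helper; it uses round-to-nearest where a real API, like Java's
a.divide(b, 10, ROUND_UP), would let the caller pick the rounding mode):

    # Divide and keep exactly $digits decimal places.
    sub div-dec(Real $a, Real $b, UInt $digits) {
        (($a / $b) * 10 ** $digits).round / 10 ** $digits;
    }

    say div-dec(1.0, 3.0, 10);   # 0.3333333333, denominator stays 10**10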


Regards

Karl




Re: Synopsis 02: Range objects

2009-08-26 Thread Karl Brodowsky

Michael Zedeler wrote:
Well... maybe. How do you specify the intended precision, then? If I 
want the values from 1 to 2 with step size 0.01, I guess that writing


1.00 .. 2.00

won't be sufficient. Trying to work out the step size by looking at 
the precision of things that are doubles or floats doesn't really sound 
feasible, since there are a lot of holes in the actual representation, 
so 1.0001 may become 1.0, yielding very different results.
That is a general problem of floats.  We tend to write them in decimal 
notation, but internally they use a representation which is binary.  And 
it is absolutely not obvious what the precision of 1.0001 might be.
There could be a data type like LongDecimal in Ruby or BigDecimal in 
Java, which actually has knowledge of its precision and whose numbers 
are fractions with a power of 10 as the denominator.  But for floats I 
would only see the interval as reasonably clear.  Even a step of 1 comes 
with problems, because an increment of 1 does not have any effect on 
floating point numbers like 1.03e300 or so.
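
For what it's worth, this is how it eventually came out in Raku:
spelling out the step as a second term gives an exact Rat sequence,
while the float problem with huge numbers is easy to reproduce:

    say (1.00, 1.01 ... 2.00).elems;   # 101: exact, 0.01 is a Rat step
    say 1.03e300 + 1 == 1.03e300;      # True: adding 1 cannot change a
                                       # Num of this magnitude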


Regards,

Karl




Unicode support in Emacs

2004-03-23 Thread Karl Brodowsky
Larry Wall wrote:

 Well, it's too bad the emacs developers are lagging behind the vim
 developers in this area, but it might (or might not) have something to
 do with the fact that certain obnoxious people like me were bugging
 the vim folks incessantly to get their Unicode story straight for a
 couple of years before they actually did it.  :-)
About 10 years ago I wrote an email to Richard Stallman, who was at that
time the maintainer of Emacs, and asked him about Unicode.  But at that
time he had already thought of his own thing, slightly different from
Unicode, maybe slightly smarter...  And he wrote me that this would be
the way to go. :-(
I get the impression that Unicode support has kind of been layered on top
of this stuff, and I must admit that the way I currently use Unicode is
to edit text with \ucafe\ubabe-style replacements and run Perl scripts to
convert, for example, my private HTML format into WWW HTML.  So we should
remind the Emacs developers that the Unicode support is at least pretty
hard to handle for slightly-below-average users like me...
 I would in particular like to thank Bram Moolenaar for not writing us
 out of the book of life for all our whining.  The Unicode support in
 vim has been rock solid, and we are grateful.
Maybe I'll start using that one day. ;-)

Sorry for the off-topic post, but I think it has some importance for Perl
in the sense that it is good to go the right way and not to wait until
Emacs supports it out of the box without 1435 lines of Lisp.
Best regards,

Karl

P.S. Don't get me wrong:  RMS is a good guy and he has done a lot of useful and
good stuff, part of which many of us are using all the time.
P.P.S. I don't think that RMS is the current maintainer of Emacs.




Re: zip

2004-03-21 Thread Karl Brodowsky
Goplat wrote:

I have quite a few fonts, and the only one I can find where | is a broken
bar is Terminal, a font for DOS programs that uses the cp437 charset,
which is incompatible with latin1 (« and » are AE and AF instead of AB
and BB), and it doesn't even have a ¦.  So it doesn't seem like a problem.
It is still easy to confuse them, but why worry?  Larry's suggestion to
use ¥ (the JPY sign) looks much better anyway.
I think it is always important to remember that Perl 6 must not only be
writable but also readable.  Too many equivalent ways to write the same
thing mean that the reader has to learn more.  I think that Perl is very
strong on the writing side: it is relatively easy and efficient to write
in Perl, but the reading part is more of a challenge.  That 1 and l look
so similar is just due to the stupid convention of using fonts for source
code that make the two look very similar.  But why add another problem
with | and ¦, which do look similar if the resolution and size are low,
if the ¥ can do the same thing in a better way?  Introducing a z as a
second alternative to ¥ might also cost something in terms of learning to
read Perl.  The infix operators that consist of letters are something
that has to be learned very well in order to read Perl sources written by
others, so it might be good not to have too many of them.
Best regards,

Karl



broken bar (Re: Some questions about operators.)

2004-03-20 Thread Karl Brodowsky
Dear All,

I think that the broken bar is dangerous.  Why?
It can be mixed up with the normal bar |.  In some fonts it looks the
same, and to many people it is not 100% clear which of the two bars is
the broken one and which is not.
Of course it is possible to avoid this, but that does not solve the
problem of reading Perl code that someone else has written.  The «» are
not such a problem, but I would think it would still be worth considering
avoiding the broken bar.
Sorry if this discussion has already been had 1000 times.

Best regards,

Karl



Re: Funky «vector» operator

2004-03-19 Thread Karl Brodowsky
Dear All,

just for the Emacs users among you:
C-x 8 < yields « and C-x 8 > yields ».
For the Unix/Linux users it is possible to set up or modify the keyboard
layout using xmodmap.
Actually there are so many combinations of OS, keyboard layout, tools,
editors and Unicode encodings that this could become quite an FAQ...
Btw. since UTF-8 is favored as the default encoding for Perl 6 source
code, it is not enough to type something that displays as « or ».  Your
editor has to support UTF-8, or you need conversion tools to and from
something that your editor supports.
Best regards,

Karl



Re: Latin-1-characters

2004-03-16 Thread Karl Brodowsky
Dear All,

from what has been written by others, there are enough useful encodings
other than UTF-8, UTF-16/UCS-2 and UCS-4 that support efficient storage
even for Unicode files whose contents are Greek, Cyrillic, etc.  Sorry
for the confusion caused by the fact that I was not aware of these.

utf-8 is fine for languages like German, Polish, Norwegian, Spanish,
French,... which have >= 90% of the text in 7-bit ASCII characters.

Add Perl to that list, by the way.  I rather strongly suspect that most 
Perl code will consist mostly of 7-bit characters.  (Even Perl code 
written by traditional-Chinese speakers (and I pick on traditional 
Chinese only because it has a very large character repertoire -- one of 
the reasons there's a simplified variant).)
My experience would be that Perl programs do contain local language and
thus local characters, which might be outside ISO 646 IRV (7-bit ASCII),
in string literals and comments.
By the way, there is (should be) nothing that is encodable in a 
non-Unicode character set that is not encodable in (any encoding of) 
Unicode.  That's where the uni bit comes from.  If there is, it 
means that Unicode is not fulfilling its design goals.
Yes, we can consider any file to be Unicode with some encoding.  That is
how the Java guys do it, with the restriction that they don't easily let
you choose anything other than Latin-1 + \ucafe stuff for non-Latin-1
characters (or maybe I never bothered, because Latin-1/ISO-8859-1 works
fine for me).
IMHO the OS should provide a standard way to specify such a charset as a
file attribute, but usually it does not and it won't in the future,
unless the file comes through the network and has a MIME header.

I think the answer is multi-fold.

0) Auto-detect the encoding in the compiler, if a U+FEFF signature, or a 
#! signature, is found at the beginning of the input.  (If there is a 
FEFF signature, it should get thrown away after it is recognized.  It 
may be possible to recognize on package or module as well, and 
possibly even on #.)
With FFFE and FEFF this seems obvious.  In the case of #! it would not be
clear to me whether this defaults to ISO-8859-1 (Latin-1) or to UTF-8.
See HTML vs. XHTML as an example where the default has been changed.
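
A minimal sketch of step 0) in present-day Raku terms (sniff-encoding is
a hypothetical helper for illustration, not the compiler's actual
mechanism):

    sub sniff-encoding(IO::Path $file) {
        my $b = $file.open(:bin).read(3);    # first bytes of the file
        return 'utf-16be' if $b.elems >= 2 and $b[0,1] eqv (0xFE, 0xFF);
        return 'utf-16le' if $b.elems >= 2 and $b[0,1] eqv (0xFF, 0xFE);
        return 'utf-8'
            if $b.elems >= 3 and $b[0,1,2] eqv (0xEF, 0xBB, 0xBF);
        'unknown';   # no BOM: fall back to #! sniffing or a default
    }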
1) Believe what the underlying FS/OS/transport tells us.  (This is likely 
to be a constant for many OSes, possibly selectable at the compiler's 
compile time.  It's the encoding on the end of the content-type for HTTP 
and other MIME-based transports.)
I understand that the FS/OS does not really tell us, at least neither for
Unix/Linux nor for NT/Windows.  Relying on environment variables or
locale settings looks dangerous to me, because it breaks programs that
worked fine in environment A when you run them elsewhere, or it imposes
restrictions on how to set up these environment variables.  It could be
OK for one-liners run from the command line, like this:

ls *.JPG | perl -p -e 's/(.*\.)JPG$/mv $1JPG $1jpg/;' | grep mv | sh

It would even work fine for shell scripts, because they would have to set
the appropriate environment variables for themselves, thus disregarding
any user settings.  Probably something additional like
PERL_DEFAULT_ENCODING would be needed, because otherwise we might get
clashes with (other) regular uses of locale settings.  In cases where the
OS or FS really can provide the encoding on a per-file basis as a file
attribute, or where the file comes from the network with a MIME header,
your suggestion should be perfect.
2) Support a use encoding 'foo' similar to that in recent perl5s: it 
states the encoding that the file it appears in is written in.

Yes, that looks like the right way to do it.  And it eliminates part of
the concerns about 1), if it is assumed that this use encoding line is
more or less required in every non-trivial Perl source.  Btw. this is the
encoding of the Perl source code itself; files processed by Perl I/O
could of course have any encoding.
(the higher-numbered sources of encoding information override the earlier 
ones.)

Yes, of course.  0) and 2) are obvious, but 1) might need to be dealt
with carefully.

Best regards,

Karl



Re: Latin-1-characters

2004-03-15 Thread Karl Brodowsky
Mark J. Reed wrote:

Unicode per se doesn't do anything to file sizes; it's all in how you
encode it.
Yes.  And basically there are common ways to encode it: UTF-8 and UTF-16
(or similar variants requiring >= 2 bytes per character).
The UTF-8 encoding is not so attractive in locales that make
heavy use of characters which require several bytes to encode therein, or
relatively little use of characters in the ASCII range;
utf-8 is fine for languages like German, Polish, Norwegian, Spanish,
French,... which have >= 90% of the text in 7-bit ASCII characters.
but that's why
there are other encoding schemes like SCSU which get you Unicode
compatibility while not taking up much more space than the locale's native 
charset.
These make sense for languages like Japanese, Korean, Chinese etc., where
you need more than one byte per character anyway.  But Russian, Greek,
Hebrew, Arabic, Armenian and Georgian would work fine with one byte per
character, and the kinds of encoding that I can think of make these two
bytes per character.  So for these I see file sizes doubled.  Or do I
miss something?  Anyway, it will be necessary to specify the encoding of
Unicode in some way, which could possibly even allow specifying some
non-Unicode charsets.
IMHO the OS should provide a standard way to specify such a charset as a
file attribute, but usually it does not and it won't in the future,
unless the file comes through the network and has a MIME header.
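
The doubling is easy to quantify; in present-day Raku terms (any
language with an encode call would show the same thing):

    say 'привет'.chars;                  # 6 (six Cyrillic letters)
    say 'привет'.encode('utf-8').bytes;  # 12 (two bytes per letter in
                                         # UTF-8, vs. one in KOI8-R)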
Best regards,

Karl



Latin-1-characters

2004-03-12 Thread Karl Brodowsky
And I do think people would rebel at using Latin-1 for that one.
I get enough grief for « and » :-)
I can imagine that these cause some trouble for people using a charset
other than ISO-8859-1 (Latin-1) that works well with 8 bits, like Greek,
Arabic, Cyrillic and Hebrew.
For these guys Unicode is not so attractive, because it kind of doubles
the size of their files, so I would assume that they tend to do a lot of
stuff with their KOI-8 or with some ISO-8859-x not containing the desired
character.  For « it might not be such a problem, because << would work
instead.
Maybe this issue could (will?) be addressed by declaring the charset in
the source and using something like (or better than) \u00AB for
characters that this charset does not have, applying a charset conversion
to Unicode while parsing the source.  This looks somewhat cleaner to me
than just pretending that a source file written in ISO-8859-7 (Greek)
were ISO-8859-1 (Latin-1), relying on the assumption that the two
characters we use above 0x80 happen to be in the same positions, 0xAB and
0xBB.
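
This is roughly what Perl 6 ended up doing: escapes work regardless of
the source charset (Raku spelling shown, where \x[..] replaces Perl 5's
\x{..}):

    say "\x[AB]foo\x[BB]";   # «foo», even from a pure-ASCII source file
    say "\c[LEFT-POINTING DOUBLE ANGLE QUOTATION MARK]";   # « by name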
Sorry if that is an old story...

Best regards,

Karl