date:20170718

Re: Users of namedtuple: do you use the _source attribute?

2017-07-18 Thread Thomas Nyberg

On 07/19/2017 05:12 AM, Steve D'Aprano wrote:
> On Wed, 19 Jul 2017 08:39 am, Gregory Ewing wrote:
> Um... well, people want to do all sorts of wild and wacky things... but why
> would you define a named tuple with *private* fields? Especially since that
> privateness isn't enforced when you access the items by position.

Maybe the user wants to match a naming convention that already exists? I
am doing this in code I'm writing at the moment. I'm not using
namedtuples, but if I were it would be nice if I could match the
conventions from earlier.

> In any case, the namedtuple API prohibits that, so it isn't an option.

Of course the API could have been different.

I'm not saying I think that private fields should be allowed, but there
certainly are valid use cases.

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Gregory Ewing


Chris Angelico wrote:

> Once you NFC or NFD normalize both strings, identical strings will
> generally have identical codepoints... You should then be able to use
> normal regular expressions to match correctly.


Except that if you want to match a set of characters,
you can't reliably use [...], you would have to write
them out as alternatives in case some of them take
up more than one code point.
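
A minimal sketch of that point, using only the stdlib re module. The sample character is "q" followed by U+0308 COMBINING DIAERESIS, borrowed from elsewhere in this thread because it has no precomposed form; the variable names are made up for illustration.

```python
import re

# "q" + COMBINING DIAERESIS: one visible character, two code points.
Q_DIAERESIS = "q\u0308"

# Inside [...] every code point is a separate alternative, so the class
# consumes only the "q" and "$" then fails on the leftover combining mark.
in_class = re.match("^[q\u0308a]$", Q_DIAERESIS)

# Written out as alternatives, the two-code-point sequence matches as a unit.
as_alternatives = re.match("^(?:q\u0308|a)$", Q_DIAERESIS)

print(in_class)
print(as_alternatives)
```

The alternation form gets unwieldy quickly for large sets, which is presumably why a cluster-aware character class keeps coming up in this thread.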

--
Greg

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Steve D'Aprano

On Mon, 17 Jul 2017 04:12 am, Ben Finney wrote:

> Steven D'Aprano  writes:
> 
>> On Sun, 16 Jul 2017 12:33:10 +1000, Ben Finney wrote:
>>
>> > And yet the ASCII and Unicode standard says code point 0x0A (U+000A
>> > LINE FEED) is a character, by definition.
>> [...]
>> > > Is an acute accent a character?
>> > 
>> > Yes, according to Unicode. ‘´’ (U+0301 ACUTE ACCENT) is a character.
>>
>> Do you have references for those claims?
> 
> The Unicode Standard <http://www.unicode.org/versions/Unicode10.0.0/>
> frequently uses “character” as the unit of semantic value that Unicode
> deals in. See the “Contents” table for many references.
> 
> In §2.2 under the sub-heading “Characters, Not Glyphs” it defines the
> term, and thereafter uses “character” in a way that includes all such
> units, even formatting codes.

Thanks for that. TIL something new.

I'm not sure whether I had misunderstood, or whether the standard has changed,
but I recall them previously being very reticent about giving a formal
definition for the term character. (Or possibly a combination of both.)

Even now, they do seem to prefer to use "character" in the sense of an abstract
character, not necessarily something that ordinary users of language will
recognise as a character or letter. E.g. they include control codes, variation
codes, diacritic marks on their own with no base, and more.

Unicode defines exactly 66 noncharacters:

http://www.unicode.org/faq/private_use.html#noncharacters
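
As a quick sanity check on that count, here is a minimal sketch. It assumes the usual description of the noncharacters: the contiguous range U+FDD0..U+FDEF plus the last two code points of each of the 17 planes. The helper name is_noncharacter is made up for illustration.

```python
def is_noncharacter(cp):
    """Return True if code point cp is one of the Unicode noncharacters."""
    # 32 code points in the Arabic Presentation Forms-A block.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # The last two code points of every plane: U+nFFFE and U+nFFFF.
    return (cp & 0xFFFE) == 0xFFFE

# Count over the whole code space, U+0000..U+10FFFF.
count = sum(1 for cp in range(0x110000) if is_noncharacter(cp))
print(count)
```

That is 32 from the contiguous range plus 2 per plane times 17 planes, which gives 34 and a total of 66.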

I found the table on page 30 here:

http://www.unicode.org/versions/Unicode10.0.0/ch02.pdf#G25564

very useful. That helped to clarify my thinking.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.


Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Steve D'Aprano

On Wed, 19 Jul 2017 10:34 am, Mikhail V wrote:

> Ok, in this narrow context I can also agree.
> But in slightly wider context that phrase may sound almost like:
> "neither geometrical shape is better than the other as a basis
> for a wheel. If you have polygonal wheels, they are still called wheels."

I'm not talking about wheels, I'm talking about writing systems which are
fundamentally collections of arbitrary shapes. There's nothing about the sound
of "f" that looks like the letter "f".

But since you mentioned non-circular wheels, such things do exist, and are still
called "wheels" (or "gears", which is a kind of specialised wheel).

https://eric.ed.gov/?id=EJ937593

https://en.wikipedia.org/wiki/Non-circular_gear

https://en.wikipedia.org/wiki/Square_wheel

https://www.youtube.com/watch?v=vk7s4PfvCZg

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.


Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Steve D'Aprano

On Wed, 19 Jul 2017 10:08 am, Ben Finney wrote:

> Gregory Ewing  writes:
> 
>> The term "emoji" is becoming rather strained these days.
>> The idea of "woman" and "personal computer" being emotions
>> is an interesting one...
> 
> I think of “emoji” as “not actually a character in any system anyone
> would use for writing anything, but somehow gets to squat in the Unicode
> space”.

Blame the Japanese mobile phone manufacturers. They want to include emoji in
their SMSes and phone chat software, and have the money to become full members
of the Unicode Consortium.

I suppose that having a standard for emoji is good. I'm not convinced that
Unicode should be that standard, but on the other hand if we agree that Unicode
should support hieroglyphics and pictographs, well, that's exactly what emoji
are.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.


Re: Users of namedtuple: do you use the _source attribute?

2017-07-18 Thread Steve D'Aprano

On Wed, 19 Jul 2017 08:39 am, Gregory Ewing wrote:

> Steve D'Aprano wrote:
>> "source_" is already a public name, which means that users could want to
>> create fields with that name for some reason,
> 
> They could equally well want to define their own private
> field called "_source".

Um... well, people want to do all sorts of wild and wacky things... but why
would you define a named tuple with *private* fields? Especially since that
privateness isn't enforced when you access the items by position.

In any case, the namedtuple API prohibits that, so it isn't an option.
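
A short sketch of what that prohibition looks like in practice. The type and field names here are made up for illustration; the behaviour shown is that of collections.namedtuple in the stdlib.

```python
from collections import namedtuple

error = None
try:
    # Field names starting with an underscore are rejected outright.
    namedtuple("Record", ["name", "_source_id"])
except ValueError as exc:
    error = str(exc)
print(error)

# With rename=True the invalid name is replaced by a positional name
# instead of raising, so the "private" spelling is still not available.
Renamed = namedtuple("Renamed", ["name", "_source_id"], rename=True)
print(Renamed._fields)
```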

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.


Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Steve D'Aprano

On Wed, 19 Jul 2017 12:10 am, Rustom Mody wrote:

> On Monday, July 17, 2017 at 10:14:00 PM UTC+5:30, Rhodri James wrote:
>> On 17/07/17 05:10, Rustom Mody wrote:
>> > Hint1: Ask your grandmother whether unicode's notion of character makes
>> > sense. Ask 10 gmas from 10 language-L's
>> > Hint2: When in doubt gma usually is right
>> 
>> "For every complex problem there is an answer that is clear, simple and
>> wrong." (H.L. Mencken).
> 
> Great men galore with great quotes galore²
> Here are 3 — take your pick:
> 
> Einstein:
> If you can't explain something to a six-year-old, you really don't understand
> it yourself.
> 
> [Commonly attributed to Einstein
> More likely Feynman, Rutherford, de Broglie or some other notable physicist
> https://skeptics.stackexchange.com/questions/8742/did-einstein-say-if-you-cant-explain-it-simply-you-dont-understand-it-well-en
> ]

More likely none of the above, but invented by some non-expert who wanted to put
down the value of expert knowledge, and thought a bogus argument by authority
was the best way to do it. (Einstein said it, therefore it must be right!)

Think about it: it simply is nonsense. If this six year old test was valid, that
would imply that all fields of knowledge are capable of being taught to the
average six year old. Yeah good luck with that.

But even if we accept this, it doesn't contradict the Mencken quote. I can
explain the birds and the bees to a six year old, at a level that they will
understand. That doesn't mean that (1) I am an expert on human reproduction; or
that (2) people should ask the six year old for advice about human
reproduction.

The second part is the problem. I understand how cars work, to an acceptable
degree that I could probably explain it to a six year old. But if you came to
me to ask my advice about buying a car, or repairing a car, you'll get bad
advice. I'm not an expert and I don't know enough to give *good* advice. Same
with your "grandmother" test.

Yes, I'm sure that most "grandmothers" (I know that's just shorthand for 
"regular people who aren't experts") will have an intuitive idea of what a
character is. But what on earth makes you think that intuitive idea is both
*necessary and sufficient* for programming?

> Dijkstra:
> 
> Programming languages belong to the problem set, not (as some imagine)
> to the solution set
> https://www.cs.utexas.edu/users/EWD/transcriptions/EWD04xx/EWD473.html

Relevance?

That's just the "Now you have two problems" observation, reworded for
programming languages in general rather than just regular expressions. How is
it relevant?

It seems to me that you are just tossing random quotes out in the hope that some
of them might stick. Two can play at that game:

"He who questions training only trains himself at asking questions." 
- The Sphinx

"Must … defy … laws … of … physics …"
- The Tick

"Whenever Giles sends me on a mission, he always says 'please'. And afterwards I
get a cookie."
- Buffy the Vampire Slayer

The bottom line is, your "grandma" test dismisses the value of expert domain
knowledge. As programmers, we need access to expert domain knowledge, even if
we don't hold it ourselves, we need to trust that the people who wrote our
libraries had it.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.


Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Steve D'Aprano

On Wed, 19 Jul 2017 12:29 am, Random832 wrote:

> On Sun, Jul 16, 2017, at 01:37, Steven D'Aprano wrote:
>> In a *well-designed* *bug-free* monospaced font, all code points should
>> be either zero-width or one column wide. Or two columns, if the font
>> supports East Asian fullwidth characters.
> 
> What about Emoji?
> U+1F469 WOMAN is two columns wide on its own.
> U+1F4BB PERSONAL COMPUTER is two columns wide on its own.
> U+200D ZERO WIDTH JOINER is zero columns wide on its own.


What about them? In a monospaced font, they should follow the same rules I used
above: either 0, 1 or 2 columns wide.

If any visible code point is a fraction of a column wide, it isn't usable as a
monospaced font.
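
A rough sketch of that 0/1/2 rule using the stdlib unicodedata module. It is an approximation under stated assumptions, not what any particular terminal or font does: nonspacing marks, enclosing marks and format characters are taken as zero columns, East Asian Wide and Fullwidth characters as two, and everything else as one. The function name columns is made up for illustration.

```python
import unicodedata

def columns(ch):
    """Estimate how many monospace columns a single code point occupies."""
    # Nonspacing marks, enclosing marks and format characters (which
    # include U+200D ZERO WIDTH JOINER) are treated as zero width.
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide and Fullwidth characters take two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1

for ch in ("a", "\u0301", "\u200d", "\uff21", "\U0001f469"):
    print("U+%04X" % ord(ch), columns(ch))
```

Summing columns() over a string gives only a first estimate; joined emoji sequences are rendered as one glyph by some fonts and as several by others.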


-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.


Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Steve D'Aprano

On Tue, 18 Jul 2017 11:59 pm, Chris Angelico wrote:

>> (I don't think any native English words use a double-V or double-U, but the
>> possibility exists.)
> 
> vacuum.

Nice. Also continuum and residuum.

For double V, we have savvy, skivvy, flivver (an old slang term for cars).

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.


Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Rustom Mody

On Wednesday, July 19, 2017 at 3:00:21 AM UTC+5:30, Marko Rauhamaa wrote:
> Chris Angelico :
> 
> > Let me give you one concrete example: the letter "ö". In English, it
> > is (very occasionally) used to indicate diaeresis, where a pair of
> > letters is not a double letter - for example, "coöperate". (You can
> > also hyphenate, "co-operate".) In German, it is the letter "o" with a
> > pronunciation mark (umlaut), and is considered the same letter as "o".
> > In Swedish, it is a distinct letter, alphabetized last (following z,
> > å, and ä, in that order). But in all these languages, it's represented
> > the exact same way.
> 
> The German Wikipedia entry on "ä" calls "ä" a letter ("Buchstabe"):
> 
>    Der Buchstabe Ä (kleingeschrieben ä) ist ein Buchstabe des
>    lateinischen Schriftsystems.
> 
> Furthermore, it makes a distinction between "ä" the letter and "ä" the
> "a with a diaeresis:"
> 
>    In guten Druckschriften unterscheiden sich die Umlautpunkte von den
>    zwei Punkten des Tremas: Die Umlautpunkte sind kleiner, stehen näher
>    zusammen und liegen etwas tiefer.
> 
>    In good fonts umlaut dots are different from the two dots of a
>    diaeresis: the umlaut dots are smaller and closer to each other and
>    lie a little lower. [translation mine]
> 

Very interesting!
And may I take it that the two different variants of ü, u-umlaut and
u-diaeresis, are not (yet) given a seat in Unicode?

Now compare with:
- hyphen-minus 0x2D
− minus sign 0x2212
‐ hyphen 0x2010
– en dash 0x2013
— em dash 0x2014
― horizontal bar 0x2015
… And perhaps another half-dozen
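
For reference, a small sketch that looks up the official names of the code points listed above with the stdlib unicodedata module. The list name DASH_LIKE is made up for illustration.

```python
import unicodedata

# The six code points from the list above.
DASH_LIKE = [0x2D, 0x2212, 0x2010, 0x2013, 0x2014, 0x2015]

# Map each code point to its Unicode character name.
names = {cp: unicodedata.name(chr(cp)) for cp in DASH_LIKE}

for cp, name in names.items():
    print("U+%04X %s" % (cp, name))
```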

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Steve D'Aprano

On Wed, 19 Jul 2017 12:09 am, Random832 wrote:

> On Fri, Jul 14, 2017, at 08:33, Chris Angelico wrote:
>> What do you mean about regular expressions? You can use REs with
>> normalized strings. And if you have any valid definition of "real
>> character", you can use it equally on an NFC-normalized or
>> NFD-normalized string than any other. They're just strings, you know.
> 
> I don't understand how normalization is supposed to help with this. It's
> not like there aren't valid combinations that do not have a
> corresponding single NFC codepoint (to say nothing of the situation with
> e.g. Indic languages).

Normalisation helps. Suppose you want to search for é for example, a naive
regular expression engine will only find the exact representation you or your
editor happened to use:

U+00E9 LATIN SMALL LETTER E WITH ACUTE

or 

U+0065 LATIN SMALL LETTER E + U+0301 COMBINING ACUTE ACCENT

but not both. By normalising, you ensure that both the text you are searching
and the regex you are searching for are in the same state: either composed to a
single code point U+00E9 or decomposed to two U+0065,0301 but never one in one
state and the other in the other.

For characters that don't include a canonical composition form, then there's no
problem: you will always be searching for a decomposed character using a base
character followed by combining characters, so there is no discrepancy and it
will just work.
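
A minimal sketch of that idea, assuming a plain literal search term. The helper name search_normalized is made up for illustration; unicodedata.normalize() and re.search() are the stdlib calls doing the work.

```python
import re
import unicodedata

def search_normalized(pattern, text, form="NFC"):
    """Normalise both the pattern and the text before searching."""
    pattern = unicodedata.normalize(form, pattern)
    text = unicodedata.normalize(form, text)
    return re.search(pattern, text)

composed = "caf\u00e9"      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"   # U+0065 followed by U+0301

# Without normalisation the two representations do not match each other.
naive = re.search(composed, decomposed)

# After normalising both sides to the same form, they do.
normalized = search_normalized(composed, decomposed)

print(naive)
print(normalized)
```

This only covers literal patterns. Normalising a pattern to NFD can split a character inside [...] into two code points, which is the character class problem raised elsewhere in the thread.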

> In principle probably a viable solution for regex would be to add
> character classes for base and combining characters, and then
> "[[:base:]][[:combining:]]*" can be used as a building block if
> necessary.

I don't know what that means.

Any code point (except for combining characters themselves) can be used as the
base, and the various kinds of combining characters have the Unicode category
property:

Mn (Mark, nonspacing)
Mc (Mark, spacing combining)
Me (Mark, enclosing)

If we're talking about combining accents and diacritics, the one we want is Mn.

But generally, we're not after "any old diacritic", we're after a specific one,
on a specific base.
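
For anyone who wants to experiment with the "base followed by combining characters" building block, here is a rough sketch using unicodedata.category(). It groups each mark (Mn, Mc or Me) with the code point before it. This is a simplification and not the full grapheme cluster algorithm from the Unicode annexes; the function name split_clusters is made up for illustration.

```python
import unicodedata

MARK_CATEGORIES = ("Mn", "Mc", "Me")

def split_clusters(text):
    """Split text into base characters with their trailing marks attached."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch) in MARK_CATEGORIES:
            # A combining mark: glue it onto the previous cluster.
            clusters[-1] += ch
        else:
            # Anything else starts a new cluster.
            clusters.append(ch)
    return clusters

# Which category does the combining acute accent belong to?
print(unicodedata.category("\u0301"))
print(split_clusters("e\u0301q\u0308x"))
```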

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.


Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Chris Angelico

On Wed, Jul 19, 2017 at 10:34 AM, Mikhail V  wrote:
> Ok, in this narrow context I can also agree.
> But in slightly wider context that phrase may sound almost like:
> "neither geometrical shape is better than the other as a basis
> for a wheel. If you have polygonal wheels, they are still called wheels."

I don't think he meant that. (Anyway, what shape IS a .whl file?)

ChrisA

Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Mikhail V

ChrisA wrote:
>On Wed, Jul 19, 2017 at 6:05 AM, Mikhail V  wrote:
>> On 2017-07-18, Steve D'Aprano  wrote:
>>
>>> That's neither better nor worse than the system used by English and French,
>>> where letters with diacritics are not distinct letters, but guides to
>>> pronunciation.
>>
>>>_Neither system is right or wrong, or better than the other._
>>
>>
>> If that is said just "not to hurt anybody" then it's ok.
>> Though this statement is pretty absurd, not so many
>> (intelligent) people will buy this out today.

>Let me give you one concrete example: the letter "ö". In English, it
>is (very occasionally) used to indicate diaeresis, where a pair of
>letters is not a double letter - for example, "coöperate". (You can
>also hyphenate, "co-operate".) In German, it is the letter "o" with a
>pronunciation mark (umlaut), and is considered the same letter as "o".
>In Swedish, it is a distinct letter, alphabetized last (following z,
>å, and ä, in that order). But in all these languages, it's represented
>the exact same way.
>
>Steven is pointing out that there's nothing fundamentally wrong about
>using "ö" as a unique letter, nor is there anything fundamentally
>wrong about using it as "o" with a pronunciation mark. Which I agree
>with.
>

Ok, in this narrow context I can also agree.
But in slightly wider context that phrase may sound almost like:
"neither geometrical shape is better than the other as a basis
for a wheel. If you have polygonal wheels, they are still called wheels."


Mikhail

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Ben Finney

Gregory Ewing  writes:

> The term "emoji" is becoming rather strained these days.
> The idea of "woman" and "personal computer" being emotions
> is an interesting one...

I think of “emoji” as “not actually a character in any system anyone
would use for writing anything, but somehow gets to squat in the Unicode
space”.

-- 
 \“The priesthood have, in all ancient nations, nearly |
  `\ monopolized learning.” —John Adams, _Letters to John Taylor_, |
_o__) 1814 |
Ben Finney


Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Mikhail V

Marko Rauhamaa wrote:

>What did you think of my concrete examples, then? (Say, finding
>"Alvárez" with the regular expression "Alv[aá]rez".)

I think that should match both "Alvarez" and "Alvárez" ...?
But firstly, I feel like I need to _guess_ what ideas you
are presenting. Unless I open up Vim and apply my imagination,
it is hard even to get involved in your ideas.
I wonder why it is hard to elaborate a pair
of examples like e.g. :
- now the task A (concrete task defined) is solved with the code C1
- with the new syntax/method, the same task could be solved with the code C2

Just trying to guess related tasks:
For the automation of regex search-related tasks I would make a function
which generates the RE pattern first, i.e. define tables with
"variations" for glyphs, e.g. groups={"a": "aá"} or similar.
Then I'll need some micro-syntax for the conversion,
e.g. generate_re("Alv{a}rez", groups)

Intuitively, I suppose the groupings and even
the functions hardly can be standardized in a nice manner,
since I'll need to define and redefine them all the time for various cases.
But probably there can be some generality, hard to say.
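
A minimal sketch of what such a generate_re() helper could look like. Both the function and the {name} placeholder syntax are hypothetical, taken from the description above rather than from any existing library; it assumes each group is given as a string of single-code-point variants.

```python
import re

def generate_re(template, groups):
    """Expand {name} placeholders in template into regex character classes."""
    # re.split with a capture group alternates literal text and group names.
    pieces = re.split(r"\{(\w+)\}", template)
    out = []
    for index, piece in enumerate(pieces):
        if index % 2 == 0:
            # Literal text: escape it so it matches itself.
            out.append(re.escape(piece))
        else:
            # Group name: build a character class from its variants.
            out.append("[" + re.escape(groups[piece]) + "]")
    return "".join(out)

groups = {"a": "a\u00e1"}
pattern = generate_re("Alv{a}rez", groups)
print(bool(re.fullmatch(pattern, "Alvarez")))
print(bool(re.fullmatch(pattern, "Alv\u00e1rez")))
```

A character class only works for variants that are a single code point each, so decomposed input would need to be normalised to NFC first.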

What I often need is an "approximate" search function,
which returns a match "similar" to the input string. But I think even
the regex module cannot fully solve this and I would end up with a
function which goes through each string element and calculates various
similarity criteria.
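
For the "approximate" search case, the stdlib difflib module already offers one similarity criterion. A minimal sketch, with a made-up candidate list, assuming that a ratio-based match is good enough for the task:

```python
import difflib

candidates = ["Alv\u00e1rez", "\u00c1lvaro", "Ramirez"]

# Return the single candidate most similar to the query.
best = difflib.get_close_matches("Alvarez", candidates, n=1)
print(best)

# The underlying score is available directly as well.
score = difflib.SequenceMatcher(None, "Alvarez", "Alv\u00e1rez").ratio()
print(round(score, 3))
```

The score is purely character-based, so it knows nothing about which glyphs look or sound alike; that part would still need tables of your own.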

Mikhail

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Gregory Ewing


Random832 wrote:

> What about Emoji?
> U+1F469 WOMAN is two columns wide on its own.
> U+1F4BB PERSONAL COMPUTER is two columns wide on its own.


The term "emoji" is becoming rather strained these days.
The idea of "woman" and "personal computer" being emotions
is an interesting one...

--
Greg

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Gregory Ewing


Steve D'Aprano wrote:

> (I don't think any native English words use a double-V or double-U, but the
> possibility exists.)


vacuum
savvy

(Vacuum is arguably Latin, but we've been using it for long
enough that it's at least as English as most of the other
words we use.)

--
Greg

Re: Users of namedtuple: do you use the _source attribute?

2017-07-18 Thread Gregory Ewing


Steve D'Aprano wrote:

"source_" is already a public name, which means that users could want to create
fields with that name for some reason,


They could equally well want to define their own private
field called "_source".

IMO a better thing to do would have been to name
it "__source__". Dunder names are officially reserved
for use by the language or stdlib.

--
Greg

Problem in installing module "pynamical"

2017-07-18 Thread Saikat Chakraborty

  I am using PyCharm Community Edition 2017 with interpreter python 3.6.1.
I want to install pynamical module.
But it is showing error. I am posting the error message:

E:\untitled>pip install pynamical

 FileNotFoundError: [WinError 2] The system cannot find the file specified
error: command
'c:\\users\\s.chakraborty\\appdata\\local\\programs\\python\\python36\\python.exe'
failed with exit status 1

Please give me a solution.

Thanking you.
-- 
  With Regards
  Saikat Chakraborty
 (Doctoral Research Scholar)
  *Computer Science & Engineering Dept.*
*NIT Rourkela,Rourkela,Orissa, India*

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Anders Wegge Keller

On Tue, 18 Jul 2017 11:27:03 -0400
Dennis Lee Bieber wrote:

>   Probably would have to go to words predating the Roman occupation
> (which probably means a dialect closer to Welsh or other Gaelic).
> Everything later is an import (anglo-saxon being germanic tribes invading
> south, Vikings in the central area, as I recall southern Irish displacing
> Picts in Scotland, and then the Norman French (themselves starting from
> Vikings ["nor(se)man"]).

 English is known to be lurking in back alleys, waiting for unsuspecting
 languages, that can be beat up for loose vocabulary. So defining anything
 "pure" about it, is going to be practically impossible.

-- 
//Wegge

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Marko Rauhamaa

Chris Angelico :

> Let me give you one concrete example: the letter "ö". In English, it
> is (very occasionally) used to indicate diaeresis, where a pair of
> letters is not a double letter - for example, "coöperate". (You can
> also hyphenate, "co-operate".) In German, it is the letter "o" with a
> pronunciation mark (umlaut), and is considered the same letter as "o".
> In Swedish, it is a distinct letter, alphabetized last (following z,
> å, and ä, in that order). But in all these languages, it's represented
> the exact same way.

The German Wikipedia entry on "ä" calls "ä" a letter ("Buchstabe"):

   Der Buchstabe Ä (kleingeschrieben ä) ist ein Buchstabe des
   lateinischen Schriftsystems.

Furthermore, it makes a distinction between "ä" the letter and "ä" the
"a with a diaeresis:"

   In guten Druckschriften unterscheiden sich die Umlautpunkte von den
   zwei Punkten des Tremas: Die Umlautpunkte sind kleiner, stehen näher
   zusammen und liegen etwas tiefer.

   In good fonts umlaut dots are different from the two dots of a
   diaeresis: the umlaut dots are smaller and closer to each other and
   lie a little lower. [translation mine]


(My native Finnish has the "ä" as well; the German tradition of placing
the dots next to the body of the "a" looks a bit unpleasant. On the
other hand, so does the English tradition of hanging the dots high up in
the air.)


Marko

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Chris Angelico

On Wed, Jul 19, 2017 at 6:05 AM, Mikhail V  wrote:
> On 2017-07-18, Steve D'Aprano  wrote:
>
>> That's neither better nor worse than the system used by English and French,
>> where letters with diacritics are not distinct letters, but guides to
>> pronunciation.
>
>>_Neither system is right or wrong, or better than the other._
>
>
> If that is said just "not to hurt anybody" then it's ok.
> Though this statement is pretty absurd, not so many
> (intelligent) people will buy this out today.

Let me give you one concrete example: the letter "ö". In English, it
is (very occasionally) used to indicate diaeresis, where a pair of
letters is not a double letter - for example, "coöperate". (You can
also hyphenate, "co-operate".) In German, it is the letter "o" with a
pronunciation mark (umlaut), and is considered the same letter as "o".
In Swedish, it is a distinct letter, alphabetized last (following z,
å, and ä, in that order). But in all these languages, it's represented
the exact same way.

Steven is pointing out that there's nothing fundamentally wrong about
using "ö" as a unique letter, nor is there anything fundamentally
wrong about using it as "o" with a pronunciation mark. Which I agree
with.

ChrisA

Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Mikhail V

On 2017-07-18, Steve D'Aprano  wrote:

> That's neither better nor worse than the system used by English and French,
> where letters with diacritics are not distinct letters, but guides to
> pronunciation.

>_Neither system is right or wrong, or better than the other._

If that is said just "not to hurt anybody" then it's ok.
Though this statement is pretty absurd, not so many
(intelligent) people will buy this out today.

Mikhail

Re: pyserial and end-of-line specification

2017-07-18 Thread FS

Thank you for your response Andre. I had tried some code like that in the 
document but it did not seem to work. However, after leaving my terminal for a 
time the code eventually wrote out the records, so apparently there is some very 
deep buffering going on here. A little more searching on the web revealed the 
following:

https://stackoverflow.com/questions/10222788/line-buffered-serial-input

It is apparent that pySerial, or at least its documentation, is falling short of 
my needs. It is very unclear which module in the stack is handling the buffering, 
newlines and so forth. Also unclear is whether the combination of Python and the 
OS is reading FIFO or LIFO, something important in quasi-realtime scientific 
applications.
  This is problematic since the serial port is still so ubiquitous in a lot of 
scientific instrumentation. I will probably patch up some byte-oriented code 
for this or perhaps write the module in C.
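
A rough sketch of what such byte-oriented code might look like. It assumes only that the port object has a read(size) method returning bytes, as serial.Serial does; the function name iter_lines and the CR LF terminator are assumptions for illustration, and io.BytesIO stands in for a real port here.

```python
import io

def iter_lines(port, eol=b"\r\n", chunk_size=64):
    """Yield complete records from port, split on the eol byte sequence."""
    buffer = bytearray()
    while True:
        chunk = port.read(chunk_size)
        if not chunk:
            # No more data (for a real port, typically a read timeout).
            break
        buffer.extend(chunk)
        # Hand out every complete record currently in the buffer, in order.
        while True:
            index = buffer.find(eol)
            if index < 0:
                break
            yield bytes(buffer[:index])
            del buffer[:index + len(eol)]
    if buffer:
        # Whatever is left is an unterminated final record.
        yield bytes(buffer)

fake_port = io.BytesIO(b"1.0\r\n2.5\r\n3")
records = list(iter_lines(fake_port, chunk_size=4))
print(records)
```

Records come out in the order the bytes were read, so as long as the OS driver delivers bytes first-in first-out, so does this loop.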

Thanks again
Fritz

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Chris Angelico

On Wed, Jul 19, 2017 at 4:56 AM, Marko Rauhamaa  wrote:
> Chris Angelico :
>> What I *think* you're asking for is for square brackets in a regex to
>> count combining characters with their preceding base character.
>
> Yes. My example tries to match a single character against a single
> character.
>
>> That would make a lot of sense, and would actually be a reasonable
>> feature to request. (Probably as an option, in case there's a backward
>> compatibility issue.)
>
> There's the flag re.IGNORECASE. In the same vein, it might be useful to
> have re.IGNOREDIACRITICS, which would match
>
>    re.match("^[abc]$", "ä", re.IGNOREDIACRITICS)
>
> regardless of normalization.

That's a different feature, and can be achieved with a different normalization:

import unicodedata

def fold(s):
    """Fold a string for 'search compatibility'.

    Returns a modified version of s with no diacriticals.
    """
    s = s.casefold()
    s = unicodedata.normalize("NFKD", s)
    # Drop code points in the U+0300..U+033F range of combining marks.
    s = ''.join(c for c in s if c < '\u0300' or c > '\u033f')
    return unicodedata.normalize("NFKC", s)

This is something that you might use when searching, as people will
expect to be able to type "cafe" to find "café". It is deliberately
lossy.
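
A variant sketch of the same folding idea, for comparison. Instead of filtering on a fixed code point range it drops every character for which unicodedata.combining() reports a nonzero combining class. That is a swapped-in technique, not the function above, and the name fold_marks is made up for illustration.

```python
import unicodedata

def fold_marks(s):
    """Casefold s and strip combining marks, for lossy search matching."""
    s = unicodedata.normalize("NFKD", s.casefold())
    # combining() returns 0 for base characters, nonzero for combining marks.
    s = "".join(c for c in s if not unicodedata.combining(c))
    return unicodedata.normalize("NFKC", s)

print(fold_marks("Caf\u00e9"))
print(fold_marks("cafe\u0301") == fold_marks("CAFE"))
```

Like the original, this is deliberately lossy and only suitable for building search keys, never for storing or displaying text.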

But having the re module group code units into logical characters
according to 'base + combining' is a different feature. It may be
worth adding. I don't think your re.IGNOREDIACRITICS is something that
belongs in the stdlib, as different search contexts require different
folding (Google, for instance, will find "ı" when you search for "i" -
but then, Google also finds "python" when you search for "phyton").

ChrisA

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Marko Rauhamaa

Chris Angelico :

> On Wed, Jul 19, 2017 at 4:31 AM, Marko Rauhamaa  wrote:
>> Chris Angelico :
>>
>>> On Wed, Jul 19, 2017 at 3:01 AM, Marko Rauhamaa  wrote:
>>>> Yes. Also, not every letter can be normalized to a single codepoint so
>>>> NFC is not a way out. For example,
>>>>
>>>> re.match("^[q̈]$", "q̈")
>>>>
>>>> returns None regardless of normalization.
> [...]
>
> What I *think* you're asking for is for square brackets in a regex to
> count combining characters with their preceding base character.

Yes. My example tries to match a single character against a single
character.

> That would make a lot of sense, and would actually be a reasonable
> feature to request. (Probably as an option, in case there's a backward
> compatibility issue.)

There's the flag re.IGNORECASE. In the same vein, it might be useful to
have re.IGNOREDIACRITICS, which would match

   re.match("^[abc]$", "ä", re.IGNOREDIACRITICS)

regardless of normalization.


Marko

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Chris Angelico

On Wed, Jul 19, 2017 at 4:31 AM, Marko Rauhamaa  wrote:
> Chris Angelico :
>
>> On Wed, Jul 19, 2017 at 3:01 AM, Marko Rauhamaa  wrote:
>>> Yes. Also, not every letter can be normalized to a single codepoint so
>>> NFC is not a way out. For example,
>>>
>>> re.match("^[q̈]$", "q̈")
>>>
>>> returns None regardless of normalization.
>>
>> In what language or context would you actually want to do this?
>
> I could have picked more realistic examples: Classic Greek or Hebrew,
> for example.
>
> However, someone might actually use even "q̈" in a real setting. First of
> all, it *is* a legal character. Secondly, people sometimes combine
> characters in an ad-hoc fashion. Thirdly, remember the case of
> Esperanto, which blessed the world with the letters
>
>    ĉ ĝ ĥ ĵ ŝ ŭ
>
> Esperanto's venerable history finally awarded those characters a
> code-point status in Unicode. However, around the year 2000, it was
> still commonplace to use all sorts of tricks to type them on the
> Internet:
>
>    ch gh hh jj sh u
>
>    ^c ^g ^h ^j ^s ^u
>
>    cx gx hx jx sx ux
>
> For all we know, someone somewhere might be cooking up a language that
> depends on "q̈".

Sure. And if they do, they'll have to contend with the fact that it's
going to be represented as multiple code units.

What I *think* you're asking for is for square brackets in a regex to
count combining characters with their preceding base character. That
would make a lot of sense, and would actually be a reasonable feature
to request. (Probably as an option, in case there's a backward
compatibility issue.)

ChrisA

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Marko Rauhamaa

Chris Angelico :

> On Wed, Jul 19, 2017 at 3:01 AM, Marko Rauhamaa  wrote:
>> Yes. Also, not every letter can be normalized to a single codepoint so
>> NFC is not a way out. For example,
>>
>> re.match("^[q̈]$", "q̈")
>>
>> returns None regardless of normalization.
>
> In what language or context would you actually want to do this?

I could have picked more realistic examples: Classic Greek or Hebrew,
for example.

However, someone might actually use even "q̈" in a real setting. First of
all, it *is* a legal character. Secondly, people sometimes combine
characters in an ad-hoc fashion. Thirdly, remember the case of
Esperanto, which blessed the world with the letters

   ĉ ĝ ĥ ĵ ŝ ŭ

Esperanto's venerable history finally awarded those characters a
code-point status in Unicode. However, around the year 2000, it was
still commonplace to use all sorts of tricks to type them on the
Internet:

   ch gh hh jj sh u

   ^c ^g ^h ^j ^s ^u

   cx gx hx jx sx ux

For all we know, someone somewhere might be cooking up a language that
depends on "q̈".

Marko
-- 
https://mail.python.org/mailman/listinfo/python-list

RE: Best way to assert unit test cases with many conditions

2017-07-18 Thread Dan Strohl via Python-list

Ganesh;

I'm not 100% sure what you are trying to do.. so let me throw out a few things 
I do and see if that helps...

If you are trying to run a bunch of similar tests on something, changing only 
(or mostly) in the parameters passed, you can use self.subTest().

Like this:

    def test_this(self):
        for i in range(10):
            with self.subTest('test number %s' % i):
                self.assertTrue(i <= 5)

With the subTest() method, if anything within that subTest fails, it won't stop 
the process and will continue with the next step.

If you are trying to run a single test at the end of your run to see if 
something messed something up (say, corrupted a file or something), you can, 
(at least with the default unittest) name your test something like 
test_zzz_do_this_at_end, and unless you have over-ridden how the tests are 
being handled (or are using a different testing environment), unittest should 
run it last (of the ones in that TestCase class).

From: https://docs.python.org/2/library/unittest.html#organizing-test-code  
"Note that the order in which the various test cases will be run is determined 
by sorting the test function names with respect to the built-in ordering for 
strings."
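
A small sketch of the ordering described in that note, using unittest's own
loader. The class and method names are made up for illustration; only the
test_zzz_ prefix comes from the suggestion above.

```python
import unittest


class OrderDemo(unittest.TestCase):
    # Deliberately defined out of alphabetical order.
    def test_zzz_do_this_at_end(self):
        pass

    def test_b_second(self):
        pass

    def test_a_first(self):
        pass


# The default loader sorts method names as plain strings, so the
# definition order in the class body does not matter.
names = list(unittest.TestLoader().getTestCaseNames(OrderDemo))
print(names)
```

Relying on name order works within one TestCase class, but it is a
convention of the default loader rather than a guarantee about every runner.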

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Chris Angelico

On Wed, Jul 19, 2017 at 3:01 AM, Marko Rauhamaa  wrote:
> Chris Angelico :
>
>> what you're more likely to want is "match the letter á", and you don't
>> care whether it's represented as U+0061 U+0301 or as U+00E1. That's
>> where Unicode normalization comes in.
>
> Yes. Also, not every letter can be normalized to a single codepoint so
> NFC is not a way out. For example,
>
> re.match("^[q̈]$", "q̈")
>
> returns None regardless of normalization.

In what language or context would you actually want to do this?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Best way to assert unit test cases with many conditions

2017-07-18 Thread Rob Gaddi


On 07/18/2017 09:56 AM, Ganesh Pal wrote:


> (1)  should I add several asserts per test case, or just warn with the
> error and fail at the end.  In the line 33-35 / 37-38 (sorry this is a
> dirty pseudo-code).


Yes.  Just assert each thing as it needs asserting.



> (2)  Is there a way we can warn the test using assert method and not fail?
> I was trying to see if I could use assertWarns but the help says that
> “The test passes if warning is triggered and fails if it isn’t”.
>
> I don’t want to fail on warning but just continue with the next checks



You can, but you're just going to complicate your life.  A "test" is a 
thing that passes (all) or fails (any).  If you need it to keep going 
after a failure, what you have are two tests.  There's nothing wrong 
with having a whole mess of test functions.  If there's a lot of common 
code there you'd have to replicate, that's what setUp() is for.  If 
there are several different flavors of common code you need, you can 
create a base TestCase subclass and then derive further subclasses from 
that.
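
A minimal sketch of that layout, assuming Python 3 (the zero-argument super()
call would need spelling out on 2.7). All class names and the placeholder
block values are hypothetical; they only stand in for whatever the real
setup would do.

```python
import unittest


class CorruptionTestBase(unittest.TestCase):
    """Hypothetical base class: holds the setup every test needs."""

    def setUp(self):
        self.blocks = {"test01": "block-a"}


class ReportingTests(CorruptionTestBase):
    def test_block_recorded(self):
        self.assertIn("test01", self.blocks)


class RepairTests(CorruptionTestBase):
    """A second flavour of setup, built on top of the shared one."""

    def setUp(self):
        super().setUp()
        self.blocks["test100"] = "block-b"

    def test_both_blocks_recorded(self):
        self.assertEqual(sorted(self.blocks), ["test01", "test100"])


def run_demo():
    # Run both derived classes and hand back the collected result.
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in (ReportingTests, RepairTests):
        suite.addTests(loader.loadTestsFromTestCase(case))
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The base class has no test_ methods of its own, so it contributes setup
without adding tests to the run.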


Do the things the way the tools want to do them.  Unit testing is enough 
of a pain without trying to drive nails with the butt of a screwdriver.


--
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.
--
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Grant Edwards

On 2017-07-18, Anders Wegge Keller  wrote:
> On Tue, 18 Jul 2017 23:59:33 +1000
> Chris Angelico wrote:
>> On Tue, Jul 18, 2017 at 11:11 PM, Steve D'Aprano
>
>
>>> (I don't think any native English words use a double-V or double-U, but
>>> the possibility exists.)  
>  
>> vacuum.
>
>  That's latin. 

If you want to play that game, there are no native English words that
contain any of the letters A-Z either.  It turns out they're all
German or Frisian or Norse or French or whatever...

-- 
Grant Edwards               grant.b.edwards        Yow! Can you MAIL a BEAN
                                  at               CAKE?
                              gmail.com

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Rhodri James


On 18/07/17 17:03, Marko Rauhamaa wrote:
> Random832:
>
>> As for double-v, a quick search through /usr/share/dict/words reveals
>> "civvies", "divvy", "revved/revving", "savvy" and "skivvy", and
>> various conjugations thereof. All following, more or less, the rule of
>> using a double consonant after a short vowel in contexts where a
>> single consonant would suggest the preceding vowel was long.
>
> The single/double consonant rule is indeed an ancient Germanic spelling
> principle. English makes several twists to it:


It's not so much a rule as a guideline...

--
Rhodri James *-* Kynesim Ltd
--
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Marko Rauhamaa

Marko Rauhamaa :

>  * the final consonant of a single-syllable word is doubled only if the
>consonant is "k", "l" or "s" ("kick", "kill", "kiss")

... or "f" ("stiff") or "z" ("buzz")


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Marko Rauhamaa

Chris Angelico :

> what you're more likely to want is "match the letter á", and you don't
> care whether it's represented as U+0061 U+0301 or as U+00E1. That's
> where Unicode normalization comes in.

Yes. Also, not every letter can be normalized to a single codepoint so
NFC is not a way out. For example,

re.match("^[q̈]$", "q̈")

returns None regardless of normalization.
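
A short check of that claim, plus one workaround. The variable names are
only for illustration; the workaround (a non-capturing group instead of a
character class) is one possible approach, not something proposed in the
thread.

```python
import re
import unicodedata

# 'q' followed by U+0308 COMBINING DIAERESIS; there is no precomposed form.
q_diaeresis = "q\u0308"

results = {}
for form in ("NFC", "NFD"):
    s = unicodedata.normalize(form, q_diaeresis)
    # [...] matches exactly one code point, so the class {q, U+0308}
    # can never consume both code points of the string.
    in_class = re.match("^[" + s + "]$", s)
    # A group treats the two code points as a sequence instead.
    in_group = re.match("^(?:" + s + ")$", s)
    results[form] = (len(s), in_class is None, in_group is not None)

print(results)
```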


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list

Best way to assert unit test cases with many conditions

2017-07-18 Thread Ganesh Pal

Hi Dear Python Friends,

The unittest TestCase class provides several assert methods to check for
and report failures. I need suggestions on the best way to assert test
cases in the piece of code below.

(1)  Should I add several asserts per test case, or just warn with the
error and fail at the end? See lines 33-35 / 37-38 (sorry, this is dirty
pseudo-code).

(2)  Is there a way to warn from the test using an assert method and not
fail? I was trying to see if I could use assertWarns, but the help says
that “The test passes if warning is triggered and fails if it isn’t”.

I don’t want to fail on a warning, just continue with the next checks.

(3)  Any other ways to optimize the sample code.




  1 import unittest
  2 import library
  3
  4
  5 class AutoRepairFilesystem(unittest.TestCase):
  6
  7     blocks = {}
  8     report = ""
  9
 10     @classmethod
 11     def setUpClass(self):
 12         """
 13         Set UP
 14         """
 15         logging.info("SETUP.Started")
 16         try:
 17             self.blocks['test01'] = library.inject_corruption1(file1)
 18             self.blocks['test100'] = library.inject_corruption100(file100)
 19
 20         except Exception as e:
 21             logging.error("Failure injection failed \n")
 22             raise
 23
 24         if not library.check_Repair():
 25             logging.error("Failed running FSCK Tool ")
 26             assert False, "Pre-test checks in setUpClass failed skipping test"
 27         logging.info("SETUP.Done")
 28
 29     def test_corruption1(self):
 30         """Run test no 1 """
 31         # This was the only earlier condition then!
 32         #self.assertTrue(library.log_message_is_reported(self.report,self.blocks['test01']):'''
 33         if not library.log_message_is_reported(self.report,
 34                                                self.blocks['test01']):
 35             print "Warning: Reporting Failed \n"
 36
 37         if not library.is_corruption_fixed():
 38             print "Warning: Corruption is not fixed  \n"
 39
 40         if not library.is_corruption_reparied():
 41             assert False, "Corruption not reported,fixed and auto repaired.\n"
 42
 43     def test_corruption100(self):
 44         """ Run test no 100 """
 45         if not library.log_message_is_reported(self.report,
 46                                                self.blocks['test100']):
 47             print "Warning: Reporting Failed \n"
 48
 49         if not library.is_corruption_fixed():
 50             print "Warning: Corruption is not fixed  \n"
 51
 52         if not library.is_corruption_reparied():
 53             assert False, "Corruption not reported,fixed and auto repaired.\n"
 54
 55     @classmethod
 56     def tearDownClass(self):
 57         """ Delete all files """
 58         os.system("rm -rf /tmp/files/")
 59
 60 if __name__ == '__main__':
 61     unittest.main()





I am a Linux user with Python 2.7.



Regards,

Ganesh
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Better Regex and exception handling for this small code

2017-07-18 Thread Ganesh Pal

Thanks Cameron Simpson for your suggestions and reply, quite helpful :)

On Wed, Jul 12, 2017 at 5:06 AM, Cameron Simpson  wrote:

> On 11Jul2017 22:01, Ganesh Pal  wrote:
>
>> I am trying to open a file and check if there is a pattern  has  changed
>> after the task got completed?
>>
>> file data:
>> 
>>
>> #tail -f /file.txt
>> ..
>> Note: CRC:algo = 2, split_crc = 1, unused = 0, initiator_crc = b6b20a65,
>> journal_crc = d2097b00
>> Note: Task completed successfully.
>> Note: CRC:algo = 2, split_crc = 1, unused = 0, initiator_crc = d976d35e,
>> journal_crc = a176af10
>>
>>
>> I  have the below piece of code  but would like to make this better more
>> pythonic , I found regex pattern and exception handling poor here , any
>> quick suggestion in your spare time is welcome.
>>
>>
>> #open the existing file if the flag is set and check if there is a match
>>
>> log_file='/file.txt'
>> flag_is_on=1
>>
>
> Use "True" instead of "1". A flag is a Boolean thing, and should use a
> Boolean value. This lets you literally speak "true" and "false" rather than
> implicitly saying that "0 means false and nonzero means true".
>
> data = None
>>
>
> There is no need to initialise data here because you immediately overwrite
> it below.
>
> with open(log_file, 'r') as f:
>> data = f.readlines()
>>
>> if flag_is_on:
>>
>
> Oh yes. Just name this variable "flag". "_is_on" is kind of implicit.
>
>logdata = '\n'.join(data)
>>
>
> Do other parts of your programme deal with the file data as lines? If not,
> there is little point to reading the file and breaking it up into lines
> above, then joining them together again here. Just go:
>
>  with open(log_file) as f:
>      log_data = f.read()
>
>reg = "initiator_crc =(?P[\s\S]*?), journal_crc"
>>
>
> Normally we write regular expressions as "raw" python strings, thus:
>
>reg = r'initiator_crc =(?P[\s\S]*?), journal_crc'
>
> because backslashes etc are punctuation inside normal strings. Within a
> "raw" string started with r' nothing is special until the closing '
> character. This makes writing regular expressions more reliable.
>
> Also, why the character range "[\s\S]"? That says whitespace or
> nonwhitespace i.e. any character. If you want any character, just say ".".
>
>crc = re.findall(re.compile(reg), logdata)
>>
>
> It is better to compile a regexp just the once, getting a Regexp object,
> and then you just use the compiled object.
>
>if not crc:
>>raise Exception("Pattern not found in  logfile")
>>
>
> ValueError would be a more appropriate exception here; plain old
> "Exception" is pretty vague.
>
>checksumbefore = crc[0].strip()
>>checksumafter = crc[1].strip()
>>
>
> Your regexp cannot start or end with whitespace. Those .strip calls are
> not doing anything for you.
>
> This reads like you expect there to be exactly 2 matches in the file. What
> if there are more or fewer?
>
>logging.info("checksumbefore :%s and  checksumafter:%s"
>>  % (checksumbefore, checksumafter))
>>
>>if checksumbefore == checksumafter:
>>   raise Exception("checksum not macthing")
>>
>
> Don't you mean != here?
>
> I wouldn't be raising exceptions in this code. Personally I would make
> this a function that returns True or False. Exceptions are a poor way of
> returning "status" or other values. They're really for "things that should
> not have happened", hence their name.
>
> It looks like you're scanning a log file for multiple lines and wanting to
> know if successive ones change. Why not write a function like this
> (untested):
>
>  RE_CRC_LINE = re.compile(r'initiator_crc =(?P[\s\S]*?), journal_crc')
>
>  def check_for_crc_changes(logfile):
>      old_crc_text = ''
>      with open(logfile) as f:
>          for line in f:
>              m = RE_CRC_LINE.match(line)
>              if not m:
>                  # uninteresting line
>                  continue
>              crc_text = m.group(0)
>              if crc_text != old_crc_text:
>                  # found a change
>                  return True
>      if old_crc_text == '':
>          # if this is really an error, you might raise this exception
>          # but maybe no such lines is just normal but boring
>          raise ValueError("no CRC lines seen in logfile %r" % (logfile,))
>      # found no changes
>      return False
>
> See that there is very little sanity checking. In an exception supporting
> language like Python you can often write code as if it will always succeed
> by using things which will raise exceptions if things go wrong. Then
> _outside_ the function you can catch any exceptions that occur (such as
> being unable to open the log file).
>
> Cheers,
> Cameron Simpson 
>
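
For anyone picking this up later, here is a runnable sketch along the lines
of the untested function quoted above. It is not Cameron's code: the group
name "crc" and the function name crc_changed are hypothetical (the group
name in the quoted pattern did not survive), it uses search() rather than
match() because the sample log lines start with "Note:", and it remembers the
previous value so that the first CRC line alone does not count as a change.

```python
import re

# "crc" is an assumed group name, chosen for this sketch only.
RE_CRC = re.compile(r"initiator_crc = (?P<crc>\w+), journal_crc")


def crc_changed(lines):
    """Return True if two successive initiator_crc values differ.

    `lines` is any iterable of strings, e.g. an open log file.
    """
    previous = None
    for line in lines:
        m = RE_CRC.search(line)
        if not m:
            # uninteresting line
            continue
        crc = m.group("crc")
        if previous is not None and crc != previous:
            return True
        previous = crc
    if previous is None:
        # No CRC line at all; whether this is an error is a judgment call.
        raise ValueError("no CRC lines seen")
    return False
```

Usage would be something like `with open(log_file) as f: changed =
crc_changed(f)`, with any IOError handled by the caller.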
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Chris Angelico

On Wed, Jul 19, 2017 at 1:40 AM, Rhodri James  wrote:
> On 18/07/17 16:27, Dennis Lee Bieber wrote:
>>
>> On Tue, 18 Jul 2017 10:38:48 -0400, Random832 
>> declaimed the following:
>>
>>> Define "native" then. My interpretation of "native English words" is
>>> "anything you wouldn't have to put in italics to use in a sentence".
>>> Which would also include "continuum".
>>>
>>
>> Probably would have to go to words predating the Roman occupation
>> (which probably means a dialect closer to Welsh or other Gaelic).
>> Everything later is an import (anglo-saxon being germanic tribes invading
>> south, Vikings in the central area, as I recall southern Irish displacing
>> Picts in Scotland, and then the Norman French (themselves starting from
>> Vikings ["nor(se)man"]).
>
>
> Sorry, but even the Gaels/Gauls were invaders :-)

If we go back far enough, I'm pretty sure the only true Englishman is
a sentient cup of tea.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Chris Angelico

On Wed, Jul 19, 2017 at 12:09 AM, Random832  wrote:
> On Fri, Jul 14, 2017, at 08:33, Chris Angelico wrote:
>> What do you mean about regular expressions? You can use REs with
>> normalized strings. And if you have any valid definition of "real
>> character", you can use it equally on an NFC-normalized or
>> NFD-normalized string than any other. They're just strings, you know.
>
> I don't understand how normalization is supposed to help with this. It's
> not like there aren't valid combinations that do not have a
> corresponding single NFC codepoint (to say nothing of the situation with
> e.g. Indic languages).
>
> In principle probably a viable solution for regex would be to add
> character classes for base and combining characters, and then
> "[[:base:]][[:combining:]]*" can be used as a building block if
> necessary.

Once you NFC or NFD normalize both strings, identical strings will
generally have identical codepoints. (There are some exceptions, and
for certain types of matching, you might want to use NFKC/NFKD
instead.) You should then be able to use normal regular expressions to
match correctly. I don't know of any situations where you want to
match "any base character" or "any combining character"; what you're
more likely to want is "match the letter á", and you don't care
whether it's represented as U+0061 U+0301 or as U+00E1. That's where
Unicode normalization comes in.
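
A quick illustration of the á case with the standard unicodedata module
(variable names are just for the example):

```python
import unicodedata

composed = "\u00e1"      # U+00E1 LATIN SMALL LETTER A WITH ACUTE
decomposed = "a\u0301"   # U+0061 followed by U+0301 COMBINING ACUTE ACCENT

# As raw strings the two spellings compare unequal...
raw_equal = composed == decomposed

# ...but after normalizing both to the same form they are identical.
nfc_equal = (unicodedata.normalize("NFC", composed)
             == unicodedata.normalize("NFC", decomposed))
nfd_equal = (unicodedata.normalize("NFD", composed)
             == unicodedata.normalize("NFD", decomposed))

print(raw_equal, nfc_equal, nfd_equal)
```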

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Marko Rauhamaa

Random832 :

> As for double-v, a quick search through /usr/share/dict/words reveals
> "civvies", "divvy", "revved/revving", "savvy" and "skivvy", and
> various conjugations thereof. All following, more or less, the rule of
> using a double consonant after a short vowel in contexts where a
> single consonant would suggest the preceding vowel was long.

The single/double consonant rule is indeed an ancient Germanic spelling
principle. English makes several twists to it:

 * "v" is never doubled ("shovel")

 * a final "v" receives a superfluous "e" ("love")

 * the final consonant of a single-syllable word is doubled only if the
   consonant is "k", "l" or "s" ("kick", "kill", "kiss")

 * "k" becomes "ck" when doubled ("lacking")

 * a final consonant is never doubled in a multisyllable word
   ("havoc", "shovel")

 * a final "k" of a multisyllable word becomes "c" ("magic")


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Funding continuous maintenance for PyInstaller?

2017-07-18 Thread justin walters

On Tue, Jul 18, 2017 at 3:01 AM, Hartmut Goebel <
h.goe...@crazy-compilers.com> wrote:

> Hi,
>
> I'm seeking advice how to fund continuous maintenance for an open source
> project.
>
> *Do you have any idea how to fund continuous maintenance for PyInstaller?
> Do you have any idea whom or where to ask?
> Do you know somebody to help setting up a commercial support/maintenance
> model? *
>
> I'm the (remaining) maintainer of PyInstaller (www.pyinstaller.org).
> Currently I'm maintaining PyInstaller in my spare time. But it is
> getting to be too much work to do for free: the open issue tickets
> and pull requests are piling up. Since PyInstaller is quite mature,
> problems are hard to track down and to solve. Thus solving one ticket
> often takes half a day or even more.
>
> I already got in touch with the PSF and the Python Software
> Verband (much like the PSF, just for Germany), but they have no experience
> with this. I also read about the PSF grants program, but it doesn't fit
> continuous maintenance. I also had a look at Bountysource, but the
> amounts offered there may be teasers for students, not for professionals,
> so I did not follow this road. I plan to add a "donate" page to the
> website, but I doubt this will bring in noteworthy amounts.
>
> So I was thinking about some commercial support/maintenance model, but I
> have no experience with this. As I'm a freelance consultant already (but
> in the information security business), this could be feasible to
> implement, if I'd know how to address the commercial users.
>
> Thanks for any tip!
>
> *About PyInstaller*
>
> PyInstaller is the successor of "McMillan Installer", a tool like
> freeze, py2exe, py2app or bbfreeze, but PyInstaller supports Windows,
> MacOS and Unix (GNU/Linux, Solaris, HP-UX, etc.). PyInstaller is widely
> used, as you can see when looking at the issues and on the mailing list.
> E.g. kivy uses/recommends PyInstaller for building Python apps for
> mobile platforms.
>
> PyInstaller is also used for commercial applications (as some hints on
> the mailinglist or
> But most commercial users are unknown.
>
> *About me*
>
> I'm based in Germany, have been developing open source and free software
> since about 1990, and have been using Python since about 1998. Besides
> this I developed software like pdfposter, python-ghostscript,
> python-managesieve, etc. My day job is freelance consultant focused on
> information security.
>
> --
> Regards
> Hartmut Goebel
>
> | Hartmut Goebel  | h.goe...@crazy-compilers.com   |
> | www.crazy-compilers.com | compilers which you thought are impossible |
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>

You could try reaching out to Michael at Talk Python to Me:
https://talkpython.fm/

He may be able to give you a mention on the show or even have you on as a
guest. He may also be able to point you in the direction of some sponsors.
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Rhodri James


On 18/07/17 16:27, Dennis Lee Bieber wrote:
> On Tue, 18 Jul 2017 10:38:48 -0400, Random832
> declaimed the following:
>
>> Define "native" then. My interpretation of "native English words" is
>> "anything you wouldn't have to put in italics to use in a sentence".
>> Which would also include "continuum".
>
> Probably would have to go to words predating the Roman occupation
> (which probably means a dialect closer to Welsh or other Gaelic).
> Everything later is an import (anglo-saxon being germanic tribes invading
> south, Vikings in the central area, as I recall southern Irish displacing
> Picts in Scotland, and then the Norman French (themselves starting from
> Vikings ["nor(se)man"]).


Sorry, but even the Gaels/Gauls were invaders :-)

--
Rhodri James *-* Kynesim Ltd
--
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Rhodri James


On 18/07/17 15:10, Rustom Mody wrote:
> On Monday, July 17, 2017 at 10:14:00 PM UTC+5:30, Rhodri James wrote:
>> On 17/07/17 05:10, Rustom Mody wrote:
>>> Hint1: Ask your grandmother whether unicode's notion of character makes
>>> sense.
>>> Ask 10 gmas from 10 language-L's
>>> Hint2: When in doubt gma usually is right
>>
>> "For every complex problem there is an answer that is clear, simple and
>> wrong." (H.L. Mencken).
>
> Great men galore with great quotes galore²
>
> [snip]
>
>> Unfortunately grandmothers outside their areas of expertise are
>> particularly prone to finding those answers.
>
> Gma for the purposes of this discussion can be defined:
>
> - A (not necessarily) elderly person who
> - Is fairly intelligent
> - Not necessarily highly educated
> - Generally interested in life and people
> - [But not usually] in technical arcana


That last one is the killer.  Using clear and simple terminology is 
usually adequate when you aren't discussing technical arcana. 
Unfortunately we are discussing technical arcana, and that's when you 
trip over the fact that your clear, simple terminology is wrong.  It's 
an instance of Weizenbaum's joke that you quoted, just replacing 
streetlights with grandmas.


(For the record, one of my grandmothers would have been baffled by this 
conversation, and the other one would have had definite opinions on 
whether accents were distinct characters or not, followed by a 
digression into whether "ŵ" and "ŷ" should be suppressed vigorously :-)


--
Rhodri James *-* Kynesim Ltd
--
https://mail.python.org/mailman/listinfo/python-list

Re: cPickle fails on manually compiled and executed Python function

2017-07-18 Thread Jan Gosmann


On 07/18/2017 01:07 AM, dieter wrote:
> "Jan Gosmann"  writes:
>
>> [...]
>> fn = load_pyfile('fn.py')['fn']
>> [...]
>
> "pickle" (and "cpickle") are serializing functions as so called
> "global"s, i.e. as a module reference together with a name.
> This means, they cannot handle functions computed in a module
> (as in your case).

Note that I'm assigning the computed function to a global/module level
variable. As far as I understand the documentation,
that should be all that matters because only the function name will be
serialized.

> I am quite convinced that "pickle" will not be able to deserialize (i.e. load)
> your function (even though it appears to perform the serialization
> (i.e. dump)).

Actually the deserialization works fine with either module. That is both
pickle.loads(pickle.dumps(fn)) and cPickle.loads(pickle.dumps(fn)) give
me back the function.


By now I realized that a pretty simple workaround works. Instead of 
doing `fn = load_pyfile('fn.py')['fn']` the following function 
definition works with both pickle modules:


    _fn = load_pyfile('fn.py')['fn']
    def fn(*args, **kwargs):
        return _fn(*args, **kwargs)
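
To illustrate the "module reference together with a name" point from earlier
in this thread, here is a small sketch. It assumes Python 3's pickle rather
than the Python 2 cPickle under discussion, and uses json.dumps purely as a
convenient function that is importable by name.

```python
import json
import pickle

# A function reachable as "module.name" is pickled by reference:
# unpickling looks the name up again and returns the very same object.
payload = pickle.dumps(json.dumps)
restored = pickle.loads(payload)

# A lambda has no importable name, so pickling it is expected to fail.
try:
    pickle.dumps(lambda x: x)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(restored is json.dumps, lambda_pickled)
```

The wrapper above plausibly helps for the same reason: `fn` is then an
ordinary function that was defined in the module under that name.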

--
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Grant Edwards

On 2017-07-18, Steve D'Aprano  wrote:

> (I don't think any native English words use a double-V or double-U, but the
> possibility exists.)

double-v:

   flivver, navvy, bivvy, bevvy, trivvet, divvy, skivvy, skivvies,
   etc.  and various gerund and past tense verbs: revved, revving,
   chivved chivving

double-u:

   vacuum, continuum, squush, fortuuned

> That's neither better nor worse than the system used by English and French,
> where letters with diacritics are not distinct letters, but guides to
> pronunciation.  Neither system is right or wrong, or better than the other.

You'll get kicked off Usenet for having an attitude like that!

-- 
Grant Edwards               grant.b.edwards        Yow! I put aside my copy
                                  at               of "BOWLING WORLD" and
                              gmail.com            think about GUN CONTROL
                                                   legislation...

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Random832

On Tue, Jul 18, 2017, at 10:23, Anders Wegge Keller wrote:
> On Tue, 18 Jul 2017 23:59:33 +1000
> Chris Angelico wrote:
> > On Tue, Jul 18, 2017 at 11:11 PM, Steve D'Aprano
> >> (I don't think any native English words use a double-V or double-U, but
> >> the possibility exists.)  
>  
> > vacuum.
> 
>  That's latin. 

Define "native" then. My interpretation of "native English words" is
"anything you wouldn't have to put in italics to use in a sentence".
Which would also include "continuum".

As for double-v, a quick search through /usr/share/dict/words reveals
"civvies", "divvy", "revved/revving", "savvy" and "skivvy", and various
conjugations thereof. All following, more or less, the rule of using a
double consonant after a short vowel in contexts where a single
consonant would suggest the preceding vowel was long.
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Random832

On Sun, Jul 16, 2017, at 01:37, Steven D'Aprano wrote:
> In a *well-designed* *bug-free* monospaced font, all code points should 
> be either zero-width or one column wide. Or two columns, if the font 
> supports East Asian fullwidth characters.

What about Emoji?
U+1F469 WOMAN is two columns wide on its own.
U+1F4BB PERSONAL COMPUTER is two columns wide on its own.
U+200D ZERO WIDTH JOINER is zero columns wide on its own.

The sequence U+1F469 U+200D U+1F4BB is the single emoji "Woman
Technologist", which is two columns wide.

Even without ZWJ this comes up - the regional indicator characters are
meant to appear in pairs - signifying a flag, which is two columns wide
- but when they appear in isolation they usually appear as an equally
wide "letter in a box" picture.

The skin tone indicators aren't applied with ZWJ, and are meant to
combine with the preceding character when it is an emoji depicting a
person, but show up as a square swatch of that color in isolation. And
AIUI they don't have a combining class in the unicode data.
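
The code point view of the ZWJ sequence and the skin tone modifiers can be
checked with unicodedata. This only shows what the string contains; how wide
a terminal draws it is outside Python's control. Variable names are just for
the example, and U+1F3FB is one of the five skin tone modifiers.

```python
import unicodedata

# U+1F469 WOMAN, U+200D ZERO WIDTH JOINER, U+1F4BB PERSONAL COMPUTER
woman_technologist = "\U0001F469\u200D\U0001F4BB"

# One emoji on screen, but three code points in the string.
code_point_count = len(woman_technologist)
names = [unicodedata.name(ch, "<unnamed>") for ch in woman_technologist]

# Skin tone modifier: canonical combining class 0, i.e. the Unicode data
# does not mark it as a combining character.
skin_tone_class = unicodedata.combining("\U0001F3FB")

print(code_point_count, names, skin_tone_class)
```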

Or, consider presentation variation selectors

U+26A1 HIGH VOLTAGE SIGN
U+FE0E VARIATION SELECTOR-15 (text presentation in this context)
U+FE0F VARIATION SELECTOR-16 (emoji presentation in this context)

Some code points are meant to be shown as a text character in some
contexts and an emoji in others. The default presentation (when not
followed by a variation selector) depends on the application. Otherwise,
the Emoji is two columns wide and the text presentation version is
usually one column wide.

The variation selectors themselves are zero columns wide when applied to
any character to which they are not meant to be applied.

(From a font perspective these can be regarded as ligatures, but the
font itself is not responsible for the behavior of a character-cell
terminal emulator)
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Anders Wegge Keller

On Tue, 18 Jul 2017 23:59:33 +1000
Chris Angelico wrote:
> On Tue, Jul 18, 2017 at 11:11 PM, Steve D'Aprano


>> (I don't think any native English words use a double-V or double-U, but
>> the possibility exists.)  
 
> vacuum.

 That's latin. 

-- 
//Wegge
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Combining every pair of list items and creating a new list.

2017-07-18 Thread ast



 wrote in message
news:621ca9d5-79b1-44c9-b534-3ad1b0cf4...@googlegroups.com...

> Hi,
>
> I'm having difficulty thinking about how to do this as a Python beginner.
>
> But I have a list that is represented as:
>
> [1,2,3,4,5,6,7,8]
>
> and I would like the following results:
>
> [1,2] [3,4] [5,6] [7,8]
>
> Any ideas?
>
> Thanks

>>> list(zip(L[0::2], L[1::2]))
[(1, 2), (3, 4), (5, 6), (7, 8)]

>>> list(map(list, zip(L[0::2], L[1::2])))
[[1, 2], [3, 4], [5, 6], [7, 8]]
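
An alternative sketch using slicing in a list comprehension. The function
name pairs is made up for this example. Unlike the zip versions, it keeps a
trailing single item when the list has odd length.

```python
def pairs(items):
    """Split a list into consecutive two-item lists."""
    return [items[i:i + 2] for i in range(0, len(items), 2)]


L = [1, 2, 3, 4, 5, 6, 7, 8]
print(pairs(L))          # [[1, 2], [3, 4], [5, 6], [7, 8]]
print(pairs([1, 2, 3]))  # [[1, 2], [3]]
```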

--
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Rustom Mody

On Monday, July 17, 2017 at 10:14:00 PM UTC+5:30, Rhodri James wrote:
> On 17/07/17 05:10, Rustom Mody wrote:
> > Hint1: Ask your grandmother whether unicode's notion of character makes 
> > sense.
> > Ask 10 gmas from 10 language-L's
> > Hint2: When in doubt gma usually is right
> 
> "For every complex problem there is an answer that is clear, simple and 
> wrong." (H.L. Mencken).  

Great men galore with great quotes galore²
Here are 3 — take your pick:

Einstein:
If you can't explain something to a six-year-old, you really don't understand 
it yourself.

[Commonly attributed to Einstein
More likely Feynman, Rutherford, de Broglie or some other notable physicist
https://skeptics.stackexchange.com/questions/8742/did-einstein-say-if-you-cant-explain-it-simply-you-dont-understand-it-well-en
]

Dijkstra: 

Programming languages belong to the problem set, not (as some imagine)
to the solution set
https://www.cs.utexas.edu/users/EWD/transcriptions/EWD04xx/EWD473.html

Joseph Weizenbaum — AI pioneer, author of Eliza:

Computer technology, like all sciences, are self-validating systems. They 
define problems and their solutions within a circumscribed context and leave 
out much of the real-world data. “Science can only proceed by simplifying 
reality.”

[Weizenbaum then recounts] a joke about a drunkard to clarify this statement: 
One dark evening a policeman comes across a man on his hands and knees 
searching beneath a lamppost. He asks the man what he’s doing and the man 
replies that he lost his keys over there, pointing off into the darkness. “So 
why are you looking for them under the streetlight?” inquired the policeman. 
The man replies, “Because the light is so much better over here.”

http://www.digitalathena.com/the-wisdom-of-joseph-weizenbaum.html

> Unfortunately grandmothers outside their areas of expertise are particularly 
> prone to finding those answers.

Gma for the purposes of this discussion can be defined:

- A (not necessarily) elderly person who
- Is fairly intelligent
- Not necessarily highly educated
- Generally interested in life and people
- [But not usually] in technical arcana

An alternative "definition to gma" (if big names are a requirement) could be 
Joseph Weizenbaum quoted above, who in Computer Power and Human Reason
vociferously spoke against the propensity to define human value in terms of
"computerizability"
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Random832

On Fri, Jul 14, 2017, at 04:15, Marko Rauhamaa wrote:
>  Consider, for example, a Python source code
> editor where you want to limit the length of the line based on the
> number of characters more typically than based on the number of pixels.

Even there you need to go based on the width in character cells. Most
characters for East Asian languages occupy two character cells.

It would be nice if there was an easy way to get str.format to use this
width instead of the length in code points for the purpose of padding.
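
As a rough illustration of padding by character cells rather than code
points, here is a minimal sketch using only the standard unicodedata
module. The helper names display_width and pad_to_cells are made up for
this example, and the rule "Wide or Fullwidth counts as two cells,
combining marks count as zero" is a simplifying assumption; real
terminals differ in the details (zero-width characters, emoji sequences
and so on are not handled).

```python
import unicodedata


def display_width(text):
    """Approximate how many terminal cells `text` occupies."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn on the previous character.
            continue
        # East Asian Wide (W) and Fullwidth (F) characters take two cells.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def pad_to_cells(text, cells, fill=" "):
    """Left-justify `text` in a field measured in cells, not code points."""
    return text + fill * max(0, cells - display_width(text))


# str.format pads by code points, so this field ends up 10 cells wide:
by_code_points = "{:<6}".format("日本語")
# Padding by cells adds nothing here, since the text already fills 6 cells:
by_cells = pad_to_cells("日本語", 6)
```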

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Random832

On Fri, Jul 14, 2017, at 08:33, Chris Angelico wrote:
> What do you mean about regular expressions? You can use REs with
> normalized strings. And if you have any valid definition of "real
> character", you can use it equally on an NFC-normalized or
> NFD-normalized string than any other. They're just strings, you know.

I don't understand how normalization is supposed to help with this. It's
not like there aren't valid combinations that do not have a
corresponding single NFC codepoint (to say nothing of the situation with
e.g. Indic languages).

In principle probably a viable solution for regex would be to add
character classes for base and combining characters, and then
"[[:base:]][[:combining:]]*" can be used as a building block if
necessary.
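
The standard re module has no such character classes, but the same
"base followed by any number of combining characters" grouping can be
sketched with unicodedata.combining. The function name
base_plus_combining is invented for this example. Note that this is
only the building block described above, not full grapheme cluster
segmentation: marks whose combining class is 0 (common in Indic
scripts), Hangul jamo and joiner sequences are not handled.

```python
import unicodedata


def base_plus_combining(text):
    """Split text into runs of one base character plus its combining marks."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            # Non-zero combining class: attach to the preceding base.
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# "q" with dot above and dot below has no precomposed NFC form,
# so it stays three code points even after normalization.
sample = unicodedata.normalize("NFC", "q\u0307\u0323x")
groups = base_plus_combining(sample)
```

The third-party regex module also provides \X for matching a whole
grapheme cluster, which goes further than this sketch.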

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Chris Angelico

On Tue, Jul 18, 2017 at 11:11 PM, Steve D'Aprano
 wrote:
> On Tue, 18 Jul 2017 08:01 am, Mikhail V wrote:
>
>> And just in case still its not clear: this is not
>> solved by adding dirt around the letter: if there is
>> enough significance of the phoneme distinction then
>> one should add a distinct letter for a syntax in question.
>
> It isn't "dirt", any more than difference between Ш (SHA) and Щ (SHCHA)
> is "dirt", or between F and E is "dirt".
>
> In Swedish, Å, Ä, and Ö are distinct letters of the alphabet. In Danish and
> Norwegian, Æ Ø and Å are distinct letters of the alphabet. Just as in English W
> is a distinct letter of the alphabet, different from either VV or UU.
>
> (I don't think any native English words use a double-V or double-U, but the
> possibility exists.)

vacuum.

ChrisA

Re: Grapheme clusters, a.k.a.real characters

2017-07-18 Thread Steve D'Aprano

On Tue, 18 Jul 2017 08:01 am, Mikhail V wrote:

> And just in case still its not clear: this is not
> solved by adding dirt around the letter: if there is
> enough significance of the phoneme distinction then
> one should add a distinct letter for a syntax in question.

It isn't "dirt", any more than the difference between Ш (SHA) and Щ (SHCHA)
is "dirt", or between F and E is "dirt".

In Swedish, Å, Ä, and Ö are distinct letters of the alphabet. In Danish and
Norwegian, Æ Ø and Å are distinct letters of the alphabet. Just as in English W
is a distinct letter of the alphabet, different from either VV or UU.

(I don't think any native English words use a double-V or double-U, but the
possibility exists.)

That's neither better nor worse than the system used by English and French,
where letters with diacritics are not distinct letters, but guides to
pronunciation.  Neither system is right or wrong, or better than the other.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.


Re: Users of namedtuple: do you use the _source attribute?

2017-07-18 Thread Steve D'Aprano

On Tue, 18 Jul 2017 03:58 pm, Terry Reedy wrote:

>> On Monday, July 17, 2017 at 12:20:04 PM UTC-5, Steve D'Aprano wrote:
>>> collections.namedtuple generates a new class using exec,
>>> and records the source code for the class as a _source
>>> attribute.  Although it has a leading underscore, it is
>>> actually a public attribute. The leading underscore
>>> distinguishes it from a named field potentially called
>>> "source", e.g. namedtuple("klass", ['source',
>>> 'destination']).

[...]

> Yes, No.  The developers of the class agree that a trailing underscore
> convention would have been better.  'source_' etc.

I actually disagree with Raymond, and I think his first instinct was the correct
one.

"source_" is already a public name, which means that users could want to create
fields with that name for some reason, just as they could create "source_code"
or "source_be_with_you" or any other name containing underscores. There is no
restriction[1] on names ending in an underscore, and we have a convention to use
such names when they would otherwise clash with a keyword, e.g. "class_".

So I don't think that namedtuple should reserve names ending with underscore for
its own use. I think that Raymond's first decision was correct, and documenting
_source as public is the least-worst option.
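
The naming rules under discussion can be checked directly. This is a
small sketch; the class names Route, Keywordish and Bad are made up for
illustration. getattr is used for _source because the attribute is not
guaranteed to exist on every Python version.

```python
from collections import namedtuple

# A field called "source" is fine; it does not clash with _source.
Route = namedtuple("Route", ["source", "destination"])
route = Route("a", "b")

# The namedtuple API itself lives under leading-underscore names.
moved = route._replace(source="c")

# Names ending in an underscore are ordinary public field names.
Keywordish = namedtuple("Keywordish", ["class_", "source_"])

# Field names starting with an underscore are rejected outright.
rejected = False
try:
    namedtuple("Bad", ["_private"])
except ValueError:
    rejected = True

# None when the running interpreter does not provide the attribute.
source_text = getattr(Route, "_source", None)
```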

[1] Maybe if we borrowed the keys to Guido's Time Machine and went back to
Python 0.9 we could argue that there should be. "Dunder names and names ending
in a single underscore are reserved for Python." But that would clash 

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.


Re: Funding continuous maintenance for PyInstaller?

2017-07-18 Thread Rhodri James


On 18/07/17 11:01, Hartmut Goebel wrote:

> Hi,
>
> I'm seeking advice how to fund continuous maintenance for an open source
> project.
>
> *Do you have any idea how to fund continuous maintenance for PyInstaller?
> Do you have any idea whom or where to ask?
> Do you know somebody to help setting up a commercial support/maintenance
> model? *


Try the Linux Foundation, https://www.linuxfoundation.org/  Maintenance 
of Open Source projects is one of the things they are interested in 
helping with.


Disclaimer: I am currently under contract to them to provide support for 
a mature open source project, so I'm a tad biased.


--
Rhodri James *-* Kynesim Ltd

How Can I edit and update my .config (for my python application) file using WebSockets exactly like how we edit and update router .config file?

2017-07-18 Thread T Obulesu

I have a Python application running on a Raspberry Pi, and it needs to be 
configured every time. Hence I want to access its .config file online and 
edit it exactly the way we configure a router, but I want to use 
only WebSockets.

Funding continuous maintenance for PyInstaller?

2017-07-18 Thread Hartmut Goebel

Hi,

I'm seeking advice on how to fund continuous maintenance for an open source
project.

*Do you have any idea how to fund continuous maintenance for PyInstaller?
Do you have any idea whom or where to ask?
Do you know somebody who could help set up a commercial support/maintenance
model?*

I'm the (remaining) maintainer of PyInstaller (www.pyinstaller.org).
Currently I'm maintaining PyInstaller in my spare time. But it is
getting to be too much work to do for free: the open issue tickets
and pull requests are piling up. Since PyInstaller is quite mature,
problems are hard to track down and to solve. Thus solving one ticket
often takes half a day or even more.

I have already got in touch with the PSF and the Python Software
Verband (much like the PSF, just for Germany), but they have no experience
with this. I also read about the PSF grants program, but it doesn't fit
continuous maintenance. I also had a look at bountysource, but the
amounts offered there may attract students, not professionals,
so I did not follow this road. I plan to add a "donate" page to the
website, but I doubt this will bring in noteworthy amounts.

So I was thinking about some commercial support/maintenance model, but I
have no experience with this. As I'm already a freelance consultant (but
in the information security business), this could be feasible to
implement, if I knew how to reach the commercial users.

Thanks for any tip!

*About PyInstaller*

PyInstaller is the successor of "McMillan Installer", a tool like
freeze, py2exe, py2app or bbfreeze, but PyInstaller supports Windows,
MacOS and Unix (GNU/Linux, Solaris, HP-UX, etc.). PyInstaller is widely
used, as you can see when looking at the issues and the mailing list.
E.g. kivy uses/recommends PyInstaller for building Python apps for
mobile platforms.

PyInstaller is also used for commercial applications (as some hints on
the mailinglist or
But most commercial users are unknown.

*About me*

I'm based in Germany and have been developing open source and free software
since about 1990 and using Python since about 1998. Besides this, I have
developed software like pdfposter, python-ghostscript, python-managesieve, etc.
My day job is freelance consultant focused on information security.

-- 
Regards
Hartmut Goebel

| Hartmut Goebel  | h.goe...@crazy-compilers.com   |
| www.crazy-compilers.com | compilers which you thought are impossible |


Re: Combining every pair of list items and creating a new list.

2017-07-18 Thread Rahul K P

You can use simple logic and a list comprehension.

It will look like this:

lst = [1, 2, 3, 4, 5, 6, 7, 8]
print([lst[i:i+2] for i in range(0, len(lst), 2)])

Here 2 is the group size; you can set it as you need.
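
The same slicing idea can be wrapped in a small reusable function. This
is only a sketch; the name chunks and the choice to keep a shorter
final group (rather than drop or pad it) are assumptions made for
illustration.

```python
def chunks(items, size):
    """Split a list into consecutive groups of `size` items.

    The last group is shorter when len(items) is not a multiple of size.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


pairs = chunks([1, 2, 3, 4, 5, 6, 7, 8], 2)
print(pairs)
```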



On Tue, Jul 18, 2017 at 1:40 AM,  wrote:

> Hi,
>
> I'm having difficulty thinking about how to do this as a Python beginner.
>
> But I have a list that is represented as:
>
> [1,2,3,4,5,6,7,8]
>
> and I would like the following results:
>
> [1,2] [3,4] [5,6] [7,8]
>
> Any ideas?
>
> Thanks
> --
> https://mail.python.org/mailman/listinfo/python-list
>



-- 
Regards
*Rahul K P*

Python Developer
Mumbai
+919895980223

Re: Users of namedtuple: do you use the _source attribute?

2017-07-18 Thread Cameron Simpson


On 18Jul2017 02:57, Steve D'Aprano  wrote:

> collections.namedtuple generates a new class using exec, and records the source
> code for the class as a _source attribute.
>
> Although it has a leading underscore, it is actually a public attribute. The
> leading underscore distinguishes it from a named field potentially
> called "source", e.g. namedtuple("klass", ['source', 'destination']).
>
> There is some discussion on Python-Dev about:
> - changing the way the namedtuple class is generated which may
>   change the _source attribute
> - or even dropping it altogether
> in order to speed up namedtuple and reduce Python's startup time.
>
> Is there anyone here who uses the namedtuple _source attribute?


Speaking for myself: no I do not.

Cheers,
Cameron Simpson 
