Re: python math problem

2013-02-15 Thread John Machin
On Feb 16, 6:39 am, Kene Meniru kene.men...@illom.org wrote:
x = (math.sin(math.radians(angle)) * length)
y = (math.cos(math.radians(angle)) * length)
A suggestion about coding style:
from math import sin, cos, radians
# etc etc
x = sin(radians(angle)) * length
y = cos(radians(angle)) *
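The style suggestion above can be sketched as a small runnable function. The name polar_offset and the sample values are hypothetical, chosen only for illustration; the sin-for-x, cos-for-y convention is taken from the quoted code.

```python
from math import sin, cos, radians, isclose

def polar_offset(angle_deg, length):
    """Return (x, y) using the sin-for-x, cos-for-y convention quoted above."""
    a = radians(angle_deg)          # convert once, reuse twice
    return sin(a) * length, cos(a) * length

# Hypothetical sample values: 90 degrees, length 2.0
x, y = polar_offset(90, 2.0)
print(x, y)
```

Importing the names directly keeps the formulas short; whether that is preferable to the math. prefix is a matter of taste.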

[issue13899] re pattern r"[\A]" should work like "A" but matches nothing. Ditto B and Z.

2012-01-29 Thread John Machin
John Machin sjmac...@lexicon.net added the comment: @Ezio: Comparison of the behaviour of \letter inside/outside character classes is irrelevant. The rules for inside can be expressed simply as: 1. Letters dDsSwW are special; they represent categories as documented, and do in fact have

[issue13899] re pattern r"[\A]" should work like "A" but matches nothing. Ditto B and Z.

2012-01-29 Thread John Machin
John Machin sjmac...@lexicon.net added the comment: Whoops: "normal Python rules for backslash escapes" should have had a note "but revert to the C behaviour of stripping the \ from unrecognised escapes", which is what re appears to do in its own \ handling

[issue13899] re pattern r"[\A]" should work like "A" but matches nothing. Ditto B and Z.

2012-01-28 Thread John Machin
New submission from John Machin sjmac...@lexicon.net: Expected behaviour illustrated using C:
>>> import re
>>> re.findall(r'[\C]', 'CCC')
['C', 'C', 'C']
>>> re.compile(r'[\C]', 128)
literal 67
<_sre.SRE_Pattern object at 0x01FC6E78>
>>> re.compile(r'C', 128)
literal 67
<_sre.SRE_Pattern object at 0x01FC6F08>

[issue13899] re pattern r"[\A]" should work like "A" but matches nothing. Ditto B and Z.

2012-01-28 Thread John Machin
John Machin sjmac...@lexicon.net added the comment: @ezio: Of course the context is inside a character class. I expect r'[\b]' to act like r'\b' aka r'\x08' aka backspace because (1) that is the treatment applied to all other C-like control char escapes (2) the docs say so explicitly: Inside
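The backspace treatment described above can be checked directly. This is a minimal sketch using only the standard re module; the sample strings are made up for illustration.

```python
import re

# Inside a character class, \b means the backspace character (0x08).
hits = re.findall(r'[\b]', 'a\bb')          # the subject holds one backspace

# Outside a character class, \b is the zero-width word-boundary assertion.
boundary_hits = re.findall(r'\bcat\b', 'cat concat cat')

print(hits, boundary_hits)
```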

[issue13782] xml.etree.ElementTree: Element.append doesn't type-check its argument

2012-01-13 Thread John Machin
New submission from John Machin sjmac...@lexicon.net: import xml.etree.ElementTree as et node = et.Element('x') node.append(not_an_Element_instance) 2.7 and 3.2 produce no complaint at all. 2.6 and 3.1 produce an AssertionError. However cElementTree in all 4 versions produces a TypeError
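For comparison, a quick probe of the same call on a current interpreter. This is only a sketch: the report above concerns 2.6 to 3.2, and the outcome shown in the comment is what I would expect from recent CPython 3 releases, not something the report itself states.

```python
import xml.etree.ElementTree as et

node = et.Element('x')
try:
    node.append('not an Element instance')   # a str, deliberately wrong
except TypeError:
    outcome = 'TypeError'                    # expected on recent CPython 3
else:
    outcome = 'accepted silently'

print(outcome)
```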

Re: unicode by default

2011-05-12 Thread John Machin
On Thu, May 12, 2011 4:31 pm, harrismh777 wrote: So, the UTF-16 UTF-32 is INTERNAL only, for Python
NO. See one of my previous messages. UTF-16 and UTF-32, like UTF-8, are encodings for the EXTERNAL representation of Unicode characters in byte streams.
I also was not aware that UTF-8 chars

Re: unicode by default

2011-05-11 Thread John Machin
On Thu, May 12, 2011 8:51 am, harrismh777 wrote: Is it true that if I am working without using bytes sequences that I will not need to care about the encoding anyway, unless of course I need to specify a unicode code point?
Quite the contrary. (1) You cannot work without using bytes

Re: urllib2 request with binary file as payload

2011-05-11 Thread John Machin
On Thu, May 12, 2011 10:20 am, Michiel Sikma wrote: Hi there, I made a small script implementing a part of Youtube's API that allows you to upload videos. It's pretty straightforward and uses urllib2. The script was written for Python 2.6, but the server I'm going to use it on only has 2.5

Re: unicode by default

2011-05-11 Thread John Machin
On Thu, May 12, 2011 11:22 am, harrismh777 wrote: John Machin wrote: (1) You cannot work without using bytes sequences. Files are byte sequences. Web communication is in bytes. You need to (know / assume / be able to extract / guess) the input encoding. You need to encode your output using

Re: unicode by default

2011-05-11 Thread John Machin
On Thu, May 12, 2011 1:44 pm, harrismh777 wrote: By default it looks like Python3 is writing output with UTF-8 as default... and I thought that by default Python3 was using either UTF-16 or UTF-32. So, I'm confused here... also, I used the character sequence \u00A3 which I thought was

Re: unicode by default

2011-05-11 Thread John Machin
On Thu, May 12, 2011 2:14 pm, Benjamin Kaplan wrote: If the file you're writing to doesn't specify an encoding, Python will default to locale.getdefaultencoding(),
No such attribute. Perhaps you mean locale.getpreferredencoding()
-- http://mail.python.org/mailman/listinfo/python-list
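The function named in the correction can be called as below. A minimal sketch: the value returned depends on the platform and locale settings, so no particular encoding name is assumed.

```python
import locale

# False = do not call setlocale() as a side effect; just report the setting.
preferred = locale.getpreferredencoding(False)

print(preferred)   # platform dependent, for example 'UTF-8' or 'cp1252'
```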

codecs.open() doesn't handle platform-specific line terminator

2011-05-09 Thread John Machin
According to the 3.2 docs (http://docs.python.org/py3k/library/codecs.html#codecs.open), Files are always opened in binary mode, even if no binary mode was specified. This is done to avoid data loss due to encodings using 8-bit values. This means that no automatic conversion of b'\n' is done on
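The documented behaviour can be observed with a temporary file. This is a sketch under one assumption worth stating: newer Python releases mark codecs.open() as deprecated, so the warning filter below is there only to keep the demonstration quiet.

```python
import codecs
import os
import tempfile
import warnings

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with warnings.catch_warnings():
        # codecs.open() may emit a DeprecationWarning on newer Pythons.
        warnings.simplefilter('ignore', DeprecationWarning)
        with codecs.open(path, 'w', encoding='utf-8') as f:
            f.write('one\ntwo\n')
    with open(path, 'rb') as f:
        raw = f.read()               # look at the bytes actually written
finally:
    os.remove(path)

print(raw)   # '\n' is written as-is; no platform line terminator is substituted
```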

Re: codec for UTF-8 with BOM

2011-05-02 Thread John Machin
On Monday, 2 May 2011 19:47:45 UTC+10, Chris Rebert wrote: On Mon, May 2, 2011 at 1:34 AM, Ulrich Eckhardt ulrich@dominolaser.com wrote: The correct name, as you found below and as is corroborated by the webpage, seems to be utf_8_sig: u'FOøbar'.encode('utf_8_sig')
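A short Python 3 sketch of the codec named above, reusing the sample string from the quoted post. Encoding prepends the three-byte UTF-8 BOM; decoding with the same codec strips it again.

```python
data = 'FO\xf8bar'.encode('utf_8_sig')     # 'FOøbar' with a leading BOM
round_trip = data.decode('utf_8_sig')      # the BOM is removed on the way back

print(data)
print(round_trip)
```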

Re: Snowball to Python compiler

2011-04-21 Thread John Machin
On Friday, April 22, 2011 8:05:37 AM UTC+10, Matt Chaput wrote: I'm looking for some code that will take a Snowball program and compile it into a Python script. Or, less ideally, a Snowball interpreter written in Python. (http://snowball.tartarus.org/) If anyone has done such things

[issue7198] Extraneous newlines with csv.writer on Windows

2011-03-19 Thread John Machin
John Machin sjmac...@lexicon.net added the comment: Can somebody please review my doc patch submitted 2 months ago? -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue7198

[issue7198] Extraneous newlines with csv.writer on Windows

2011-03-19 Thread John Machin
John Machin sjmac...@lexicon.net added the comment: Skip, The changes that I suggested have NOT been made. Please re-read the doc page you pointed to. The writer paragraph does NOT mention that newline='' is required when writing. The writer examples do NOT include newline=''. The examples

[issue10954] No warning for csv.writer API change

2011-03-19 Thread John Machin
John Machin sjmac...@lexicon.net added the comment: The doc patch proposed by Skip on 2001-01-24 for this bug has NOT been reviewed, let alone applied. Sibling bug #7198 has been closed in error. Somebody please help. -- nosy: +skip.montanaro

[issue10954] No warning for csv.writer API change

2011-03-19 Thread John Machin
John Machin sjmac...@lexicon.net added the comment: Terry, I have already made the point the docs bug is #7198. This is the meaningful-exception bug. My review is changing 'should' to 'must' is not very useful without a consistent interpretation of what those two words mean and without any

Re: getting text out of an xml string

2011-03-05 Thread John Machin
On Mar 5, 8:57 am, JT jeff.temp...@gmail.com wrote: On Mar 4, 9:30 pm, John Machin sjmac...@lexicon.net wrote: Your data has been FUABARred (the first A being for Almost) -- the \u3c00 and \u3e00 were once < and > respectively. You will
Hi John, I realized that a few minutes after

Re: getting text out of an xml string

2011-03-04 Thread John Machin
On Mar 5, 6:53 am, JT jeff.temp...@gmail.com wrote: Yo,  So I have almost convinced a small program to do what I want it to do.  One thing remains (at least, one thing I know of at the moment): I am converting xml to some other format, and there are strings in the xml like this. The

Re: 2to3 chokes on bad character

2011-02-24 Thread John Machin
On Feb 23, 7:47 pm, Frank Millman fr...@chagford.com wrote: Hi all I don't know if this counts as a bug in 2to3.py, but when I ran it on my program directory it crashed, with a traceback but without any indication of which file caused the problem. [traceback snipped] UnicodeDecodeError:

Re: 2to3 chokes on bad character

2011-02-24 Thread John Machin
On Feb 25, 12:00 am, Peter Otten __pete...@web.de wrote: John Machin wrote: Your Python 2.x code should be TESTED before you poke 2to3 at it. In this case just trying to run or import the offending code file would have given an informative syntax error (you have declared the .py file

Re: py3k: converting int to bytes

2011-02-24 Thread John Machin
On Feb 25, 4:39 am, Terry Reedy wrote: Note: an as yet undocumented feature of bytes (at least in Py3) is that bytes(count) == bytes()*count == b'\x00'*count. Python 3.1.3 docs for bytes() say same constructor args as for bytearray(); this says about the source parameter: If it is an integer,
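The integer-argument behaviour discussed above, next to two other int-to-bytes conversions, as a Python 3 sketch. The sample numbers are arbitrary, and int.to_bytes is my addition here (it arrived in Python 3.2), not something the thread snippet mentions.

```python
zeros = bytes(3)                        # an int argument gives that many zero bytes
single = bytes([65])                    # a one-element iterable gives one byte
two_bytes = (1025).to_bytes(2, 'big')   # explicit width and byte order

print(zeros, single, two_bytes)
```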

[issue11204] re module: strange behaviour of space inside {m, n}

2011-02-12 Thread John Machin
New submission from John Machin sjmac...@lexicon.net: A pattern like r"b{1,3}\Z" matches "b", "bb", and "bbb", as expected. There is no documentation of the behaviour of r"b{1, 3}\Z" -- it matches the LITERAL TEXT "b{1, 3}" in normal mode and "b{1,3}" in verbose mode. # paste the following
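A small probe for the two spellings. It is a sketch only: the first result is standard quantifier behaviour, while the second line simply prints whatever the running interpreter does with the spaced form, since that is the undocumented part.

```python
import re

candidates = ('b', 'bb', 'bbb', 'bbbb')
tight = [s for s in candidates if re.match(r'b{1,3}\Z', s)]

# The spaced form is the case under discussion; inspect rather than assume.
spaced = re.match(r'b{1, 3}\Z', 'b{1, 3}')

print(tight)
print('spaced form matched the literal text:', spaced is not None)
```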

Re: python crash problem

2011-02-05 Thread John Machin
On Feb 3, 8:21 am, Terry Reedy tjre...@udel.edu wrote: On 2/2/2011 2:19 PM, Yelena wrote: . When having a problem with a 3rd party module, not part of the stdlib, you should give a source.    http://sourceforge.net/projects/dbfpy/ This appears to be a compiled extension. Nearly always, when

[issue10954] No warning for csv.writer API change

2011-01-23 Thread John Machin
John Machin sjmac...@lexicon.net added the comment: Skip, the docs bug is #7198. This is the meaningful-exception bug.

[issue10954] No warning for csv.writer API change

2011-01-22 Thread John Machin
John Machin sjmac...@lexicon.net added the comment: I don't understand Changing csv api is a feature request that could only happen in 3.3. This is NOT a request for an API change. Lennert's point is that an API change was made in 3.0 as compared with 2.6 but there is no fixer in 2to3. What

[issue10954] No warning for csv.writer API change

2011-01-20 Thread John Machin
John Machin sjmac...@lexicon.net added the comment: I believe that both csv.reader and csv.writer should fail with a meaningful message if mode is binary or newline is not ''

[issue7198] Extraneous newlines with csv.writer on Windows

2011-01-19 Thread John Machin
John Machin sjmac...@lexicon.net added the comment: docpatch for 3.x csv docs: In the csv.writer docs, insert the sentence If csvfile is a file object, it should be opened with newline=''. immediately after the sentence csvfile can be any object with a write() method. In the closely
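What the proposed doc sentence asks for looks like this in Python 3. A minimal sketch with a temporary file and made-up rows; with newline='' the writer's own '\r\n' terminator reaches the file untouched on every platform.

```python
import csv
import os
import tempfile

fd, path = tempfile.mkstemp(suffix='.csv')
os.close(fd)
try:
    # newline='' stops the text layer from translating the writer's '\r\n'.
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows([['a', 'b'], [1, 2]])
    with open(path, 'rb') as f:
        raw = f.read()
finally:
    os.remove(path)

print(raw)
```

Without newline='', Windows would turn each '\r\n' into '\r\r\n', which is the "extraneous newlines" symptom in the issue title.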

Re: Interesting bug

2011-01-03 Thread John Machin
On Jan 2, 12:22 am, Daniel Fetchinson fetchin...@googlemail.com wrote: An AI bot is playing a trick on us. Yes, it appears that the mystery is solved: Mark V. Shaney is alive and well and living in Bangalore :-)

[issue7198] Extraneous newlines with csv.writer on Windows

2010-12-26 Thread John Machin
John Machin sjmac...@users.sourceforge.net added the comment: Skip, I'm WRITING, not reading.. Please read the 3.1 documentation for csv.writer. It does NOT mention newline='', and neither does the example. Please fix. Other problems with the examples: (1) They encourage a bad habit (open

[issue7198] Extraneous newlines with csv.writer on Windows

2010-12-23 Thread John Machin
John Machin sjmac...@users.sourceforge.net added the comment: Please re-open this. The binary/text mode problem still exists with Python 3.X on Windows. Quite simply, there is no option available to the caller to open the output file in binary mode, because the module is throwing str objects

Re: Modifying an existing excel spreadsheet

2010-12-22 Thread John Machin
On Dec 21, 8:56 am, Ed Keith e_...@yahoo.com wrote: I have a user supplied 'template' Excel spreadsheet. I need to create a new excel spreadsheet based on the supplied template, with data filled in. I found the tools here http://www.python-excel.org/,

Re: Ensuring symmetry in difflib.SequenceMatcher

2010-11-24 Thread John Machin
On Nov 24, 8:43 pm, Peter Otten __pete...@web.de wrote: John Yeung wrote: I'm generally pleased with difflib.SequenceMatcher:  It's probably not the best available string matcher out there, but it's in the standard library and I've seen worse in the wild.  One thing that kind of bothers

Re: Raw Unicode docstring

2010-11-16 Thread John Machin
On Nov 17, 9:34 am, Alexander Kapps alex.ka...@web.de wrote: urScheißt\nderBär\nim Wald?
Not without a permit from the Environmental Protection Department.

Re: A bug for raw string literals in Py3k?

2010-10-31 Thread John Machin
On Oct 31, 11:23 pm, Yingjie Lan lany...@yahoo.com wrote: So I suppose this is a bug? It's not, see http://docs.python.org/py3k/reference/lexical_analysis.html#literals # Specifically, a raw string cannot end in a single backslash Thanks! That looks weird to me ... doesn't this
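The rule quoted from the language reference can be demonstrated without crashing the script by compiling source text at run time. A sketch: the helper name compiles and the sample path are hypothetical.

```python
def compiles(source):
    """Return True if `source` is a valid Python expression."""
    try:
        compile(source, '<example>', 'eval')
    except SyntaxError:
        return False
    return True

bad = compiles("r'C:\\temp\\'")          # source text is  r'C:\temp\'
good = compiles("r'C:\\temp' '\\\\'")    # source text is  r'C:\temp' '\\'

# A common workaround: append the final backslash as a separate literal.
value = r'C:\temp' '\\'
print(bad, good, value)
```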

Re: Runtime error

2010-10-29 Thread John Machin
On Oct 29, 3:26 am, Sebastian python-maill...@elygor.de wrote: Hi all, I am new to python and I don't know how to fix this error. I only try to execute python (or a cgi script) and I get an output like [...] 'import site' failed; traceback: Traceback (most recent call last): File

Re: Get alternative char name with unicodedata.name() if no formal one defined

2010-10-14 Thread John Machin
On Oct 14, 7:25 pm, Dirk Wallenstein hals...@t-online.de wrote: Hi, I'd like to get control char names for the first 32 codepoints, but they apparently only have an alias and no official name. Is there a way to get the alternative character name (alias) in Python? AFAIK there is no
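The missing formal names can at least be handled gracefully through the default argument of unicodedata.name(). A sketch: the helper name_or_placeholder and its placeholder text are hypothetical, and this does not retrieve the aliases the poster asked about.

```python
import unicodedata

def name_or_placeholder(ch):
    """Return the formal name, or a code-point placeholder if none exists."""
    return unicodedata.name(ch, 'U+%04X (no formal name)' % ord(ch))

print(name_or_placeholder('\x08'))   # a control character
print(name_or_placeholder('A'))
```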

Re: Wrong default endianess in utf-16 and utf-32 !?

2010-10-12 Thread John Machin
jmfauth wxjmfauth at gmail.com writes: When an endianess is not specified, (BE, LE, unmarked forms), the Unicode Consortium specifies, the default byte serialization should be big-endian. See http://www.unicode.org/faq//utf_bom.html Q: Which of the UTFs do I need to support? and Q: Why

cp936 uses gbk codec, doesn't decode `\x80` as U+20AC EURO SIGN

2010-10-10 Thread John Machin
| >>> '\x80'.decode('cp936')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 0: incomplete multibyte sequence
However: Retrieved 2010-10-10 from http://www.unicode.org/Public

[issue9980] str(float) failure

2010-09-29 Thread John Machin
Changes by John Machin sjmac...@users.sourceforge.net: -- nosy: +sjmachin

strange results from sys.version

2010-09-27 Thread John Machin
I am trying to help a user of my xlrd package who says he is getting anomalous results on his work computer but not on his home computer. Attempts to reproduce his alleged problem in a verifiable manner on his work computer have failed, so far ... the only meaningful difference in script output

Re: Detect string has non-ASCII chars without checking each char?

2010-08-22 Thread John Machin
On Aug 22, 5:07 pm, Michel Claveau - MVPenleverlesx_xx...@xmclavxeaux.com.invalid wrote: Hi! Another way :   # -*- coding: utf-8 -*-   import unicodedata   def test_ascii(struni):       strasc=unicodedata.normalize('NFD', struni).encode('ascii','replace')       if

Re: Detect string has non-ASCII chars without checking each char?

2010-08-22 Thread John Machin
On Aug 23, 1:10 am, Michel Claveau - MVPenleverlesx_xx...@xmclavxeaux.com.invalid wrote: Re ! Try your code with u'abcd\xa1' ... it says it's ASCII. Ah? On my computer, it says False.
Perhaps your computer has a problem. Mine does this with both Python 2.7 and Python 2.3 (which introduced the
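One way to answer the thread's question without an explicit per-character loop in Python code is to let the ASCII codec do the checking. A Python 3 sketch; the function name is_ascii is hypothetical, and this is not necessarily the approach the thread settled on.

```python
def is_ascii(text):
    """Return True if every character of `text` is in the ASCII range."""
    try:
        text.encode('ascii')       # strict mode: fails on the first non-ASCII char
    except UnicodeEncodeError:
        return False
    return True

print(is_ascii('abcd'), is_ascii('abcd\xa1'))
```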

Re: re.sub and variables

2010-08-12 Thread John Machin
On Aug 13, 7:33 am, fuglyducky fuglydu...@gmail.com wrote: On Aug 12, 2:06 pm, fuglyducky fuglydu...@gmail.com wrote: I have a function that I am attempting to call from another file. I am attempting to replace a string using re.sub with another string. The problem is that the second

Re: Ascii to Unicode.

2010-07-30 Thread John Machin
On Jul 30, 4:18 am, Carey Tilden carey.til...@gmail.com wrote: In this case, you've been able to determine the correct encoding (latin-1) for those errant bytes, so the file itself is thus known to be in that encoding. The most probably correct encoding is, as already stated, and agreed by the

Re: Where is the help page for re.MatchObject?

2010-07-28 Thread John Machin
On Jul 28, 1:26 pm, Peng Yu pengyu...@gmail.com wrote: I know the library reference webpage for re.MatchObject is at http://docs.python.org/library/re.html#re.MatchObject But I don't find such a help page in python help(). Does anybody know how to get it in help()? Yes, but it doesn't tell

Re: Ascii to Unicode.

2010-07-28 Thread John Machin
On Jul 29, 4:32 am, Joe Goldthwaite j...@goldthwaites.com wrote: Hi, I've got an Ascii file with some latin characters. Specifically \xe1 and \xfc.  I'm trying to import it into a Postgresql database that's running in Unicode mode. The Unicode converter chokes on those two characters. I
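The two bytes mentioned above are not ASCII at all, so the file has to be decoded with a real encoding first. A sketch assuming Latin-1, which is a guess on my part for this snippet; the source bytes must be decoded with whatever encoding actually produced them.

```python
raw = b'\xe1\xfc'                    # the two troublesome bytes from the post

text = raw.decode('latin-1')         # assumption: the file is Latin-1
utf8 = text.encode('utf-8')          # what a UTF-8 database client would expect

print(text, utf8)
```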

Re: newb

2010-07-27 Thread John Machin
On Jul 27, 9:07 pm, whitey m...@here.com wrote: hi all. am totally new to python and was wondering if there are any newsgroups that are there specifically for beginners. i have bought a book for $2 called learn to program using python by alan gauld. starting to read it but it was written in

Re: Unicode error

2010-07-24 Thread John Machin
dirknbr dirknbr at gmail.com writes: I have kind of developed this but obviously it's not nice, any better ideas?
try:
    text=texts[i]
    text=text.encode('latin-1')
    text=text.encode('utf-8')
except:
    text=' '
As Steven has

Re: SyntaxError not honoured in list comprehension?

2010-07-04 Thread John Machin
On Jul 5, 1:08 am, Thomas Jollans tho...@jollans.com wrote: On 07/04/2010 03:49 PM, jmfauth wrote:
  File "<psi last command>", line 1
    print9.0
           ^
SyntaxError: invalid syntax
somewhat strange, yes.
There are two tokens, print9 (a name) and .0 (a float constant) -- looks like

Re: Python 2.7 released

2010-07-04 Thread John Machin
On Jul 5, 12:27 pm, Martineau ggrp2.20.martin...@dfgh.net wrote: On Jul 4, 8:34 am, Benjamin Peterson benja...@python.org wrote: On behalf of the Python development team, I'm jocund to announce the second release candidate of Python 2.7. Python 2.7 will be the last major version in the

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2010-07-03 Thread John Machin
John Machin sjmac...@users.sourceforge.net added the comment: About the E0 80 81 61 problem: my interpretation is that you are correct, the 80 is not valid in the current state (start byte == E0), so no look-ahead, three FFFDs must be issued followed by 0061. I don't really care about issuing

Re: escape character / csv module

2010-07-02 Thread John Machin
On Jul 2, 6:04 am, MRAB pyt...@mrabarnett.plus.com wrote: The csv module imports from _csv, which suggests to me that there's code written in C which thinks that the \x00 is a NUL terminator, so it's a bug, although it's very unusual to want to write characters like \x00 to a CSV file, and I

Re: Handling text lines from files with some (few) starnge chars

2010-06-05 Thread John Machin
On Jun 6, 12:14 pm, MRAB pyt...@mrabarnett.plus.com wrote: Paulo da Silva wrote: Em 06-06-2010 00:41, Chris Rebert escreveu: On Sat, Jun 5, 2010 at 4:03 PM, Paulo da Silva psdasilva.nos...@netcabonospam.pt wrote: ... Specify the encoding of the text when opening the file using the

Re: signed vs unsigned int

2010-06-02 Thread John Machin
On Jun 2, 4:43 pm, johnty johntyw...@gmail.com wrote: i'm reading bytes from a serial port, and storing it into an array. each byte represents a signed 8-bit int. currently, the code i'm looking at converts them to an unsigned int by doing ord(array[i]). however, what i'd like is to get the
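Two ways to reinterpret an unsigned byte value as a signed 8-bit integer, sketched in Python 3 (where iterating over bytes already yields ints, so ord() is not needed). The helper name to_signed8 and the sample bytes are hypothetical.

```python
import struct

def to_signed8(value):
    """Map 0..255 to -128..127 (two's complement reinterpretation)."""
    return (value ^ 0x80) - 0x80

raw = b'\x7f\x80\xff'                          # sample bytes, as from a serial port

arith = [to_signed8(b) for b in raw]           # pure arithmetic
unpacked = list(struct.unpack('3b', raw))      # 'b' is the signed-char format

print(arith, unpacked)
```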

Re: expat parsing error

2010-06-01 Thread John Machin
On Jun 2, 1:57 am, kak...@gmail.com kak...@gmail.com wrote: On Jun 1, 11:12 am, kak...@gmail.com kak...@gmail.com wrote: On Jun 1, 11:09 am, John Bokma j...@castleamber.com wrote: kak...@gmail.com kak...@gmail.com writes: On Jun 1, 10:34 am, Stefan Behnel stefan...@behnel.de wrote:

Re: Help with Regexp, \b

2010-05-31 Thread John Machin
On May 30, 1:30 am, andrew cooke and...@acooke.org wrote: That's what I thought it did...  Then I read the docs and confused empty string with space(!) and convinced myself otherwise.  I think I am going senile. Not necessarily. Conflating concepts like string containing whitespace, string

Re: UnicodeDecodeError having fetch web page

2010-05-26 Thread John Machin
Rob Williscroft rtw at rtw.me.uk writes: Barry wrote in news:83dc485a-5a20-403b-99ee-c8c627bdbab3 @m21g2000vbr.googlegroups.com in gmane.comp.python.general: UnicodeDecodeError: 'utf8' codec can't decode byte 0x8b in position 1: unexpected code byte It may not be you,

Re: help need to write a python spell checker

2010-05-18 Thread John Machin
On May 19, 1:37 pm, Steven D'Aprano steve-REMOVE- t...@cybersource.com.au wrote: On Wed, 19 May 2010 13:01:10 +1000, Nigel Rowe wrote: I'm happy to do your homework for you, cost is US$1000 per hour. Email to your professor automatically on receipt. I'll do it for $700 an hour! he could

Re: Puzzled by code pages

2010-05-15 Thread John Machin
Adam Tauno Williams awilliam at whitemice.org writes: On Fri, 2010-05-14 at 20:27 -0400, Adam Tauno Williams wrote: I'm trying to process OpenStep plist files in Python. I have a parser which works, but only for strict ASCII. However plist files may contain accented characters -

Re: Fastest way to calculate leading whitespace

2010-05-09 Thread John Machin
dasacc22 dasacc22 at gmail.com writes: You presume entirely too much. I have a preprocessor that normalizes documents while performing other more complex operations. There's nothing buggy about what I'm doing.
Are you sure? Your solution calculates (the number of leading whitespace characters)
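For reference, one simple way to count leading whitespace characters. This is a sketch, not the poster's solution and not necessarily the fastest; the function name leading_whitespace is hypothetical.

```python
def leading_whitespace(line):
    """Return the number of whitespace characters at the start of `line`."""
    return len(line) - len(line.lstrip())

print(leading_whitespace('  \tx = 1 '))
```

Note that this counts characters, not columns: a tab counts as one.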

Re: How to get xml.etree.ElementTree not bomb on invalid characters in XML file ?

2010-05-04 Thread John Machin
On May 5, 12:11 am, Barak, Ron ron.ba...@lsi.com wrote: -Original Message- From: Stefan Behnel [mailto:stefan...@behnel.de] Sent: Tuesday, May 04, 2010 10:24 AM To: python-l...@python.org Subject: Re: How to get xml.etree.ElementTree not bomb on invalid characters in XML file ?

Re: How to get xml.etree.ElementTree not bomb on invalid characters in XML file ?

2010-05-04 Thread John Machin
On May 5, 3:43 am, Terry Reedy tjre...@udel.edu wrote: On 5/4/2010 11:37 AM, Stefan Behnel wrote: Barak, Ron, 04.05.2010 16:11: The XML file seems to be valid XML (all XML viewers I tried were able to read it).  From Internet Explorer: The XML page cannot be displayed Cannot view XML

Re: condition and True or False

2010-05-02 Thread John Machin
On May 3, 9:14 am, Steven D'Aprano st...@remove-this- cybersource.com.au wrote: If it is any arbitrary object, then x and True or False is just an obfuscated way of writing bool(x). Perhaps their code predates the introduction of bools, and they have defined global constants True and False
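The equivalence stated above can be checked over a handful of values. A minimal sketch; the sample list is arbitrary.

```python
samples = [0, 1, '', 'a', [], [0], None, 0.0]

old_style = [x and True or False for x in samples]   # the obfuscated idiom
new_style = [bool(x) for x in samples]               # the direct spelling

print(old_style == new_style)
```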

Re: csv.py sucks for Decimal

2010-04-25 Thread John Machin
On Apr 23, 9:23 am, Phlip phlip2...@gmail.com wrote: When I use the CSV library, with QUOTE_NONNUMERIC, and when I pass in a Decimal() object, I must convert it to a string. Why must you? What unwanted effect do you observe when you don't convert it? the search for an alternate CSV module,

[issue8308] raw_bytes.decode('cp932') -- spurious mappings

2010-04-04 Thread John Machin
John Machin sjmac...@users.sourceforge.net added the comment: Thanks, Martin. Issue closed as far as I'm concerned.

[issue8308] raw_bytes.decode('cp932') -- spurious mappings

2010-04-03 Thread John Machin
New submission from John Machin sjmac...@users.sourceforge.net: According to the following references, the bytes 80, A0, FD, FE, and FF are not defined in cp932: http://msdn.microsoft.com/en-au/goglobal/cc305152.aspx http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT http

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2010-04-01 Thread John Machin
John Machin sjmac...@users.sourceforge.net added the comment: @ezio.melotti: Your second sentence is true, but it is not the whole truth. Bytes in the range C0-FF (whose high bit *is* set) ALSO shouldn't be considered part of the sequence because they (like 00-7F) are invalid as continuation

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2010-04-01 Thread John Machin
John Machin sjmac...@users.sourceforge.net added the comment: #ezio.melotti: I'm considering valid all the bytes that start with '10...' Sorry, WRONG. Read what I wrote: Further, some bytes in the range 80-BF are NOT always valid as the first continuation byte, it depends on what starter byte

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2010-04-01 Thread John Machin
John Machin sjmac...@users.sourceforge.net added the comment: Unicode has been frozen at 0x10FFFF. That's it. There is no such thing as a valid 5-byte or 6-byte UTF-8 string.

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2010-04-01 Thread John Machin
John Machin sjmac...@users.sourceforge.net added the comment: @lemburg: RFC 2279 was obsoleted by RFC 3629 over 6 years ago. The standard now says 21 bits is it. F5-FF are declared to be invalid. I don't understand what you mean by supporting those possibilities. The code is correctly issuing

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2010-04-01 Thread John Machin
John Machin sjmac...@users.sourceforge.net added the comment: Patch review: Preamble: pardon my ignorance of how the codebase works, but trunk unicodeobject.c is r79494 (and allows encoding of surrogate codepoints), py3k unicodeobject.c is r79506 (and bans the surrogate caper) and I can't

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2010-04-01 Thread John Machin
John Machin sjmac...@users.sourceforge.net added the comment: Chapter 3, page 94: As a consequence of the well-formedness conditions specified in Table 3-7, the following byte values are disallowed in UTF-8: C0–C1, F5–FF Of course they should be handled by the simple expedient of setting
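The disallowed byte values quoted from the standard can be checked against a strict decoder. A Python 3 sketch; each candidate byte is followed by a continuation byte so that truncation is not the reason for any failure.

```python
disallowed = [0xC0, 0xC1] + list(range(0xF5, 0x100))

rejected = []
for value in disallowed:
    try:
        (bytes([value]) + b'\x80').decode('utf-8')   # strict error handling
    except UnicodeDecodeError:
        rejected.append(value)

print(len(rejected), 'of', len(disallowed), 'rejected')
```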

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2010-04-01 Thread John Machin
John Machin sjmac...@users.sourceforge.net added the comment: @lemburg: perhaps applying the same logic as for the other sequences is a better strategy What other sequences??? F5-FF are invalid bytes; they don't start valid sequences. What same logic?? At the start of a character, they should

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2010-03-31 Thread John Machin
John Machin sjmac...@users.sourceforge.net added the comment: @lemburg: failing byte seems rather obvious: first byte that you meet that is not valid in the current state. I don't understand your explanation, especially does not have the high bit set. I think you mean is a valid starter byte

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2010-03-30 Thread John Machin
New submission from John Machin sjmac...@users.sourceforge.net: Unicode 5.2.0 chapter 3 (Conformance) has a new section (headed Constraints on Conversion Processes) after requirement D93. Recent Pythons e.g. 3.1.2 don't comply. Using the Unicode example: print(ascii(b"\xc2\x41\x42".decode
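The Unicode example from the report, run as a complete statement. A sketch: the result in the test reflects what I understand recent CPython 3 releases to produce (one replacement character for the lone C2, with the following ASCII bytes preserved); 3.1.2, the version named in the report, behaved differently.

```python
decoded = b'\xc2\x41\x42'.decode('utf-8', 'replace')

# ascii() makes the replacement character U+FFFD visible on any console.
print(ascii(decoded))
```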

Re: subtraction is giving me a syntax error

2010-03-15 Thread John Machin
On Mar 16, 5:43 am, Baptiste Carvello baptiste...@free.fr wrote: Joel Pendery a écrit : So I am trying to write a bit of code and a simple numerical subtraction y_diff = y_diff-H is giving me the error Syntaxerror: Non-ASCII character '\x96' in file on line 70, but no encoding
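Byte 0x96 is not a hyphen in any ASCII-compatible encoding. The sketch below assumes the file was saved as Windows-1252, where 0x96 is an en dash; that encoding is my assumption and is not stated in the snippet above.

```python
suspect = b'y_diff = y_diff\x96H'              # the line as bytes, with 0x96

as_cp1252 = suspect.decode('cp1252')           # assumption: Windows-1252 source
repaired = as_cp1252.replace('\u2013', '-')    # en dash -> ASCII hyphen-minus

print(ascii(as_cp1252))
print(repaired)
```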

Re: datelib pythonification

2010-02-21 Thread John Machin
On Feb 21, 12:37 pm, alex goretoy agore...@gmail.com wrote: hello all,     since I posted this last time, I've added a new function dates_diff and [SNIP] I'm rather unsure of the context of this posting ... I'm assuming that the subject datelib pythonification refers to trying to make datelib

Re: parsing an Excel formula with the re module

2010-01-14 Thread John Machin
On Jan 13, 7:15 pm, Paul McGuire pt...@austin.rr.com wrote: On Jan 5, 1:49 pm, Tim Chase python.l...@tim.thechases.com wrote: vsoler wrote: Hence, I need to parse Excel formulas. Can I do it by means only of re (regular expressions)? I know that for simple formulas such as =3*A7+5 it

Re: parsing an Excel formula with the re module

2010-01-14 Thread John Machin
On Jan 14, 2:05 pm, Gabriel Genellina gagsl-...@yahoo.com.ar wrote: En Wed, 13 Jan 2010 05:15:52 -0300, Paul McGuire pt...@austin.rr.com escribió: vsoler wrote: Hence, I need to parse Excel formulas. Can I do it by means only of re (regular expressions)? This might give the OP a

Re: parsing an Excel formula with the re module

2010-01-14 Thread John Machin
On Jan 15, 3:41 pm, Paul McGuire pt...@austin.rr.com wrote: I never represented that this parser would handle any and all Excel formulas!  But I should hope the basic structure of a pyparsing solution might help the OP add some of the other features you cited, if necessary. It's actually

Re: parsing an Excel formula with the re module

2010-01-12 Thread John Machin
On 12/01/2010 6:26 PM, Chris Withers wrote: John Machin wrote: The xlwt package (of which I am the maintainer) has a lexer and parser for a largish subset of the syntax ... see http://pypi.python.org/pypi/xlwt xlrd, no? A facility in xlrd to decompile Excel formula bytecode into a text

Re: What is built-in method sub

2010-01-11 Thread John Machin
On Jan 12, 7:30 am, Jeremy jlcon...@gmail.com wrote: On Jan 11, 1:15 pm, Diez B. Roggisch de...@nospam.web.de wrote: Jeremy schrieb: On Jan 11, 12:54 pm, Carl Banks pavlovevide...@gmail.com wrote: On Jan 11, 11:20 am, Jeremy jlcon...@gmail.com wrote: I just profiled one of my

Re: Porblem with xlutils/xlrd/xlwt

2010-01-10 Thread John Machin
On Jan 10, 8:51 pm, pp parul.pande...@gmail.com wrote: On Jan 9, 8:23 am, John Machin sjmac...@lexicon.net wrote: On Jan 9, 9:56 pm, pp parul.pande...@gmail.com wrote: On Jan 9, 3:52 am, Jon Clements jon...@googlemail.com wrote: On Jan 9, 10:44 am, pp parul.pande...@gmail.com wrote

Re: How to get many places of pi from Machin's Equation?

2010-01-09 Thread John Machin
On Jan 9, 10:31 pm, Richard D. Moores rdmoo...@gmail.com wrote: Machin's Equation is 4 arctan (1/5) - arctan(1/239) = pi/4 Using Python 3.1 and the math module:
>>> from math import atan, pi
>>> pi
3.141592653589793
>>> (4*atan(.2) - atan(1/239))*4
3.1415926535897936
>>> (4*atan(.2) -
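Floats stop at about 16 significant digits, so getting many places needs a different representation. The sketch below evaluates the same equation with scaled integers and the Gregory series for arctan; the function names and the choice of 10 guard digits are mine, and this is not necessarily the approach recommended in the thread.

```python
def arctan_inv(x, unity):
    """Return arctan(1/x) * unity, using integer arithmetic only."""
    total = term = unity // x
    x2 = x * x
    n = 3
    sign = -1
    while term:
        term //= x2                    # next odd power of 1/x
        total += sign * (term // n)    # alternating series: 1/x - 1/(3x^3) + ...
        n += 2
        sign = -sign
    return total

def machin_pi(digits, guard=10):
    """Return floor(pi * 10**digits) via pi/4 = 4*arctan(1/5) - arctan(1/239)."""
    unity = 10 ** (digits + guard)     # guard digits absorb truncation error
    scaled = 4 * (4 * arctan_inv(5, unity) - arctan_inv(239, unity))
    return scaled // 10 ** guard

print(machin_pi(30))
```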

Re: Porblem with xlutils/xlrd/xlwt

2010-01-09 Thread John Machin
On Jan 9, 9:56 pm, pp parul.pande...@gmail.com wrote: On Jan 9, 3:52 am, Jon Clements jon...@googlemail.com wrote: On Jan 9, 10:44 am, pp parul.pande...@gmail.com wrote: On Jan 9, 3:42 am, Jon Clements jon...@googlemail.com wrote: On Jan 9, 10:24 am, pp parul.pande...@gmail.com

Re: Astronomy--Programs to Compute Siderial Time?

2010-01-07 Thread John Machin
On Jan 7, 2:40 pm, W. eWatson wolftra...@invalid.com wrote: John Machin wrote: What you have been reading is the Internal maintenance specification (large font, near the top of the page) for the module. The xml file is the source of the docs, not meant to be user-legible. What

Re: How do I access what's in this module?

2010-01-07 Thread John Machin
On Jan 8, 12:21 pm, Fencer no.i.d...@want.mail.from.spammers.com wrote: Hello, look at this lxml documentation page: http://codespeak.net/lxml/api/index.html That's for getting details about an object once you know what object you need to use to do what. In the meantime, consider reading the

Re: How do I access what's in this module?

2010-01-07 Thread John Machin
On Jan 8, 2:45 pm, Fencer no.i.d...@want.mail.from.spammers.com wrote: On 2010-01-08 04:40, John Machin wrote: For example:
>>> from lxml.etree import ElementTree
>>> ElementTree.dump(None)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
lxml.etree

Re: TypeError

2010-01-06 Thread John Machin
On Jan 7, 3:29 am, MRAB pyt...@mrabarnett.plus.com wrote: Victor Subervi wrote: ValueError: unsupported format character '(' (0x28) at index 54       args = (unsupported format character '(' (0x28) at index 54,) Apparently that character is a file separator, which I presume is an

Re: 3 byte network ordered int, How To ?

2010-01-06 Thread John Machin
On Jan 7, 5:33 am, Matthew Barnett mrabarn...@mrabarnett.plus.com wrote: mudit tuli wrote: For a single byte, struct.pack('B',int) For two bytes, struct.pack('H',int) what if I want three bytes ? Four bytes and then discard the most-significant byte: struct.pack('I', int)[ : -1]
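For the network (big-endian) order named in the subject, the byte to discard is the first one of a '>I' pack, not the last. A sketch: the helper name pack_uint24_be is hypothetical, and the int.to_bytes alternative needs Python 3.2 or later.

```python
import struct

def pack_uint24_be(value):
    """Pack an unsigned integer into 3 bytes, most significant byte first."""
    if not 0 <= value < 1 << 24:
        raise ValueError('value does not fit in 3 bytes')
    # '>I' gives 4 big-endian bytes; the leading byte is zero here, so drop it.
    return struct.pack('>I', value)[1:]

packed = pack_uint24_be(0x010203)
alternative = (0x010203).to_bytes(3, 'big')

print(packed, alternative)
```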

Re: parsing an Excel formula with the re module

2010-01-06 Thread John Machin
On Jan 6, 6:54 am, vsoler vicente.so...@gmail.com wrote: On 5 ene, 20:21, vsoler vicente.so...@gmail.com wrote: On 5 ene, 20:05, Mensanator mensana...@aol.com wrote: On Jan 5, 12:35 pm, MRAB pyt...@mrabarnett.plus.com wrote: vsoler wrote: Hello, I am acessing an Excel

Re: TypeError

2010-01-06 Thread John Machin
On Jan 7, 11:14 am, John Machin sjmac...@lexicon.net wrote: On Jan 7, 3:29 am, MRAB pyt...@mrabarnett.plus.com wrote: Victor Subervi wrote: ValueError: unsupported format character '(' (0x28) at index 54       args = (unsupported format character '(' (0x28) at index 54,) Apparently

Re: Astronomy--Programs to Compute Siderial Time?

2010-01-06 Thread John Machin
On Jan 7, 11:40 am, W. eWatson wolftra...@invalid.com wrote: W. eWatson wrote: Is there a smallish Python library of basic astronomical functions? There are a number of large such libraries that are crammed with excessive functions not needed for common calculations. It looks like I've

Re: TypeError

2010-01-06 Thread John Machin
On Jan 7, 1:38 pm, Steve Holden st...@holdenweb.com wrote: John Machin wrote: [...] I note that in the code shown there are examples of building an SQL query where the table name is concocted at runtime via the % operator ... key phrases: bad database design (one table per store!), SQL

Re: Significant whitespace

2010-01-03 Thread John Machin
On Jan 2, 10:29 am, Roy Smith r...@panix.com wrote: To address your question more directly, here's a couple of ways Fortran treated whitespace which would surprise the current crop of Java/PHP/Python/Ruby programmers: 1) Line numbers (i.e. the things you could GOTO to) were in column 2-7

Re: creating ZIP files on the cheap

2009-12-23 Thread John Machin
On Dec 24, 7:34 am, samwyse samw...@gmail.com wrote: I've got an app that's creating Open Office docs; if you don't know, these are actually ZIP files with a different extension.  In my case, like many other people, I generating from boilerplate, so only one component (content.xml) of my ZIP

Re: dictionary with tuple keys

2009-12-15 Thread John Machin
Ben Finney ben+python at benfinney.id.au writes: In this case, I'll use ‘itertools.groupby’ to make a new sequence of keys and values, and then extract the keys and values actually wanted. Ah, yes, Zawinski revisited ... itertools.groupby is the new regex :-) Certainly it might be clearer
