Strange Python internal error

2011-09-16 Thread JKPeck
We have a user on Windows with Python 2.6 who gets this error message when 
executing an import statement.
from extension import Template, Syntax, processcmd 
SystemError: ..\Objects\listobject.c:169: bad argument to internal function

The module can be imported directly via
import extension
with no problem.  And large numbers of other users execute this same code with 
no problem.

Does anybody have a clue as to how this might arise?

TIA,
Jon Peck
-- 
http://mail.python.org/mailman/listinfo/python-list


Installing Python Apps on Mac Lion

2011-06-24 Thread JKPeck
The Lion version of the OS on the Mac comes with Python 2.7 installed, but it 
is in /System/Library/Frameworks/..., and this area is not writable by third 
party apps.

So is there a consensus on what apps that typically install under the Python 
site-packages directory should do in this situation?  Installing Python from 
python.org puts it in the writable area /Library/Frameworks/Python.framework.

So, what should a Python app installer do?
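
One possible approach, sketched here for Python 3 as an assumption and not as an agreed convention: test whether the interpreter's own site-packages directory is writable and, if it is not, fall back to the per-user site-packages directory from PEP 370. The function name choose_install_dir is hypothetical.

```python
import os
import site
import sysconfig


def choose_install_dir():
    """Return the interpreter's site-packages if writable, else the per-user one."""
    system_dir = sysconfig.get_paths()["purelib"]
    if os.access(system_dir, os.W_OK):
        return system_dir
    # Per-user site-packages (PEP 370), normally under the user's home directory.
    return site.getusersitepackages()


print(choose_install_dir())
```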

Thanks
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Use company name for module

2010-11-29 Thread JKPeck
On Nov 29, 1:41 pm, Chris Withers  wrote:
> On 12/11/2010 15:50, Robert Kern wrote:
>
>
>
> > On 11/12/10 8:12 AM, Micah Carrick wrote:
> >> My company is working on releasing some of our code as open-source python
> >> modules. I don't want my "foo" module conflicting with other modules
> >> called
> >> "foo" on PyPi or github or a user's system. Is there anything wrong,
> >> from a
> >> conventions standpoint, with having modules like company.foo and
> >> company.bar
> >> even if foo and bar are not necessarily related other than being
> >> released by us?
> >> I really don't like the cryptic module names or things like foo2 and
> >> the like.
>
> > Yes, using namespace packages. You need to use `distribute` in your
> > setup.py in order to accomplish this.
>
> >http://pypi.python.org/pypi/distribute/
> >http://packages.python.org/distribute/setuptools.html#namespace-packages
>
> ...or setuptools.
>
> ...or just pick a different naming scheme, the Pyramid guys have gone for:
>
> company_foo
>
> ...and I'm inclined to do the same.
>
> Chris
>
> --
> Simplistix - Content Management, Batch Processing & Python Consulting
>             -http://www.simplistix.co.uk

You might want to check with your company legal folks before adopting
a naming rule based on the company name.  Some companies whose names are
trademarked will not allow their name to be used in certain contexts,
possibly including this.

(I am not a lawyer!)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to Suppress Interactive Assignment to "_"

2010-01-01 Thread JKPeck
On Jan 1, 10:06 am, Peter Otten <__pete...@web.de> wrote:
> JKPeck wrote:
> > The gettext module uses the convention of defining a function named
> > "_" that maps text into its translation.
> > This conflicts with the automatic interactive interpreter assignment
> > of expressions to a variable with that same name.
>
> > While if you are careful, you can avoid that assignment while
> > debugging, and you can choose a different function for gettext, this
> > conflict is a nuisance.  I am looking for a way to suppress the
> > expression assignment to _ or to change the name of the variable
> > assigned to.  Is this possible?  Using Python 2.6.
>
> $ cat displayhook.py
> import sys
> import __builtin__ as builtin
>
> def displayhook(obj):
>     if obj is not None:
>         builtin._last_result = obj
>         print repr(obj)
>
> sys.displayhook = displayhook
> $ python -i displayhook.py
> >>> 42
> 42
> >>> _
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> NameError: name '_' is not defined
> >>> _last_result
> 42

Thanks.  It's just what I needed.
-Jon Peck
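
For readers on Python 3, the same idea can be sketched as follows. This is a minimal adaptation that assumes only that __builtin__ became builtins and print became a function; the name _last_result is taken from the quoted reply.

```python
import builtins
import sys


def displayhook(value):
    """Echo interactive results, storing them in _last_result instead of _."""
    if value is not None:
        builtins._last_result = value
        print(repr(value))


# Install the hook; the interactive prompt then leaves "_" alone,
# so a gettext-style _() function is not overwritten.
sys.displayhook = displayhook
```
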
-- 
http://mail.python.org/mailman/listinfo/python-list


How to Suppress Interactive Assignment to "_"

2010-01-01 Thread JKPeck
The gettext module uses the convention of defining a function named
"_" that maps text into its translation.
This conflicts with the automatic interactive interpreter assignment
of expressions to a variable with that same name.

While if you are careful, you can avoid that assignment while
debugging, and you can choose a different function for gettext, this
conflict is a nuisance.  I am looking for a way to suppress the
expression assignment to _ or to change the name of the variable
assigned to.  Is this possible?  Using Python 2.6.

TIA,
Jon Peck
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problems with gettext and msgfmt

2009-12-16 Thread JKPeck
On Dec 15, 9:12 pm, JKPeck  wrote:
> I'm using Python 2.6 on Windows and having trouble with the charset in
> gettext.  It seems to be so broken that I must be missing something.
>
> When I run msgfmt.py, as far as I can see it writes no charset
> information into the mo file.  The actual po files are in utf-8 in
> this case and have a charset declaration.
>
> Then when _parse in gettext loads the messages, it does no conversion
> to Unicode, because it has no charset information.  So the message
> dictionary is actually in utf-8 despite the comment in the code
>             # Note: we unconditionally convert both msgids and msgstrs to
>             # Unicode using the character encoding specified in the charset
>             # parameter of the Content-Type header.
>
> Then ugettext tries to just return the translated message, which is
> not in Unicode, or to convert to Unicode, which fails because the
> unicode call is not specifying any encoding.
>
> The _parse code seems to expect to produce a Unicode translation
> dictionary, and gettext expects to encode Unicode into the current
> code page, but the message dictionary never gets mapped to Unicode in
> the first place.
>
> What I want is simply to use utf-8 po files and get translations in
> Unicode.
>
> TIA for any suggestions.
>
> -Jon Peck

Never mind.  I figured this out.  The problem is that a line such as
_("")
in the source that is scanned causes all the meta information to be
lost in the mo file.  Once I changed that code, I get the expected
result.
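
A catalog keeps its meta information (including the charset) in the entry whose msgid is the empty string, which is presumably why a stray _("") in the source interfered with it. The sketch below, written for Python 3, builds a tiny .mo image in memory and shows gettext picking the charset up from that entry. The helper build_mo is hypothetical and only mimics what msgfmt.py writes; it is not the msgfmt.py code itself.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Build a minimal little-endian GNU .mo image from {msgid: msgstr}."""
    items = sorted((k.encode("utf-8"), v.encode("utf-8"))
                   for k, v in messages.items())
    count = len(items)
    ids = b"".join(k + b"\0" for k, _ in items)
    strs = b"".join(v + b"\0" for _, v in items)
    id_pos = 28 + 16 * count  # 7-word header plus two (length, offset) tables
    str_pos = id_pos + len(ids)
    id_table, str_table = [], []
    for k, v in items:
        id_table += [len(k), id_pos]
        str_table += [len(v), str_pos]
        id_pos += len(k) + 1
        str_pos += len(v) + 1
    header = struct.pack("<7I", 0x950412DE, 0, count, 28, 28 + 8 * count, 0, 0)
    tables = struct.pack("<%dI" % (4 * count), *(id_table + str_table))
    return header + tables + ids + strs


catalog = {
    "": "Content-Type: text/plain; charset=utf-8\n",  # the header entry
    "Hello": "Grüße",
}
translations = gettext.GNUTranslations(io.BytesIO(build_mo(catalog)))
print(translations.charset())
print(translations.gettext("Hello"))
```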

-- 
http://mail.python.org/mailman/listinfo/python-list


Problems with gettext and msgfmt

2009-12-15 Thread JKPeck
I'm using Python 2.6 on Windows and having trouble with the charset in
gettext.  It seems to be so broken that I must be missing something.

When I run msgfmt.py, as far as I can see it writes no charset
information into the mo file.  The actual po files are in utf-8 in
this case and have a charset declaration.

Then when _parse in gettext loads the messages, it does no conversion
to Unicode, because it has no charset information.  So the message
dictionary is actually in utf-8 despite the comment in the code
# Note: we unconditionally convert both msgids and msgstrs to
# Unicode using the character encoding specified in the charset
# parameter of the Content-Type header.

Then ugettext tries to just return the translated message, which is
not in Unicode, or to convert to Unicode, which fails because the
unicode call is not specifying any encoding.

The _parse code seems to expect to produce a Unicode translation
dictionary, and gettext expects to encode Unicode into the current
code page, but the message dictionary never gets mapped to Unicode in
the first place.

What I want is simply to use utf-8 po files and get translations in
Unicode.

TIA for any suggestions.

-Jon Peck
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: csv module and None values

2009-08-29 Thread JKPeck
On Aug 25, 8:49 am, Peter Otten <__pete...@web.de> wrote:
> JKPeck wrote:
> > On Aug 24, 10:43 pm, John Yeung  wrote:
> >> On Aug 24, 5:00 pm, Peter Otten <__pete...@web.de> wrote:
>
> >> > If I understand you correctly the csv.writer already does
> >> > what you want:
>
> >> > >>> w.writerow([1,None,2])
> >> > 1,,2
>
> >> > just sequential commas, but that is the special treatment.
> >> > Without it the None value would be converted to a string
> >> > and the line would look like this one:
>
> >> > 1,None,2
>
> >> No, I think he means he is getting
>
> >> >>> w.writerow([1,None,2])
>
> >> 1,"",2
>
> >> He evidently wants to quote "all" strings, but doesn't want None to be
> >> considered a string.
>
> >> John
>
> > Exactly so.  The requirement of the receiving program, which is out of
> > my control, is that all strings be quoted but a None in a numeric
> > field result in the ,, output rather than "".  Excel quotes strings
> > conditionally, which doesn't do what is needed in this case.  For
> > QUOTE_NONNUMERIC to quote None values makes some sense, but it gets in
> > the way of representing missing values in a numeric field.  It would
> > be nice to have a choice here in the dialects.
>
> > I thought of replacing the None values with float("nan"), since that has
> > a numeric type, but unfortunately that results in writing the string
> > (unquoted) nan for the value.  So the sentinel approach seems to be
> > the best I can do.
>
> How about:
>
> >>> import csv, sys
> >>> class N(int):
> ...     def __str__(self): return ""
> ...
> >>> pseudo_none = N()
> >>> w = csv.writer(sys.stdout, quoting=csv.QUOTE_NONNUMERIC)
> >>> w.writerow([1, "foo", pseudo_none, "bar"])
>
> 1,"foo",,"bar"
>
> Peter

Clever.  Thanks,
Jon
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: csv module and None values

2009-08-25 Thread JKPeck
On Aug 24, 10:43 pm, John Yeung  wrote:
> On Aug 24, 5:00 pm, Peter Otten <__pete...@web.de> wrote:
>
> > If I understand you correctly the csv.writer already does
> > what you want:
>
> > >>> w.writerow([1,None,2])
> > 1,,2
>
> > just sequential commas, but that is the special treatment.
> > Without it the None value would be converted to a string
> > and the line would look like this one:
>
> > 1,None,2
>
> No, I think he means he is getting
>
> >>> w.writerow([1,None,2])
>
> 1,"",2
>
> He evidently wants to quote "all" strings, but doesn't want None to be
> considered a string.
>
> John

Exactly so.  The requirement of the receiving program, which is out of
my control, is that all strings be quoted but a None in a numeric
field result in the ,, output rather than "".  Excel quotes strings
conditionally, which doesn't do what is needed in this case.  For
QUOTE_NONNUMERIC to quote None values makes some sense, but it gets in
the way of representing missing values in a numeric field.  It would
be nice to have a choice here in the dialects.

I thought of replacing the None values with float("nan"), since that has
a numeric type, but unfortunately that results in writing the string
(unquoted) nan for the value.  So the sentinel approach seems to be
the best I can do.

Thanks,
Jon
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: csv module and None values

2009-08-24 Thread JKPeck
On Aug 24, 11:30 am, JKPeck  wrote:
> I'm trying to get the csv module (Python 2.6) to write data records
> like Excel.  The excel dialect isn't doing it.  The problem is in
> writing None values.  I want them to result in just sequential commas
> - ,, but csv treats None specially, as the doc says,
>
> "To make it as easy as possible to interface with modules which
> implement the DB API, the value None is written as the empty string."
>
> I need strings to be quoted but not None values.  Is there any way to
> get around this special None treatment?
>
> TIA,
> Jon Peck

Solved the problem myself by giving a writer class to csv.writer that
looks for sentinel markers inserted in place of None and wipes them
out before writing to a file.  Pretty ugly, but it works.
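
The sentinel approach described above might look roughly like this in Python 3. It is a sketch under assumptions, not the original code: the marker text __NONE__ and the names SentinelStripper and write_rows are hypothetical, and the marker must never occur in real data.

```python
import csv
import io

SENTINEL = "__NONE__"  # hypothetical marker standing in for None


class SentinelStripper:
    """File-like wrapper that erases the quoted sentinel before writing."""

    def __init__(self, target):
        self.target = target

    def write(self, text):
        # The writer quotes the sentinel like any other string, so the
        # quotes are removed together with the marker.
        return self.target.write(text.replace('"%s"' % SENTINEL, ""))


def write_rows(rows, target):
    writer = csv.writer(SentinelStripper(target), quoting=csv.QUOTE_NONNUMERIC)
    for row in rows:
        writer.writerow([SENTINEL if value is None else value for value in row])


buffer = io.StringIO()
write_rows([[1, "foo", None, 2.5]], buffer)
print(repr(buffer.getvalue()))
```

Newer Python releases (3.12 onward, as far as I know) also offer csv.QUOTE_STRINGS, which quotes strings but writes None as an unquoted empty field.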

Regards,
Jon Peck
-- 
http://mail.python.org/mailman/listinfo/python-list


csv module and None values

2009-08-24 Thread JKPeck
I'm trying to get the csv module (Python 2.6) to write data records
like Excel.  The excel dialect isn't doing it.  The problem is in
writing None values.  I want them to result in just sequential commas
- ,, but csv treats None specially, as the doc says,

"To make it as easy as possible to interface with modules which
implement the DB API, the value None is written as the empty string."

I need strings to be quoted but not None values.  Is there any way to
get around this special None treatment?

TIA,
Jon Peck
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to get rid of pyc files ?

2009-05-24 Thread JKPeck
On May 24, 4:08 pm, Dave Angel  wrote:
> pythoncuri...@gmail.com wrote:
> > On May 24, 3:58 pm, John Machin  wrote:
>
> >> What problems? Like avoiding having to recompile your .py files makes
> >> your app run too fast?
>
> > There are real problems with this. I'm having similar problems when
> > switching
> > between Solaris and Windows.
> > The code is in clearcase, which uses some sort of network file
> > system.
> > In our case, that means that I'll be accessing the same directories
> > from both
> > platforms, so .pyc-files from one platform will be visible when I run
> > the
> > code on the other platform.
>
> > The .pyc-file will contain a string pointing to the file it was built
> > from.
> > The path to that file will be different under different platforms, so
> > when the
> > string is used, there will be error printouts.
> > At least I think that's the problem, the error printouts contains
> > paths that
> > are only valid on the other platform.
> > I don't have access to those computers atm, so I can't show the exact
> > message.
>
> > The only solution I've seen is to make sure to clean up the .pyc files
> > each
> > time I switch platform.
>
> Is Clearcase still around?  I hope it works better than it did in 1992.
>
> Somebody else has already pointed out that you can tell Python not to
> create those files (during your development stages).
>
> But if that won't work for some reason, perhaps you can do something
> with symbolic links.  I remember that RCS, for example, required that
> the archives be located in a directory immediately below the one with
> the sources.  So in order to share those archives, you made the
> subdirectory actually a link to a common network location.
>
> Your case would seem to be the opposite.  But I don't know enough about
> the current state of either Solaris or Clearcase to know the best answer.
>
> Perhaps Clearcase supports some form of "exclusion" parameter, wherein
> you say not to do version control on files with certain patterns, like .pyc

ClearCase gives you tremendous control over what can be seen at any
point.  Assuming that you are using dynamic views, the simplest way to
fix this is to use a different view for each platform.  They would
both be able to see the py files (although you could control that with
the configspec) as checked in, but the pyc files would not be checked
in and would automatically be view private.  So with two different
views, each platform would only see its own pyc files.
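
Where separate views are not an option, the two workarounds mentioned in the quoted messages (telling Python not to write bytecode, and cleaning up .pyc files when switching platforms) can be sketched as below. The function name remove_bytecode is hypothetical. The same switch is also available as the -B command-line option and the PYTHONDONTWRITEBYTECODE environment variable.

```python
import sys
from pathlib import Path

# Stop this interpreter from writing bytecode files for later imports.
sys.dont_write_bytecode = True


def remove_bytecode(root):
    """Delete every .pyc file below root and return how many were removed."""
    stale = list(Path(root).rglob("*.pyc"))  # collect first, then delete
    for path in stale:
        path.unlink()
    return len(stale)
```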

HTH,
Jon Peck
-- 
http://mail.python.org/mailman/listinfo/python-list


Using the backing store with mmap

2008-06-25 Thread JKPeck
According to the mmap.mmap 2.5 documentation,
"Changed in version 2.5: To map anonymous memory, -1 should be passed
as the fileno along with the length."

I would like to use shared memory to communicate between two processes
that otherwise have no way to communicate, but I couldn't find a way
to share anonymous memory.  (I can use file names agreed on by
convention, but the file is really irrelevant, and I'd prefer to
eliminate it.)  Is this possible?  What is the lifetime of this shared
memory?  Is it in fact private to the creating process, or is it
shared among all (Python) processes?  Does it need to be flushed by a
writing process?  How do the access flags relate to this?  If I create
two such items, are they independent, or is it all one pool?
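
For illustration, here is a minimal Python 3 sketch of an anonymous map. Each call creates its own independent region, and the region has no name that an unrelated process could use to open it. The function name anonymous_roundtrip is hypothetical.

```python
import mmap


def anonymous_roundtrip(payload, size=4096):
    """Write payload into a fresh anonymous map and read it back."""
    with mmap.mmap(-1, size) as region:  # -1 means no backing file
        region.write(payload)
        region.seek(0)
        return region.read(len(payload))


print(anonymous_roundtrip(b"hello"))
```

Sharing memory between unrelated processes without a file needs a named mapping instead: on Windows the tagname argument of mmap.mmap provides one, and Python 3.8 and later also have multiprocessing.shared_memory.SharedMemory.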

TIA,
Jon Peck
--
http://mail.python.org/mailman/listinfo/python-list


Re: Looking for library to estimate likeness of two strings

2008-02-07 Thread JKPeck
On Feb 7, 6:11 am, Lee Capps <[EMAIL PROTECTED]> wrote:
> At 14:01 Wed 06 Feb 2008, [EMAIL PROTECTED] wrote:
>
> >Are there any Python libraries implementing measurement of similarity
> >of two strings of Latin characters?
>
> >I'm writing a script to guess-merge two tables based on people's
> >names, which are not necessarily spelled the same way in both tables
> >(especially the given names).  I would like some function that would
> >help me make the best guess.
>
> >Many thanks in advance!
>
> I used difflib.get_close_matches for something similar:
>
> http://docs.python.org/lib/module-difflib.html
>
> HTH.
>
> --
> Lee Capps
> Technology Specialist
> CTE Resource Center

Algorithms typically used for name comparisons include soundex,
nysiis, and levenshtein distance.  The last is more general and
variations are used in spell checkers.  You can probably Google for
Python versions.  You can find implementations of these comparison
functions at
www.spss.com/devcentral in the extendedTransforms.py module.
(Requires a login but free).
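
As a rough illustration (not the extendedTransforms.py code), a plain Levenshtein distance and the difflib call mentioned in the quoted reply can be sketched like this. The function names and the sample names are made up for the example.

```python
import difflib


def levenshtein(a, b):
    """Dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def closest_names(name, candidates, cutoff=0.6):
    """Return up to three candidates that difflib considers close to name."""
    return difflib.get_close_matches(name, candidates, n=3, cutoff=cutoff)


print(levenshtein("kitten", "sitting"))
print(closest_names("Jon", ["John", "Joan", "Peter"]))
```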

HTH,
Jon Peck
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Mysterious xml.sax Encoding Exception

2008-02-05 Thread JKPeck
On Feb 4, 4:09 pm, John Machin <[EMAIL PROTECTED]> wrote:
> On Feb 5, 9:02 am, JKPeck <[EMAIL PROTECTED]> wrote:
>
>
>
> > On Feb 2, 12:56 am, Jeroen Ruigrok van der Werven <[EMAIL PROTECTED]
>
> > nomine.org> wrote:
> > > -On [20080201 19:06], JKPeck ([EMAIL PROTECTED]) wrote:
>
> > > >In both of these cases, there are only plain, 7-bit ascii characters
> > > >in the xml, and it really is valid utf-16 as far as I can tell.
>
> > > Did you mean to say that the only characters they used in the UTF-16 
> > > encoded
> > > file are characters from the Basic Latin Unicode block?
>
> > It appears that the root cause of this problem is indeed passing a
> > Unicode XML string to xml.sax.parseString with an encoding declaration
> > in the XML of utf-16.  This works with the standard distribution on
> > Windows.
>
> It did NOT work for me with the standard 2.5.1 Windows distribution --
> see the code + output that I posted.
>
> >  It does not work with ActiveState on Windows even though
> > both distributions report
> > 64K for sys.maxunicode.
>
> > So I don't know why the results are different, but the problem is
> > solved by encoding the Unicode string into utf-16 before passing it to
> > the parser.

Interesting.  In the course of installing and testing with
ActiveState, I upgraded from the standard distribution 2.5.0 to
2.5.1.  The former worked; the latter does not (with the original
code).  So that .1 seems to matter here, and that probably accounts
for why ActiveState raised the exception and the standard 2.5.0 did
not.

-Jon
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Mysterious xml.sax Encoding Exception

2008-02-04 Thread JKPeck
On Feb 2, 12:56 am, Jeroen Ruigrok van der Werven <[EMAIL PROTECTED]
nomine.org> wrote:
> -On [20080201 19:06], JKPeck ([EMAIL PROTECTED]) wrote:
>
> >In both of these cases, there are only plain, 7-bit ascii characters
> >in the xml, and it really is valid utf-16 as far as I can tell.
>
> Did you mean to say that the only characters they used in the UTF-16 encoded
> file are characters from the Basic Latin Unicode block?
>
> --
> Jeroen Ruigrok van der Werven  / asmodai
> イェルーン ラウフロック ヴァン デル ウェルヴェン http://www.in-nomine.org/ | http://www.rangaku.org/
> We have met the enemy and they are ours...

It appears that the root cause of this problem is indeed passing a
Unicode XML string to xml.sax.parseString with an encoding declaration
in the XML of utf-16.  This works with the standard distribution on
Windows.  It does not work with ActiveState on Windows even though
both distributions report
64K for sys.maxunicode.

So I don't know why the results are different, but the problem is
solved by encoding the Unicode string into utf-16 before passing it to
the parser.
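
In Python 3 terms, the fix amounts to handing the parser bytes whose encoding matches the declaration. The sketch below assumes a declaration of the usual form; the exact declaration used by the application is not shown in this thread, and the document and handler here are invented for the example.

```python
import xml.sax


class NameCollector(xml.sax.ContentHandler):
    """Record element names in the order they are opened."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_utf16(text):
    handler = NameCollector()
    # Encode first, so the bytes really are UTF-16 as the declaration says.
    xml.sax.parseString(text.encode("utf-16"), handler)
    return handler.names


document = '<?xml version="1.0" encoding="utf-16"?><root><item/></root>'
print(parse_utf16(document))
```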

Thanks to all for helping to track this down.

Regards,
Jon Peck
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Mysterious xml.sax Encoding Exception

2008-02-01 Thread JKPeck
On Feb 1, 1:51 pm, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > They sent me the actual file, which was created on Windows,  as an
> > email attachment.  They had also sent the actual dataset from which
> > the XML was generated so that I could generate it myself using the
> > same version of our app as the user has.  I did that but did not get
> > an exception.
>
> So are you sure you open the file in binary mode on Windows?
>
> Regards,
> Martin

In the real case, the xml never goes through a file but is handed
directly to the parser.  The API returns a Python Unicode string
(utf-16).  For the file the user sent, if I open it in binary mode, it
still has a BOM; otherwise the BOM is removed.  But either version
works on my system.

The basic fact, though, remains: the same code works for me with the
same input but not for two particular users (out of hundreds).

Regards,
Jon
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Mysterious xml.sax Encoding Exception

2008-02-01 Thread JKPeck
On Feb 1, 1:22 pm, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > In both of these cases, there are only plain, 7-bit ascii characters
> > in the xml, and it really is valid utf-16 as far as I can tell.
>
> What do you mean by "7-bit ascii characters"? If it means what I think
> it means (namely, a sequence of bytes whose values are between 1 and
> 127), then it is *not* valid utf-16.
>
> > Now here is the hard part: This never happens to me, and having gotten
> > the actual xml content from one of the users and fed it to the parser,
> > I don't get the exception.
>
> > What could be going on?  We are all on Python 2.5 (and all on an
> > English locale).
>
> What operating system do they use, and how do they send you the file
> for verification? Can you have them run
>
> print repr(open(filename, "rb").read(10))
>
> and send you its output?
>
> Regards,
> Martin

They sent me the actual file, which was created on Windows,  as an
email attachment.  They had also sent the actual dataset from which
the XML was generated so that I could generate it myself using the
same version of our app as the user has.  I did that but did not get
an exception.
-- 
http://mail.python.org/mailman/listinfo/python-list


Mysterious xml.sax Encoding Exception

2008-02-01 Thread JKPeck
I have a module that uses xml.sax and feeds it a string of xml as in
xml.sax.parseString(dictfile,handler)

The xml is always encoded in utf-16, and the XML string always starts
with


This almost always works fine, but two users of this module get an
exception whatever input they use it on.  (The actual xml is generated
by an api in our application that returns an xml version of metadata
associated with the application's data.)

The exception is
xml.sax._exceptions.SAXParseException: <unknown>:1:30: encoding
specified in XML declaration is incorrect.

In both of these cases, there are only plain, 7-bit ascii characters
in the xml, and it really is valid utf-16 as far as I can tell.

Now here is the hard part: This never happens to me, and having gotten
the actual xml content from one of the users and fed it to the parser,
I don't get the exception.

What could be going on?  We are all on Python 2.5 (and all on an
English locale).

Any suggestions would be appreciated.
-Jon Peck
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Cost of "unicode(s)" where s is Unicode

2008-01-06 Thread JKPeck
On Jan 6, 9:06 am, John Nagle <[EMAIL PROTECTED]> wrote:
>Does
>
> text = unicode(text)
>
> make a copy of a Unicode string, or is that essentially a
> free operation if the input is already Unicode?
>
> John Nagle

>>> u = u"abc"
>>> uu = unicode(u)
>>> u is uu
True
>>> s = "abc"
>>> ss = unicode(s)
>>> s is ss
False

HTH,
Jon Peck
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Wrapping stdout in a codec

2007-10-24 Thread JKPeck
On Oct 22, 12:20 am, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
> On Mon, 22 Oct 2007 02:41:17 +, JKPeck wrote:
> > We want to wrap the stdout device in a codec in order to decode output
> > transparently according to a particular code page (which might not be
> > the system code page).  However, codecs.open requires a filename, and
> > stdout may be a tty or otherwise anonymous.  How can we accomplish
> > this wrapping?
>
> The `codecs` module has more than just the `codecs.open()` function.  Try
> something like this::
>
>   sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
>
> Ciao,
> Marc 'BlackJack' Rintsch

Thanks, codecs.getwriter is just the ticket.  Our app may not be
running in the system encoding, so the file system encoding is not
appropriate.
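
A Python 3 sketch of the same wrapping, assuming the target is a binary stream: for the real standard output that would be sys.stdout.buffer, and an in-memory buffer stands in for it here. The function name wrap_stream and the choice of cp932 are only for illustration.

```python
import codecs
import io


def wrap_stream(binary_stream, encoding):
    """Return a writer that encodes text to the given encoding on output."""
    return codecs.getwriter(encoding)(binary_stream)


raw = io.BytesIO()  # stand-in for sys.stdout.buffer
out = wrap_stream(raw, "cp932")
out.write("日本語\n")
print(raw.getvalue())
```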

-- 
http://mail.python.org/mailman/listinfo/python-list


Wrapping stdout in a codec

2007-10-21 Thread JKPeck
We want to wrap the stdout device in a codec in order to decode output
transparently according to a particular code page (which might not be
the system code page).  However, codecs.open requires a filename, and
stdout may be a tty or otherwise anonymous.  How can we accomplish
this wrapping?  Our application may be loaded into a Python program
that has already set up stdout.

TIA,
Jon Peck

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Module Indexing

2006-12-07 Thread JKPeck
Thanks.  I've gotten good results with epydoc pretty quickly.  It hangs
on one module just burning cpu time, but I've left that one out for
now.


Diez B. Roggisch wrote:
> JKPeck schrieb:
> > We are interested in building a module index for our libraries similar
> > to the global module index on the Python site.  Is there a tool/script
> > available that builds something similar to that automatically?  We
> > would probably want the result to be an html document.
> 
> Several, e.g. epydoc
> 
> Diez

-- 
http://mail.python.org/mailman/listinfo/python-list


Module Indexing

2006-12-06 Thread JKPeck
We are interested in building a module index for our libraries similar
to the global module index on the Python site.  Is there a tool/script
available that builds something similar to that automatically?  We
would probably want the result to be an html document.
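
Short of a full documentation tool, a bare-bones index can be sketched with the standard library alone. This Python 3 example lists the direct submodules of one package as an HTML list; the function name module_index_html is hypothetical, and json is used only because it ships with Python.

```python
import html
import importlib
import pkgutil


def module_index_html(package_name):
    """Return an HTML list of a package and its direct submodules."""
    package = importlib.import_module(package_name)
    prefix = package_name + "."
    names = sorted(info.name
                   for info in pkgutil.iter_modules(package.__path__, prefix))
    items = "\n".join("<li>%s</li>" % html.escape(name)
                      for name in [package_name] + names)
    return "<ul>\n%s\n</ul>" % items


print(module_index_html("json"))
```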

TIA,
Jon Peck

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Character Encodings and display of strings

2006-11-13 Thread JKPeck
It seemed to me that this sentence

For many types, this function makes an attempt to return a string that
would yield an object with the same value when passed to eval().

might mean that the encoding setting of the source file might influence
how repr represented the contents of the string.  Nothing to do with
Unicode.  If a source file could have a declared encoding of, say,
cp932 via the # coding comment, I thought there was a chance that eval
would respond to that, too.
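
The behaviour under discussion can be reproduced in Python 3 with byte strings, which play the role that plain strings played in Python 2. This is a sketch with made-up sample data: printing a list applies repr() to every item, so the bytes show up as escapes, while printing or joining the items themselves does not go through repr().

```python
text_items = ["日本", "語"]
byte_items = [item.encode("cp932") for item in text_items]

# str() of a list is built from repr() of each item.
list_display = str(byte_items)
rebuilt = "[" + ", ".join(repr(item) for item in byte_items) + "]"

print(list_display)                            # escaped form
print(b" ".join(byte_items).decode("cp932"))   # characters again
```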


Diez B. Roggisch wrote:
> JKPeck wrote:
>
> > Thanks for the quick answer.  I thought repr was involved here, but
> > when I use repr explicitly I get a notation where the backslashes are
> > escaped.  I also thought that with the encoding explicitly declared in
> > the source, that repr would take that into account and use the
> > character form, but obviously it doesn't.
>
> The encoding in the source has nothing to do with that. How should an
> encoding (and possibly a gazillion different ones in gazillion other
> sourcefiles of yours) influence the list repr code?
>
> The encoding in the source-file is solely used to correctly parse unicode
> literals, as these need a specific encoding to be generated from the
> byte-string they are in the sourcecode.
> 
> Diez

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Character Encodings and display of strings

2006-11-13 Thread JKPeck
Thanks for the quick answer.  I thought repr was involved here, but
when I use repr explicitly I get a notation where the backslashes are
escaped.  I also thought that with the encoding explicitly declared in
the source, that repr would take that into account and use the
character form, but obviously it doesn't.


Fredrik Lundh wrote:
> "JKPeck" wrote:
>
> >I am trying to understand why, with nonwestern strings, I sometimes get
> > a hex display and sometimes get the string printed as characters.
> >
> > With my Python locale set to Japanese and with or without a # coding of
> > cp932 (this is Windows) at the top of the file, I read a list of
> > Japanese strings into a list, say, catlis.
> >
> > With this code
> > for item in catlis:
> >   print item
> > print catlis
> > print " ".join(catlis)
> >
> > the first print (print item) displays Japanese text as characters.
> > The second print (print catlis) displays a list with the double byte
> > characters in hex notation.
> > The third print (print " ".join(catlis)) prints a combined string of
> > Japanese characters properly.
> >
> > According to the print documentation,
> > "If an object is not a string, it is first converted to a string using
> > the rules for string conversions"
> >
> > but the result is different with a list of strings.
>
> a list is not a string, so it's converted to one using the standard list
> representation rules -- which is to do repr() on all the items, and add
> brackets and commas as necessary.
>
> for some more tips on printing, see:
> 
> http://effbot.org/zone/python-list.htm#printing
> 
> 
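
The rule described in the quoted reply can be sketched as follows, in Python 3 syntax (an assumption; the thread uses Python 2 print statements). Label is a hypothetical class, not from the thread, whose str() and repr() forms differ visibly, which makes it easy to see which one each kind of print uses.

```python
class Label:
    """Hypothetical class whose str() and repr() forms differ visibly."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text

    def __repr__(self):
        return 'Label(%r)' % self.text


items = [Label('id'), Label('name')]

# Printing each item uses str() on the item itself.
per_item = [str(item) for item in items]

# Printing the list uses the list's own conversion, which calls repr() on every item.
whole_list = str(items)

# Joining builds one string from the str() forms, so no repr() is involved.
joined = " ".join(str(item) for item in items)

print(per_item)
print(whole_list)
print(joined)
```

The same mechanism explains the hex display: the list conversion shows the repr() of each string, and repr() escapes the non-ASCII bytes.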



Character Encodings and display of strings

2006-11-13 Thread JKPeck
I am trying to understand why, with nonwestern strings, I sometimes get
a hex display and sometimes get the string printed as characters.

With my Python locale set to Japanese and with or without a # coding of
cp932 (this is Windows) at the top of the file, I read a list of
Japanese strings into a list, say, catlis.

With this code
for item in catlis:
    print item
print catlis
print " ".join(catlis)

the first print (print item) displays Japanese text as characters.
The second print (print catlis) displays a list with the double byte
characters in hex notation.
The third print (print " ".join(catlis)) prints a combined string of
Japanese characters properly.

According to the print documentation,
"If an object is not a string, it is first converted to a string using
the rules for string conversions"

but the result is different with a list of strings.

The hex display looks like this:
['id', '\x90\xab\x95\xca', '\x90\xb6\x94N\x8c\x8e\x93\xfa',
'\x8fA\x8aw\x94N\x90\x94', '\x90E\x8e\xed', '\x8b\x8b\x97^',
'\x8f\x89\x94C\x8b\x8b', '\x8d\xdd\x90\xd0\x8c\x8e\x90\x94',
'\x90E\x96\xb1\x8co\x97\xf0', '\x90l\x8e\xed']

and correctly shows the hex values of the Japanese characters.

Why are these different?

TIA,
Jon Peck



Re: Sets and Membership Tests

2006-07-12 Thread JKPeck
Thanks for the advice.  Once assured that __hash__ etc was the right
route, I found that using hash() instead of object.__hash__() gave me
stable hash values.  (I am hashing strings that I know to be unique.)

The "no luck" situation was that a set would accept the same object
multiple times, not recognizing that it was truly the same object.
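
A minimal sketch of one plausible reading of the problem, in Python 3 syntax. The original code was never posted, so both classes and the key attribute are hypothetical. The first class compares by content but hashes by identity via object.__hash__(), so a set treats equal objects as distinct. The second hashes the same string that __eq__ compares, so equal objects collapse to one member.

```python
class IdentityHashed:
    """Hypothetical: __eq__ compares content, but __hash__ is identity-based."""

    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        return isinstance(other, IdentityHashed) and self.key == other.key

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        # Differs for every live instance, regardless of content.
        return object.__hash__(self)


class ContentHashed:
    """Hypothetical: __hash__ is derived from the attribute __eq__ compares."""

    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        return isinstance(other, ContentHashed) and self.key == other.key

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        # Equal keys give equal hashes, as set membership requires.
        return hash(self.key)


a, b = IdentityHashed("alpha"), IdentityHashed("alpha")
c, d = ContentHashed("alpha"), ContentHashed("alpha")

broken = {a, b}   # equal content, different hashes: both are kept
fixed = {c, d}    # equal content, equal hashes: one member survives

print(len(broken), len(fixed))
```

With several attributes, the same idea extends to hashing a tuple of them, as suggested in the reply quoted below.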


Nick Vatamaniuc wrote:
> JK,
>
> You are correct to implement __hash__ and __eq__. The problem is how
> you implemented them. Usually your __eq__ method should compare the
> necessary attributes of the objects for equality. The __hash__ should
> return a 32-bit integer. Your best bet is probably to return a hash of
> hashes of your attributes that are used in equality comparison. What
> this means is that your attributes used to produce the __hash__ should
> also be hashable. This is important yet not immediately obvious. So you
> could for example return hash( (attribute1, attribute2, attribute3) ),
> where attribute1, attribute2, attribute3 are all hashable.
>  Of course, you provided no code and no error messages (the
> 'SoFarNoLuck' exception is not descriptive enough ; )
>
> Hope this helps
>
>
>
> JKPeck wrote:
> > I would like to be able to use sets where the set members are objects of a
> > class I wrote.
> > I want the members to be distinguished by some of the object content,
> > but I have not figured out how a set determines whether two (potential)
> > elements are identical.  I tried implementing __eq__ and __ne__ and
> > __hash__ to make objects with identical content behave as identical for
> > set membership, but so far no luck.
> >
> > I could subclass set if necessary, but I still don't know what I would
> > need to override.
> > 
> > TIA for any advice you can offer.



Sets and Membership Tests

2006-07-11 Thread JKPeck
I would like to be able to use sets where the set members are objects of a
class I wrote.
I want the members to be distinguished by some of the object content,
but I have not figured out how a set determines whether two (potential)
elements are identical.  I tried implementing __eq__ and __ne__ and
__hash__ to make objects with identical content behave as identical for
set membership, but so far no luck.

I could subclass set if necessary, but I still don't know what I would
need to override.

TIA for any advice you can offer.



Re: Python and Java

2006-03-07 Thread JKPeck
Thanks for these suggestions.  To be clear, we already have a Python
2.4 minimum requirement for other reasons, and we are looking for a
long-term solution so that as Python advances, the scripting solution
can keep up in a timely way.

Since the Java code is for a very large, complex application, we need
to bring Python to Java rather than Java to Python.

We will take a look at some of the other ideas.



Python and Java

2006-03-06 Thread JKPeck
Suppose you have an application written in Java, and you want to enable
other applications or processes written in Python to communicate with
it, i.e., to use Python as a scripting language for the application.
On Windows you could do this with COM and various addons such as
J-Integra and Mark Hammond's libraries.

How would you do this if you want a mechanism that is portable across
Windows, Linux, Mac, and Unix?

Any ideas?  Jython would be a natural candidate, but it is stuck at
Python 2.1 and seems to have an uncertain future.

Thanks in advance.
