Re: why isn't Unicode the default encoding?

2006-03-20 Thread Matt Goodall
John Salerno wrote:
 Martin v. Löwis wrote:
 
 The real problem is that the Python string type is used to represent
 two very different concepts: bytes, and characters. You can't just drop
 the current Python string type, and use the Unicode type instead - then
 you would have no good way to represent sequences of bytes anymore.
 Byte sequences occur more often than you might think: a ZIP file, a
 MS Word file, a PDF file, and even an HTTP conversation are represented
 through byte sequences.

 So for a byte sequence, internal representation is important; for a
 character string, it is not. Now, for historical reasons, the Python
 string literals create byte strings, not character strings. Since we
 cannot know whether a certain string literal is meant to denote bytes
 or characters, we can't just change the interpretation.
 
 Interesting. So then the read() method, if given a numeric argument for 
 bytes to read, would act differently depending on if you were using 
 Unicode or not? As it is now, it seems to equate the bytes with number 
 of characters, but if the document was written using Unicode characters, 
 is it possible that read(2) might only pull out one character?

Exactly. read(2) might pull out one character, or only half a character.
It all depends on the encoding of the data you're reading.

If you're reading or writing text to a file (or anywhere, for that
matter) you need to know the character encoding of the file's content to
read it correctly.
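
To make the "half a character" case concrete, here is a small sketch. It
uses Python 3 syntax (where bytes and text are separate types) and a
made-up sample string, so treat it as an illustration rather than what
read() does internally:

```python
# Python 3 syntax; the sample text is made up for illustration.
text = "\u20acuro"                        # the euro sign followed by "uro"

utf8_bytes = text.encode("utf-8")         # the euro sign takes 3 bytes here
utf16_bytes = text.encode("utf-16-le")    # every character takes 2 bytes here

# Taking 2 bytes of the UTF-8 data stops in the middle of the euro sign.
try:
    utf8_bytes[:2].decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Taking 2 bytes of the UTF-16 data happens to give exactly one character.
first_char_utf16 = utf16_bytes[:2].decode("utf-16-le")

print(utf8_ok)
print(first_char_utf16 == "\u20ac")
```

The same two-byte read gives a broken fragment in one encoding and a whole
character in the other.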

Fortunately, the codecs module makes the whole process relatively painless:

 import codecs
 f = open('a_utf8_encoded_file.txt')
 stream = codecs.getreader('utf-8')(f)
 c = stream.read(1)

The 'stream' works on unicode characters so 'c' is a unicode instance,
i.e. a whole textual character.
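
For anyone who wants to try this without a file on disk, here is a
self-contained variant in Python 3 syntax. The io.BytesIO object and its
contents are stand-ins I made up for the UTF-8 encoded file:

```python
import codecs
import io

# Stand-in for a UTF-8 encoded file opened in binary mode (contents made up):
# three characters stored as five bytes.
raw = io.BytesIO("\u00e9t\u00e9".encode("utf-8"))

stream = codecs.getreader("utf-8")(raw)
c = stream.read(1)      # asks for one character, not one byte
rest = stream.read()    # whatever characters remain

print(len(c), len(rest))
```

In Python 3, open(path, encoding='utf-8') gives you a text stream directly,
so the explicit codecs wrapper is mostly needed when you start from a byte
stream you did not open yourself.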

- Matt

-- 
 __
/  \__ Matt Goodall, Pollenation Internet Ltd
\__/  \w: http://www.pollenation.net
  __/  \__/e: [EMAIL PROTECTED]
 /  \__/  \t: +44 (0)113 2252500
 \__/  \__/
 /  \  Any views expressed are my own and do not necessarily
 \__/  reflect the views of my employer.
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Any python HTML generator libs?

2006-03-10 Thread Matt Goodall
Steve Holden wrote:
 Sullivan WxPyQtKinter wrote:
 
Hi, everyone.  Simply put, what I need most now is a python lib to
generate simple HTML.

I am now using XML to store my lab report records. I found python
really convenient to manipulate XML, so I want to make a small on-line
CGI program to help my colleagues to build their lab report records
into XML, for storage, HTML display (for others to browse) and search.

With python's standard lib, the DOM object could realize the XML
storage and search quite easily, but for HTML generation,  it is a
great headache.

I tried to turn to the in-line server-side python script PSP (something
like asp and php) instead of CGI. However, since the report data always
has very complex data structures, it is really hard to write most
things in-line. For example, a PCR reaction is expected to be shown in
this format or similar in a web browser:

PCR
Sample: Sm1032
Operator: SullivanZ
TimeStamp: hh:mm mm-dd-
Reaction:
   Reagent1:
      Name:
      Concentration: mM
      Volume: XXX uL
   Reagent2:



Since there are hundreds of PCR reactions and other operations in the
lab report, in-line PSP is not a suitable solution. But writing HTML
directly with print statements is a great pain.

Will XSLT be useful? Is my problem somehow related to XML-SIG?
Looking forward to your precious suggestions.

 
 The triple-quoted string with string substitution is your friend. Try 
 writing something(s) like:
 
 results = {'secnum': 1, 'type': 'string', 'color': 'blue'}
 
 print """\
 <h1>Section %(secnum)s</h1>
 <p>Elements of type %(type)s should be coloured %(color)s</p>
 """ % results

Don't forget that you may need to escape the application's data for
inclusion in HTML:

results = {'secnum': 1, 'type': 'string', 'color': 'blue',
           'user': 'Matt Goodall <[EMAIL PROTECTED]>'}

print """\
<h1>Section %(secnum)s</h1>
<p>Elements of type %(type)s should be coloured %(color)s</p>
<p>Contributed by: %(user)s</p>
""" % results

Will print:

<h1>Section 1</h1>
<p>Elements of type string should be coloured blue</p>
<p>Contributed by: Matt Goodall <[EMAIL PROTECTED]></p>

The '<' and '>' surrounding my email address break the HTML.

To fix that you need to escape results['user'] with cgi.escape or
xml.sax.saxutils.escape. Oh, and don't forget to escape anything
destined for an HTML attribute differently, see
xml.sax.saxutils.quoteattr.
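
A sketch of what that escaping might look like, written in Python 3 syntax.
Note the assumptions: html.escape is used here as the Python 3 counterpart
of cgi.escape, and the email address and the title text are placeholders I
made up:

```python
from html import escape                  # Python 3 counterpart of cgi.escape
from xml.sax.saxutils import quoteattr

# Hypothetical data; the address is a placeholder.
results = {"secnum": 1, "type": "string", "color": "blue",
           "user": "Matt Goodall <matt@example.org>"}

# Escape every value before it is substituted into element content.
safe = {key: escape(str(value)) for key, value in results.items()}

page = """\
<h1>Section %(secnum)s</h1>
<p>Elements of type %(type)s should be coloured %(color)s</p>
<p>Contributed by: %(user)s</p>
""" % safe

# Attribute values need quoting as well as escaping; quoteattr does both
# and picks a quote character that does not clash with the value.
link = "<a title=%s>profile</a>" % quoteattr('Matt "the author" Goodall')

print(page)
print(link)
```

The catch is that you have to remember to do this for every value, and to
pick the right function for the place the value ends up.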

A triple-quoted string is beautifully simple but it's not quite as much
a friend as it might initially seem. ;-)

I don't intend to get into a XML- vs text- based templating flame war
;-) but, IMHO, the solution is to use a templating language that
understands where the value is used in the template.

Kid is a great example of an XML-based templating language but there are
many others. Some have probably been mentioned in this thread already.

Another interesting solution is to use something like Nevow's tags module:

from nevow import flat, tags as T

results = {'secnum': 1, 'type': 'string', 'color': 'blue',
           'user': 'Matt Goodall <[EMAIL PROTECTED]>'}

doc = T.div[
    T.h1['Section ', results['secnum']],
    T.p['Elements of type ', results['type'],
        ' should be coloured ', results['color']],
    T.p['Contributed by: ', results['user']],
]

print flat.flatten(doc)

This time you get valid HTML with no effort whatsoever:

<div><h1>Section 1</h1><p>Elements of type string should be coloured
blue</p><p>Contributed by: Matt Goodall
&lt;[EMAIL PROTECTED]&gt;</p></div>

You even get to write HTML in a slightly more Pythonic way (even if it
does abuse Python just a little :wink:), but Nevow will happily load a
template containing actual XHTML from disk if you prefer.

The only real problem using Nevow for this is that you will need to
install Twisted too. I suspect you'll find a couple of Nevow tag
implementations that don't need Twisted if you ask Google.
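
To give a feel for how little is needed for the basic idea, here is a rough
sketch of tag objects that escape text for you. This is not Nevow's
implementation or API: Tag and flatten are hypothetical names, it handles
no attributes, it is written in Python 3 syntax, and the email address is a
placeholder:

```python
from html import escape


class Tag:
    """A tiny, hypothetical stand-in for a tag object (not Nevow's API)."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def __getitem__(self, children):
        # tag[a, b] passes a tuple; tag[a] passes a single object.
        if not isinstance(children, tuple):
            children = (children,)
        new = Tag(self.name)
        new.children = list(children)
        return new


def flatten(node):
    # Nested tags are rendered recursively; everything else is treated as
    # text and escaped, so data can never break the markup.
    if isinstance(node, Tag):
        inner = "".join(flatten(child) for child in node.children)
        return "<%s>%s</%s>" % (node.name, inner, node.name)
    return escape(str(node), quote=False)


div, h1, p = Tag("div"), Tag("h1"), Tag("p")

doc = div[
    h1["Section ", 1],
    p["Contributed by: ", "Matt Goodall <matt@example.org>"],
]
html_text = flatten(doc)
print(html_text)
```

Because the escaping lives in flatten, the code that builds the page never
has to think about it.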

Anyway! This was more to demonstrate an alternative approach than to
evangelise about Nevow. I hope it was at least interesting. :)

Cheers, Matt

Nevow:
http://divmod.org/trac/wiki/DivmodNevow
Twisted:
http://twistedmatrix.com/trac



Re: Twisted book opinions?

2006-02-09 Thread Matt Goodall
Jay Parlar wrote:
 I was hoping to get some c.l.p. opinions on O'Reilly's new Twisted book.

I think it's a good book to get. I know a fair amount about Twisted but
it still made for interesting reading.

Tommi Virtanen (aka tv) posted a great review of the book shortly after
it was published: http://tv.debian.net/articles/review-snakeball/

- Matt


Re: psycopg2

2006-02-02 Thread Matt Goodall
Jane Goldman wrote:
 Hello,
 
 I am a beginner Python programmer. I am working on a web application that
 accesses PostgreSQL on the backend. After I imported the psycopg2 module in
 my program I started to get unwanted debug output in my web browser. It is
 something like this:
 
 initpsycopg: initializing psycopg 2.0b6.2 (dec dt ext pq3)
 typecast_init: initializing NUMBER typecast_new: new type at = 00962920,
 refcnt = 1 typecast_new: typecast object created at 00962920
 typecast_add: object at 00962920, values refcnt = 2 typecast_add: adding
 val: 20 typecast_add: adding val: 23 typecast_add: adding val: 21
 typecast_add: adding val: 701 typecast_add: adding val: 700
 typecast_add: adding val: 1700 typecast_add: base caster: 
 typecast_init: initializing LONGINTEGER typecast_new: new type at =
 00962960, refcnt = 1 typecast_new: typecast object created at 00962960
 typecast_add: object at 00962960, values refcnt = 2 typecast_add:
 
 and so on ...
 
 I use Cheetah templates to generate HTML code. I run ActivePython 2.4 in a
 Windows XP environment.
 
 Any help with how to stop getting this garbage in my web browser would be
 highly appreciated.

It looks like you installed a version that has the PSYCOPG_DEBUG flag turned
on. That flag causes psycopg2 to emit gobs of useless information ;-).

I would recommend downloading the latest beta release from
http://initd.org/. Alternatively you can remove the PSYCOPG_DEBUG from
setup.cfg in the version you already have and reinstall.

Hope this helps.

- Matt



Re: DB API and thread safety

2006-01-20 Thread Matt Goodall
Robin Haswell wrote:
 Hey guys
 
 I've been reading http://www.python.org/peps/pep-0249.html and I don't
 quite get what level of thread safety I need for my DB connections.
 
 If I call db = FOOdb::connect() at the start of my app, and then every
 thread does its own c = db.cursor() at the top, what level of thread
 safety do I need to avoid threads stepping on each other? Hopefully the
 answer to this question will get me oriented enough to understand the
 other options :-)

Assuming the cursor created in the thread is never accessed by another
thread then you need a dbapi module that supports threadsafety level 2 -
threads may share the module and connections.
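
If you want to check what a given module claims, the level is exposed as
the module attribute 'threadsafety' defined by PEP 249. A small sketch in
Python 3 syntax follows; sqlite3 is used only because it ships with Python,
and can_share_connection is a hypothetical helper name:

```python
import sqlite3  # used only as a convenient DB-API module from the stdlib

# Meaning of the module-level 'threadsafety' attribute, as listed in PEP 249.
LEVELS = {
    0: "threads may not share the module",
    1: "threads may share the module, but not connections",
    2: "threads may share the module and connections",
    3: "threads may share the module, connections and cursors",
}


def can_share_connection(dbapi_module):
    """Return True if the module says one connection may be shared by threads."""
    return dbapi_module.threadsafety >= 2


level = sqlite3.threadsafety
print(level, "-", LEVELS[level])
print("share one connection across threads:", can_share_connection(sqlite3))
```

The value only tells you what the module promises; each thread still needs
its own cursor unless the level is 3.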

- Matt



Re: recommended way of generating HTML from Python

2005-02-21 Thread Matt Goodall
On Mon, 2005-02-21 at 07:36 -0500, Kent Johnson wrote:
 Michele Simionato wrote:
  The problem is a problem of standardization, indeed. There are plenty of
  recipes to do the same job, I just would like to use a blessed one (I am
  teaching a Python course and I do not know what to recommend to my
  students).
 
 Why not teach your students to use a template system?
 
  FWIW, here is my version of the recipe (stripped down to the bare
  essentials)

  def makeattr(dict_or_list_of_pairs):
      dic = dict(dict_or_list_of_pairs)
      return " ".join("%s=%r" % (k, dic[k]) for k in dic)

  class HTMLTag(object):
      def __getattr__(self, name):
          def tag(value, **attr):
              """value can be a string or a sequence of strings."""
              if hasattr(value, "__iter__"):  # is iterable
                  value = " ".join(value)
              return "<%s %s>%s</%s>\n" % (name, makeattr(attr), value,
                                           name)
          return tag

  # example:
  html = HTMLTag()

  tableheader = ["field1", "field2"]
  tablebody = [["a1", "a2"],
               ["b1", "b2"]]

  html_header = [html.tr(html.th(el) for el in tableheader)]
  html_table = [html.tr(html.td(el) for el in row) for row in tablebody]
  print html.table(html_header + html_table)
 
 *Shudder*
 
 I've written web pages this way (using a pretty nice Java HTML generation
 package) and I don't recommend it. In my experience, this approach has
 several drawbacks:
 - as soon as the web page gets at all complex, the conceptual shift from
   HTML to code and back is difficult.
 - It is hard to work with a designer. The designer will give you sample web
   pages which then have to be hand-translated to code. Changes to the web
   page have to be located in the code.
 - There is no separation of content and presentation

Slightly off topic but, just to be clear, Nevow supports XHTML templates
*and* stan tag expressions.  Both are useful. As a general rule, I stick
to XHTML templates but I use stan for prototyping pages and when marking
up the XHTML templates gets so complicated that they might as well be
written in Python anyway.

Also, just because the HTML code is not in a .html file does not
necessarily mean that content and presentation are mixed up. For
instance, with stan (and probably the alternatives mentioned in this
thread) it's very easy to build a tag library that the real Python
code simply calls on to render page content.

 
 IMO templating systems are a much better solution. They let you express
 HTML in HTML directly; you communicate with a designer in a language the
 designer understands; you can separate content and presentation.

Agreed. Although I would go further and say that it's important to
choose a templating system that allows the Python developer to annotate
XHTML templates using **valid XML**, i.e. no for x in y loops, no if
foo conditionals, no i = 0 variable setting, no expression
evaluations, etc.
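
As an illustration only, here is roughly what I mean by a template that
stays well-formed XML while Python does the work. The data-slot and
data-repeat attributes and the render function are made up for this sketch
(they are not Nevow's or Kid's syntax), and it uses xml.etree.ElementTree
with Python 3 syntax:

```python
import xml.etree.ElementTree as ET

# A template that is itself well-formed XML. The 'data-slot' and
# 'data-repeat' attributes are hypothetical markers invented for this sketch.
TEMPLATE = """\
<div>
  <h1 data-slot="title">Placeholder title</h1>
  <ul data-repeat="items"><li>Placeholder item</li></ul>
</div>"""


def render(template, data):
    root = ET.fromstring(template)
    # Walk a snapshot of the tree, because elements are modified as we go.
    for element in list(root.iter()):
        slot = element.attrib.pop("data-slot", None)
        if slot is not None:
            element.text = str(data[slot])   # text is escaped on output
        repeat = element.attrib.pop("data-repeat", None)
        if repeat is not None:
            pattern = element[0]             # first child is the row pattern
            element.remove(pattern)
            for value in data[repeat]:
                row = ET.SubElement(element, pattern.tag)
                row.text = str(value)
    return ET.tostring(root, encoding="unicode")


output = render(TEMPLATE, {"title": "Report & results",
                           "items": ["a < b", "c"]})
print(output)
```

A designer can open the template in any XML-aware tool, and all the looping
and substitution stays in Python.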


<advocacy>
The lovely thing about Nevow is that it encourages good separation -
HTML is HTML and logic is Python, as it should be - but does not get in
the way when breaking the rules is necessary or just a lot easier.
</advocacy>


Cheers, Matt
