Re: why isn't Unicode the default encoding?
John Salerno wrote:
> Martin v. Löwis wrote:
>> The real problem is that the Python string type is used to represent
>> two very different concepts: bytes, and characters. You can't just
>> drop the current Python string type, and use the Unicode type
>> instead - then you would have no good way to represent sequences of
>> bytes anymore. Byte sequences occur more often than you might think:
>> a ZIP file, a MS Word file, a PDF file, and even an HTTP conversation
>> are represented through byte sequences.
>>
>> So for a byte sequence, internal representation is important; for a
>> character string, it is not. Now, for historical reasons, the Python
>> string literals create byte strings, not character strings. Since we
>> cannot know whether a certain string literal is meant to denote bytes
>> or characters, we can't just change the interpretation.
>
> Interesting. So then the read() method, if given a numeric argument
> for bytes to read, would act differently depending on if you were
> using Unicode or not? As it is now, it seems to equate the bytes with
> number of characters, but if the document was written using Unicode
> characters, is it possible that read(2) might only pull out one
> character?

Exactly. read(2) might pull out one character, or only half a
character. It all depends on the encoding of the data you're reading.

If you're reading or writing text to a file (or anywhere, for that
matter) you need to know the character encoding of the file's content
to read it correctly. Fortunately, the codecs module makes the whole
process relatively painless:

    import codecs
    # Open in binary mode; the codecs reader does the decoding.
    f = open('a_utf8_encoded_file.txt', 'rb')
    stream = codecs.getreader('utf-8')(f)
    c = stream.read(1)

The 'stream' works on unicode characters so 'c' is a unicode instance,
i.e. a whole textual character.

- Matt

--
Matt Goodall, Pollenation Internet Ltd
w: http://www.pollenation.net
e: [EMAIL PROTECTED]
t: +44 (0)113 2252500

Any views expressed are my own and do not necessarily
reflect the views of my employer.
-- http://mail.python.org/mailman/listinfo/python-list
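To make the read(2) point from this message concrete, here is a minimal sketch. It assumes Python 3 and uses an in-memory buffer in place of a real file; the sample text is made up for illustration. A byte-level read stops partway through a character, while a codecs reader returns a whole one:

```python
import codecs
import io

# Hypothetical sample text: the euro sign needs three bytes in UTF-8.
data = "\u20acuro".encode("utf-8")

# Byte-oriented read: two bytes is only part of the first character,
# so the fragment cannot be decoded on its own.
raw = io.BytesIO(data)
first_two = raw.read(2)
try:
    first_two.decode("utf-8")
    partial_ok = True
except UnicodeDecodeError:
    partial_ok = False

# Character-oriented read: the reader keeps pulling bytes until one
# whole character has been decoded.
stream = codecs.getreader("utf-8")(io.BytesIO(data))
c = stream.read(1)

print(repr(first_two), partial_ok, repr(c))
```

With a different encoding (say Latin-1, where every character is one byte) the same two-byte read would have returned two whole characters, which is why the encoding has to be known up front.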
Re: Any python HTML generator libs?
Steve Holden wrote:
> Sullivan WxPyQtKinter wrote:
>> Hi, everyone.
>>
>> Simply put, what I need most now is a Python lib to generate simple
>> HTML.
>>
>> I am now using XML to store my lab report records. I found Python
>> really convenient for manipulating XML, so I want to make a small
>> on-line CGI program to help my colleagues build their lab report
>> records into XML, for storage, HTML display (for others to browse)
>> and search.
>>
>> With Python's standard lib, the DOM object handles the XML storage
>> and search quite easily, but HTML generation is a great headache.
>>
>> I tried turning to the in-line server-side Python script PSP
>> (something like ASP and PHP) instead of CGI. However, since the
>> report data always has very complex data structures, it is really
>> hard to write most things in-line. For example, a PCR reaction is
>> expected to be shown in this format or alike on a web browser:
>>
>>     PCR
>>       Sample: Sm1032
>>       Operator: SullivanZ
>>       TimeStamp: hh:mm mm-dd-
>>       Reaction:
>>         Reagent1:
>>           Name:
>>           Concentration: mM
>>           Volume: XXX uL
>>         Reagent2:
>>
>> Since there are hundreds of PCR reactions and other operations in
>> the lab report, in-line PSP is not a suitable solution. But writing
>> HTML directly with print statements is a great pain.
>>
>> Will XSLT be useful? Is my problem somehow related to the XML-SIG?
>> Looking forward to your suggestions.
>
> The triple-quoted string with string substitution is your friend.
> Try writing something(s) like:
>
>     results = {'secnum': 1, 'type': 'string', 'color': 'blue'}
>     print """\
>     <h1>Section %(secnum)s</h1>
>     <p>Elements of type %(type)s should be coloured %(color)s</p>
>     """ % results

Don't forget that you may need to escape the application's data for
inclusion in HTML:

    results = {'secnum': 1, 'type': 'string', 'color': 'blue',
               'user': 'Matt Goodall <[EMAIL PROTECTED]>'}
    print """\
    <h1>Section %(secnum)s</h1>
    <p>Elements of type %(type)s should be coloured %(color)s</p>
    <p>Contributed by: %(user)s</p>
    """ % results

Will print:

    <h1>Section 1</h1>
    <p>Elements of type string should be coloured blue</p>
    <p>Contributed by: Matt Goodall <[EMAIL PROTECTED]></p>

The '<' and '>' surrounding my email address break the HTML. To fix
that you need to escape results['user'] with cgi.escape or
xml.sax.saxutils.escape. Oh, and don't forget to escape anything
destined for an HTML attribute differently; see
xml.sax.saxutils.quoteattr.

A triple-quoted string is beautifully simple but it's not quite as much
a friend as it might initially seem. ;-)

I don't intend to get into an XML- vs text-based templating flame war
;-) but, IMHO, the solution is to use a templating language that
understands where the value is used in the template.

Kid is a great example of an XML-based templating language but there
are many others. Some have probably been mentioned in this thread
already.
Another interesting solution is to use something like Nevow's tags
module:

    from nevow import flat, tags as T

    results = {'secnum': 1, 'type': 'string', 'color': 'blue',
               'user': 'Matt Goodall <[EMAIL PROTECTED]>'}
    doc = T.div[
        T.h1['Section ', results['secnum']],
        T.p['Elements of type ', results['type'],
            ' should be coloured ', results['color']],
        T.p['Contributed by: ', results['user']],
        ]
    print flat.flatten(doc)

This time you get valid HTML with no effort whatsoever:

    <div><h1>Section 1</h1><p>Elements of type string should be coloured blue</p><p>Contributed by: Matt Goodall &lt;[EMAIL PROTECTED]&gt;</p></div>

You even get to write HTML in a slightly more Pythonic way (even if it
does abuse Python just a little :wink:), but Nevow will happily load a
template containing actual XHTML from disk if you prefer.

The only real problem with using Nevow for this is that you will need
to install Twisted too. I suspect you'll find a couple of Nevow tag
implementations that don't need Twisted if you ask Google.

Anyway! This was meant to demonstrate an alternative approach rather
than to evangelise about Nevow. I hope it was at least interesting. :)

Cheers, Matt

Nevow: http://divmod.org/trac/wiki/DivmodNevow
Twisted: http://twistedmatrix.com/trac
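For anyone who wants to stay with the triple-quoted string, the escaping advice in this message might look roughly like the sketch below. It assumes Python 3, where html.escape stands in for the cgi.escape named in the message; xml.sax.saxutils.quoteattr is the function the message names for attributes. The email address and the span element are made up for illustration.

```python
from html import escape
from xml.sax.saxutils import quoteattr

# Hypothetical data; the address is a placeholder.
results = {'secnum': 1, 'type': 'string', 'color': 'blue',
           'user': 'Matt Goodall <matt@example.com>'}

template = """\
<h1>Section %(secnum)s</h1>
<p>Elements of type %(type)s should be coloured %(color)s</p>
<p>Contributed by: %(user)s</p>
"""

# Escape every value before it is substituted into element content.
safe = {key: escape(str(value)) for key, value in results.items()}
page = template % safe

# Attribute values are handled differently: quoteattr escapes the
# value and also supplies the surrounding quotes.
title_attr = quoteattr(results['user'])
author = '<span title=%s>author</span>' % title_attr

print(page)
print(author)
```

The catch the message points out still applies: nothing here stops you from forgetting the escape step for one value, which is the argument for a templating layer that knows where each value ends up.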
Re: Twisted book opinions?
Jay Parlar wrote:
> I was hoping to get some c.l.p. opinions on O'Reilly's new Twisted
> book.

I think it's a good book to get. I know a fair amount about Twisted but
it still made for interesting reading.

Tommi Virtanen (aka tv) posted a great review of the book shortly after
it was published: http://tv.debian.net/articles/review-snakeball/

- Matt
Re: psycopg2
Jane Goldman wrote:
> Hello,
>
> I am a beginner Python programmer. I am working on a web application
> that accesses PostgreSQL on the backend. After I imported the
> PSYCOPG2 module in my program I started to get unwanted debug output
> in my web browser. It is something like this:
>
>     initpsycopg: initializing psycopg 2.0b6.2 (dec dt ext pq3)
>     typecast_init: initializing NUMBER
>     typecast_new: new type at = 00962920, refcnt = 1
>     typecast_new: typecast object created at 00962920
>     typecast_add: object at 00962920, values refcnt = 2
>     typecast_add: adding val: 20
>     typecast_add: adding val: 23
>     typecast_add: adding val: 21
>     typecast_add: adding val: 701
>     typecast_add: adding val: 700
>     typecast_add: adding val: 1700
>     typecast_add: base caster:
>     typecast_init: initializing LONGINTEGER
>     typecast_new: new type at = 00962960, refcnt = 1
>     typecast_new: typecast object created at 00962960
>     typecast_add: object at 00962960, values refcnt = 2
>     typecast_add: and so on ...
>
> I use a Cheetah template to generate the HTML code. I run ActivePython
> 2.4 in a Windows XP environment. Any help with stopping this garbage
> from appearing in my web browser would be highly appreciated.

It looks like you installed a version that has the PSYCOPG_DEBUG flag
turned on. That flag causes psycopg2 to emit gobs of useless
information ;-).

I would recommend downloading the latest beta release from
http://initd.org/. Alternatively, you can remove PSYCOPG_DEBUG from
setup.cfg in the version you already have and reinstall.

Hope this helps.

- Matt
Re: DB API and thread safety
Robin Haswell wrote:
> Hey guys
>
> I've been reading http://www.python.org/peps/pep-0249.html and I
> don't quite get what level of thread safety I need for my DB
> connections.
>
> If I call db = FOOdb.connect() at the start of my app, and then every
> thread does its own c = db.cursor() at the top, what level of thread
> safety do I need to avoid threads stepping on each other? Hopefully
> the answer to this question will get me oriented enough to understand
> the other options :-)

Assuming the cursor created in the thread is never accessed by another
thread, you need a dbapi module that supports threadsafety level 2 -
threads may share the module and connections.

- Matt
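As a rough illustration of the answer above, a program can check the PEP 249 module attribute threadsafety before deciding whether one connection may be shared. The sketch below assumes Python 3; can_share_connection, ConnectionProvider and fake_driver are hypothetical names made up for this example, and the stand-in driver only mimics the two attributes the sketch needs.

```python
import threading
import types


def can_share_connection(dbapi_module):
    """True when PEP 249 says threads may share the module and connections."""
    return getattr(dbapi_module, "threadsafety", 0) >= 2


class ConnectionProvider:
    """Hand out one shared connection, or one per thread, as the driver allows."""

    def __init__(self, dbapi_module, *args, **kwargs):
        self._module = dbapi_module
        self._args = args
        self._kwargs = kwargs
        self._shared = None
        self._lock = threading.Lock()
        self._local = threading.local()

    def get(self):
        if can_share_connection(self._module):
            # Level 2 or 3: a single connection serves every thread.
            with self._lock:
                if self._shared is None:
                    self._shared = self._module.connect(*self._args, **self._kwargs)
                return self._shared
        # Level 0 or 1: fall back to one connection per thread.
        conn = getattr(self._local, "conn", None)
        if conn is None:
            conn = self._module.connect(*self._args, **self._kwargs)
            self._local.conn = conn
        return conn


# Stand-in for a real DB-API module (hypothetical, for illustration only).
fake_driver = types.SimpleNamespace(threadsafety=2, connect=lambda: object())
provider = ConnectionProvider(fake_driver)
print(can_share_connection(fake_driver), provider.get() is provider.get())
```

Either way, each thread would still create and keep its own cursor from whatever connection get() returns, as in the question.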
Re: recommended way of generating HTML from Python
On Mon, 2005-02-21 at 07:36 -0500, Kent Johnson wrote:
> Michele Simionato wrote:
>> The problem is a problem of standardization, indeed. There are
>> plenty of recipes to do the same job, I just would like to use a
>> blessed one (I am teaching a Python course and I do not know what to
>> recommend to my students).
>
> Why not teach your students to use a template system?
>
>> FWIW, here is my version of the recipe (stripped down to the bare
>> essentials):
>>
>>     def makeattr(dict_or_list_of_pairs):
>>         dic = dict(dict_or_list_of_pairs)
>>         return " ".join("%s=%r" % (k, dic[k]) for k in dic)
>>
>>     class HTMLTag(object):
>>         def __getattr__(self, name):
>>             def tag(value, **attr):
>>                 """value can be a string or a sequence of strings."""
>>                 if hasattr(value, "__iter__"): # is iterable
>>                     value = " ".join(value)
>>                 return "<%s %s>%s</%s>\n" % (name, makeattr(attr), value, name)
>>             return tag
>>
>>     # example:
>>     html = HTMLTag()
>>     tableheader = ["field1", "field2"]
>>     tablebody = [["a1", "a2"],
>>                  ["b1", "b2"]]
>>     html_header = [html.tr(html.th(el) for el in tableheader)]
>>     html_table = [html.tr(html.td(el) for el in row) for row in tablebody]
>>     print html.table(html_header + html_table)
>
> *Shudder*
>
> I've written web pages this way (using a pretty nice Java HTML
> generation package) and I don't recommend it. In my experience, this
> approach has several drawbacks:
>
> - As soon as the web page gets at all complex, the conceptual shift
>   from HTML to code and back is difficult.
> - It is hard to work with a designer. The designer will give you
>   sample web pages which then have to be hand-translated to code.
>   Changes to the web page have to be located in the code.
> - There is no separation of content and presentation.

Slightly off topic but, just to be clear, Nevow supports XHTML
templates *and* stan tag expressions. Both are useful. As a general
rule, I stick to XHTML templates but I use stan for prototyping pages
and when marking up the XHTML templates gets so complicated that they
might as well be written in Python anyway.
Also, just because the HTML code is not in a .html file does not
necessarily mean that content and presentation are mixed up. For
instance, with stan (and probably the alternatives mentioned in this
thread) it's very easy to build a tag library that the real Python code
simply calls on to render page content.

> IMO templating systems are a much better solution. They let you
> express HTML in HTML directly; you communicate with a designer in a
> language the designer understands; you can separate content and
> presentation.

Agreed. Although I would go further and say that it's important to
choose a templating system that allows the Python developer to annotate
XHTML templates using **valid XML**, i.e. no for x in y loops, no if
foo conditionals, no i = 0 variable setting, no expression evaluations,
etc.

<advocacy>
The lovely thing about Nevow is that it encourages good separation -
HTML is HTML and logic is Python, as it should be - but does not get in
the way when breaking the rules is necessary or just a lot easier.
</advocacy>

Cheers, Matt
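The "annotate templates using valid XML" idea can be sketched with nothing but the standard library. To be clear, this is not Nevow's API: the pattern and slot attribute names, the render function and the sample rows are all made up for illustration, and the sketch assumes Python 3. The template stays well-formed XML that a designer could open in a browser, and all looping lives in Python:

```python
import copy
import xml.etree.ElementTree as ET

# The template contains no loops or expressions, only marker attributes.
TEMPLATE = """\
<table>
  <tr><th>Field</th><th>Value</th></tr>
  <tr pattern="row"><td slot="name" /><td slot="value" /></tr>
</table>
"""


def render(template, rows):
    """Repeat the element marked pattern="row" once per dict in rows."""
    root = ET.fromstring(template)
    # Snapshot the tree first so it is safe to modify while walking it.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("pattern") != "row":
                continue
            position = list(parent).index(child)
            parent.remove(child)
            for offset, row in enumerate(rows):
                clone = copy.deepcopy(child)
                del clone.attrib["pattern"]
                for cell in clone.iter():
                    slot = cell.attrib.pop("slot", None)
                    if slot is not None:
                        # ElementTree escapes text when serialising.
                        cell.text = str(row[slot])
                parent.insert(position + offset, clone)
    return ET.tostring(root, encoding="unicode")


page = render(TEMPLATE, [
    {"name": "a1", "value": "a < b"},
    {"name": "b1", "value": "b2"},
])
print(page)
```

Because values are set as element text rather than pasted into a string, escaping comes for free, which ties back to the separation argument in this message.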