Re: how to avoid leading white spaces

2011-06-08 Thread rusi
On Jun 7, 11:37 pm, ru...@yahoo.com ru...@yahoo.com wrote: On 06/06/2011 08:33 AM, rusi wrote: For any significant language feature (take recursion for example) there are these issues: 1. Ease of reading/skimming (other's) code 2. Ease of writing/designing one's own 3. Learning curve

Re: how to avoid leading white spaces

2011-06-08 Thread Duncan Booth
ru...@yahoo.com ru...@yahoo.com wrote: On 06/06/2011 09:29 AM, Steven D'Aprano wrote: Yes, but you have to pay the cost of loading the re engine, even if it is a one off cost, it's still a cost, ~$ time python -c 'pass' real 0m0.015s user 0m0.011s sys 0m0.003s ~$ time

Re: how to avoid leading white spaces

2011-06-08 Thread ru...@yahoo.com
On 06/08/2011 03:01 AM, Duncan Booth wrote: ru...@yahoo.com ru...@yahoo.com wrote: On 06/06/2011 09:29 AM, Steven D'Aprano wrote: Yes, but you have to pay the cost of loading the re engine, even if it is a one off cost, it's still a cost, [...] At least part of the reason that there's no

Re: how to avoid leading white spaces

2011-06-08 Thread ru...@yahoo.com
On 06/07/2011 06:30 PM, Roy Smith wrote: On 06/06/2011 08:33 AM, rusi wrote: Evidently for syntactic, implementation and cultural reasons, Perl programmers are likely to get (and then overuse) regexes faster than python programmers. ru...@yahoo.com ru...@yahoo.com wrote: I don't see how the

Re: how to avoid leading white spaces

2011-06-08 Thread rusi
On Jun 8, 7:38 pm, ru...@yahoo.com ru...@yahoo.com wrote: On 06/07/2011 06:30 PM, Roy Smith wrote: On 06/06/2011 08:33 AM, rusi wrote: Evidently for syntactic, implementation and cultural reasons, Perl programmers are likely to get (and then overuse) regexes faster than python

Re: how to avoid leading white spaces

2011-06-08 Thread Chris Torek
On 03/06/2011 03:58, Chris Torek wrote: - This is a bit surprising, since both s1 in s2 and re.search() could use a Boyer-Moore-based algorithm for a sufficiently-long fixed string, and the time required should be proportional to that needed to

Re: how to avoid leading white spaces

2011-06-07 Thread ru...@yahoo.com
On 06/06/2011 09:29 AM, Steven D'Aprano wrote: On Sun, 05 Jun 2011 23:03:39 -0700, ru...@yahoo.com wrote: [...] I would argue that the first, non-regex solution is superior, as it clearly distinguishes the multiple steps of the solution: * filter lines that start with CUSTOMER * extract

Re: how to avoid leading white spaces

2011-06-07 Thread ru...@yahoo.com
On 06/06/2011 08:33 AM, rusi wrote: For any significant language feature (take recursion for example) there are these issues: 1. Ease of reading/skimming (other's) code 2. Ease of writing/designing one's own 3. Learning curve 4. Costs/payoffs (eg efficiency, succinctness) of use 5.

Re: how to avoid leading white spaces

2011-06-07 Thread Roy Smith
On 06/06/2011 08:33 AM, rusi wrote: Evidently for syntactic, implementation and cultural reasons, Perl programmers are likely to get (and then overuse) regexes faster than python programmers. ru...@yahoo.com ru...@yahoo.com wrote: I don't see how the different Perl and Python cultures

Re: how to avoid leading white spaces

2011-06-06 Thread ru...@yahoo.com
On 06/03/2011 08:05 PM, Steven D'Aprano wrote: On Fri, 03 Jun 2011 12:29:52 -0700, ru...@yahoo.com wrote: I often find myself changing, for example, a startswith() to a RE when I realize that the input can contain mixed case Why wouldn't you just normalise the case? Because some of the text

Re: how to avoid leading white spaces

2011-06-06 Thread Chris Torek
In article ef48ad50-da06-47a8-978a-47d6f4271...@d28g2000yqf.googlegroups.com ru...@yahoo.com ru...@yahoo.com wrote (in part): [mass snippage] What I mean is that I see regexes as being an extremely small, highly restricted, domain specific language targeted specifically at describing text

Re: how to avoid leading white spaces

2011-06-06 Thread Octavian Rasnita
In article ef48ad50-da06-47a8-978a-47d6f4271...@d28g2000yqf.googlegroups.com ru...@yahoo.com ru...@yahoo.com wrote (in part): [mass snippage] What I mean is that I see regexes

Re: how to avoid leading white spaces

2011-06-06 Thread Chris Angelico
On Mon, Jun 6, 2011 at 6:51 PM, Octavian Rasnita orasn...@gmail.com wrote: It is not so hard to decide whether using RE is a good thing or not. When the speed is important and every millisecond counts, RE should be used only when there is no other faster way, because usually RE is less faster

Re: how to avoid leading white spaces

2011-06-06 Thread rusi
For any significant language feature (take recursion for example) there are these issues: 1. Ease of reading/skimming (other's) code 2. Ease of writing/designing one's own 3. Learning curve 4. Costs/payoffs (eg efficiency, succinctness) of use 5. Debug-ability I'll start with 3. When someone of

Re: how to avoid leading white spaces

2011-06-06 Thread Steven D'Aprano
On Sun, 05 Jun 2011 23:03:39 -0700, ru...@yahoo.com wrote: Thus what starts as if line.startswith ('CUSTOMER '): try: kw, first_initial, last_name, code, rest = line.split(None, 4) ... often turns into (sometimes before it is written) something like m = re.match
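The two styles being compared in this message can be sketched side by side. The field layout (keyword, first initial, last name, code, rest) follows the quoted split() call; the sample line and the regex pattern are assumptions, since the original pattern is cut off in this listing.

```python
import re

# Hypothetical sample record; the thread shows no real data for this case.
LINE = "CUSTOMER J Smith 4417 paid in full"


def parse_with_str_methods(line):
    """Return the four data fields, or None if this is not a CUSTOMER record."""
    if not line.startswith("CUSTOMER "):
        return None
    try:
        kw, first_initial, last_name, code, rest = line.split(None, 4)
    except ValueError:  # fewer than five whitespace-separated fields
        return None
    return first_initial, last_name, code, rest


# Assumed pattern: one capital initial, a word for the name, a numeric code.
CUSTOMER_RE = re.compile(r"CUSTOMER\s+([A-Z])\s+(\w+)\s+(\d+)\s+(.*)")


def parse_with_regex(line):
    """Same job as above, with validation folded into the pattern."""
    m = CUSTOMER_RE.match(line)
    return m.groups() if m else None
```

The str-method version keeps filtering and extraction as separate steps, while the regex version also checks the shape of each field; which of the two reads better is the point under debate in the thread.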

Re: how to avoid leading white spaces

2011-06-06 Thread Ian Kelly
On Mon, Jun 6, 2011 at 9:29 AM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: [...] I would expect any regex processor to compile the regex into an FSM. Flying Spaghetti Monster? I have been Touched by His Noodly Appendage!!! Finite State Machine. --

Re: how to avoid leading white spaces

2011-06-06 Thread Neil Cerutti
On 2011-06-06, ru...@yahoo.com ru...@yahoo.com wrote: On 06/03/2011 02:49 PM, Neil Cerutti wrote: Can you find an example or invent one? I simply don't remember such problems coming up, but I admit it's possible. Sure, the response to the OP of this thread. Here's a recap, along with two

Re: how to avoid leading white spaces

2011-06-06 Thread Ian Kelly
On Mon, Jun 6, 2011 at 10:08 AM, Neil Cerutti ne...@norwich.edu wrote: import re print("re solution") with open("data.txt") as f:    for line in f:        fixed = re.sub(r"(TABLE='\S+)\s+'", r"\1'", line)        print(fixed, end='') print("non-re solution") with open("data.txt") as f:    for line
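For readers who want to try the regex approach quoted here, a minimal sketch follows. It applies the same substitution to in-memory lines instead of data.txt; the sample lines are modelled on the original poster's data, and the helper name strip_table_padding is made up for illustration.

```python
import re

# Sample lines modelled on the original poster's file.
LINES = [
    "// UNLDSYST=UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCDJ   '\n",
    "// UNLDSYST=UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCT'\n",
]

# Capture TABLE=' plus the name, then match the padding before the closing quote.
TABLE_PADDING = re.compile(r"(TABLE='\S+)\s+'")


def strip_table_padding(line):
    """Drop whitespace between the table name and its closing quote."""
    return TABLE_PADDING.sub(r"\1'", line)


fixed = [strip_table_padding(line) for line in LINES]
```

Lines whose table name has no padding do not match the pattern and pass through unchanged.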

Re: how to avoid leading white spaces

2011-06-06 Thread Neil Cerutti
On 2011-06-06, Ian Kelly ian.g.ke...@gmail.com wrote: On Mon, Jun 6, 2011 at 10:08 AM, Neil Cerutti ne...@norwich.edu wrote: import re print("re solution") with open("data.txt") as f: for line in f: fixed = re.sub(r"(TABLE='\S+)\s+'", r"\1'", line) print(fixed, end='')

Re: how to avoid leading white spaces

2011-06-06 Thread Ethan Furman
Ian Kelly wrote: On Mon, Jun 6, 2011 at 10:08 AM, Neil Cerutti ne...@norwich.edu wrote: import re print("re solution") with open("data.txt") as f: for line in f: fixed = re.sub(r"(TABLE='\S+)\s+'", r"\1'", line) print(fixed, end='') print("non-re solution") with open("data.txt") as f:

Re: how to avoid leading white spaces

2011-06-06 Thread Ian Kelly
On Mon, Jun 6, 2011 at 11:17 AM, Neil Cerutti ne...@norwich.edu wrote: I wrestled with using addition like that, and decided against it. The 7 is a magic number and repeats/hides information. I wanted something like:   prefix = "TABLE='"   start = line.index(prefix) + len(prefix) But decided
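The prefix-plus-len() idea mentioned here could be fleshed out roughly as below. This is a sketch, not the code either poster actually wrote (their versions are truncated in this listing); the function name and the choice to return unmatched lines unchanged are assumptions.

```python
PREFIX = "TABLE='"


def strip_table_padding(line):
    """Remove trailing blanks inside the TABLE='...' value, without regexes."""
    try:
        # len(PREFIX) replaces the magic number 7 discussed in the thread.
        start = line.index(PREFIX) + len(PREFIX)
        end = line.index("'", start)  # position of the closing quote
    except ValueError:
        # No TABLE=' on this line (or no closing quote): leave it alone.
        return line
    return line[:start] + line[start:end].rstrip() + line[end:]
```

str.index raises ValueError on every line that lacks the prefix, which is the exception cost raised elsewhere in the thread; str.find with a check for -1 would avoid the exception at the price of an explicit branch.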

Re: how to avoid leading white spaces

2011-06-06 Thread Ian Kelly
On Mon, Jun 6, 2011 at 11:48 AM, Ethan Furman et...@stoneleaf.us wrote: I like the readability of this version, but isn't generating an exception on every other line going to kill performance? I timed it on the example data before I posted and found that it was still 10 times as fast as the

Re: how to avoid leading white spaces

2011-06-06 Thread Neil Cerutti
On 2011-06-06, Ian Kelly ian.g.ke...@gmail.com wrote: Fair enough, although if you ask me the + 1 is just as magical as the + 7 (it's still the length of the string that you're searching for). Also, re-finding the opening ' still repeats information. Heh, true. It doesn't really repeat

Re: how to avoid leading white spaces

2011-06-06 Thread Ian
On 03/06/2011 03:58, Chris Torek wrote: - This is a bit surprising, since both s1 in s2 and re.search() could use a Boyer-Moore-based algorithm for a sufficiently-long fixed string, and the time required should be proportional to that needed to

Re: how to avoid leading white spaces

2011-06-05 Thread rusi
On Jun 3, 7:25 pm, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: Regarding their syntax, I'd like to point out that even Larry Wall is dissatisfied with regex culture in the Perl community: http://www.perl.com/pub/2002/06/04/apo5.html This is a very good link. And it can be a

Re: how to avoid leading white spaces

2011-06-05 Thread ru...@yahoo.com
On 06/03/2011 02:49 PM, Neil Cerutti wrote: On 2011-06-03, ru...@yahoo.com ru...@yahoo.com wrote: or that I have to treat commas as well as spaces as delimiters. source.replace(",", " ").split(" ") Uhgg. create a whole new string just so you can split it on one rather than two

Re: how to avoid leading white spaces

2011-06-05 Thread ru...@yahoo.com
On 06/03/2011 03:45 PM, Chris Torek wrote: On 2011-06-03, ru...@yahoo.com ru...@yahoo.com wrote: [prefers] re.split ('[ ,]', source) This is probably not what you want in dealing with human-created text: re.split('[ ,]', 'foo bar, spam,maps') ['foo', '', 'bar', '', 'spam',

Re: how to avoid leading white spaces

2011-06-04 Thread Chris Angelico
On Sat, Jun 4, 2011 at 12:30 PM, Roy Smith r...@panix.com wrote: Another nice thing about regexes (as compared to string methods) is that they're both portable and serializable.  You can use the same regex in Perl, Python, Ruby, PHP, etc.  You can transmit them over a network connection to a

Re: how to avoid leading white spaces

2011-06-04 Thread Roy Smith
I wrote: Another nice thing about regexes (as compared to string methods) is that they're both portable and serializable. You can use the same regex in Perl, Python, Ruby, PHP, etc. In article 4de9bf50$0$29996$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano

Re: how to avoid leading white spaces

2011-06-04 Thread rusi
The efficiency argument is specious. [This is a python list not a C or assembly list] The real issue is that complex regexes are hard to get right -- even if one is experienced. This is analogous to the fact that knotty programs can be hard to get right even for experienced programmers. The

Re: how to avoid leading white spaces

2011-06-04 Thread Nobody
On Sat, 04 Jun 2011 13:41:33 +1200, Gregory Ewing wrote: Python might be penalized by its use of Unicode here, since a Boyer-Moore table for a full 16-bit Unicode string would need 65536 entries But is there any need for the Boyer-Moore algorithm to operate on characters? Seems to me

Re: how to avoid leading white spaces

2011-06-04 Thread Nobody
On Sat, 04 Jun 2011 05:14:56 +, Steven D'Aprano wrote: This fails to support non-ASCII letters, and you know quite well that having to spell out by hand regexes in both upper and lower (or mixed) case is not support for case-insensitive matching. That's why Python's re has a case

Re: how to avoid leading white spaces

2011-06-04 Thread Steven D'Aprano
On Sat, 04 Jun 2011 09:39:24 -0400, Roy Smith wrote: To be sure, if you explore the edges of the regex syntax space, you can write non-portable expressions. You don't even have to get very far out to the edge. But, as you say, if you limit yourself to a subset, you can write portable ones.

Re: how to avoid leading white spaces

2011-06-04 Thread Steven D'Aprano
On Sat, 04 Jun 2011 21:02:32 +0100, Nobody wrote: On Sat, 04 Jun 2011 05:14:56 +, Steven D'Aprano wrote: This fails to support non-ASCII letters, and you know quite well that having to spell out by hand regexes in both upper and lower (or mixed) case is not support for case-insensitive

Re: how to avoid leading white spaces

2011-06-03 Thread Thorsten Kampe
* Roy Smith (Thu, 02 Jun 2011 21:57:16 -0400) In article 94ph22frh...@mid.individual.net, Neil Cerutti ne...@norwich.edu wrote: On 2011-06-01, ru...@yahoo.com ru...@yahoo.com wrote: For some odd reason (perhaps because they are used a lot in Perl), this group seems to have a great

Re: how to avoid leading white spaces

2011-06-03 Thread ru...@yahoo.com
On 06/02/2011 07:21 AM, Neil Cerutti wrote: On 2011-06-01, ru...@yahoo.com ru...@yahoo.com wrote: For some odd reason (perhaps because they are used a lot in Perl), this group seems to have a great aversion to regular expressions. Too bad because this is a typical problem where their

Re: how to avoid leading white spaces

2011-06-03 Thread Nobody
On Fri, 03 Jun 2011 04:30:46 +, Chris Torek wrote: I'm not sure what you mean by full 16-bit Unicode string? Isn't unicode inherently 32 bit? Well, not exactly. As I understand it, Python is normally built with a 16-bit unicode character type though It's normally 32-bit on platforms

Re: how to avoid leading white spaces

2011-06-03 Thread Neil Cerutti
On 2011-06-03, ru...@yahoo.com ru...@yahoo.com wrote: The other tradeoff, applying both to Perl and Python is with maintenance. As mentioned above, even when today's requirements can be solved with some code involving several string functions, indexes, and conditionals, when those

Re: how to avoid leading white spaces

2011-06-03 Thread Nobody
On Fri, 03 Jun 2011 02:58:24 +, Chris Torek wrote: Python might be penalized by its use of Unicode here, since a Boyer-Moore table for a full 16-bit Unicode string would need 65536 entries (one per possible ord() value). However, if the string being sought is all single-byte values, a

Re: how to avoid leading white spaces

2011-06-03 Thread Steven D'Aprano
On Fri, 03 Jun 2011 05:51:18 -0700, ru...@yahoo.com wrote: On 06/02/2011 07:21 AM, Neil Cerutti wrote: Python's str methods, when they're sufficient, are usually more efficient. Unfortunately, except for the very simplest cases, they are often not sufficient. Maybe so, but the very

Re: how to avoid leading white spaces

2011-06-03 Thread D'Arcy J.M. Cain
On 03 Jun 2011 14:25:53 GMT Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: source.replace(",", " ").split(" ") I would do; source.replace(",", " ").split() [steve@sylar ~]$ python -m timeit -s "source = 'a b c,d,e,f,g h i j k'" What if the string is 'a b c, d, e,f,g h i j k'?
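The difference between split(" ") and the argument-less split() proposed here shows up as soon as a comma is followed by a space. A small sketch, using the string from this message (the function names are made up for illustration):

```python
def split_on_single_space(source):
    """Commas become spaces, then split on each individual space."""
    return source.replace(",", " ").split(" ")


def split_on_whitespace_runs(source):
    """Commas become spaces, then split on runs of whitespace."""
    return source.replace(",", " ").split()


SOURCE = "a b c, d, e,f,g h i j k"
with_empties = split_on_single_space(SOURCE)
without_empties = split_on_whitespace_runs(SOURCE)
```

With an explicit " " separator every space is a boundary, so ", " yields an empty field; with no argument, split() treats any run of whitespace as one separator and never returns empty strings.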

Re: how to avoid leading white spaces

2011-06-03 Thread ru...@yahoo.com
On 06/03/2011 07:17 AM, Neil Cerutti wrote: On 2011-06-03, ru...@yahoo.com ru...@yahoo.com wrote: The other tradeoff, applying both to Perl and Python is with maintenance. As mentioned above, even when today's requirements can be solved with some code involving several string functions,

Re: how to avoid leading white spaces

2011-06-03 Thread ru...@yahoo.com
On 06/03/2011 08:25 AM, Steven D'Aprano wrote: On Fri, 03 Jun 2011 05:51:18 -0700, ru...@yahoo.com wrote: On 06/02/2011 07:21 AM, Neil Cerutti wrote: Python's str methods, when they're sufficient, are usually more efficient. Unfortunately, except for the very simplest cases, they are

Re: how to avoid leading white spaces

2011-06-03 Thread Neil Cerutti
On 2011-06-03, ru...@yahoo.com ru...@yahoo.com wrote: or that I have to treat commas as well as spaces as delimiters. source.replace(",", " ").split(" ") Uhgg. create a whole new string just so you can split it on one rather than two characters? Sorry, but I find re.split ('[ ,]', source)

Re: how to avoid leading white spaces

2011-06-03 Thread Chris Torek
On 2011-06-03, ru...@yahoo.com ru...@yahoo.com wrote: [prefers] re.split ('[ ,]', source) This is probably not what you want in dealing with human-created text: re.split('[ ,]', 'foo bar, spam,maps') ['foo', '', 'bar', '', 'spam', 'maps'] Instead, you probably want a comma
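The empty strings in the quoted result come from adjacent delimiters. The output shown corresponds to an input with two spaces after 'foo', which this flattened listing no longer displays; the sketch below assumes that input. The '+' variant is one possible adjustment, not necessarily the pattern suggested in the truncated message.

```python
import re

# Assumed input: note the doubled space after "foo".
TEXT = "foo  bar, spam,maps"

# Every single space or comma is a boundary, so adjacent
# delimiters produce empty strings between them.
per_delimiter = re.split("[ ,]", TEXT)

# Treating a run of spaces and commas as one delimiter avoids the empties.
per_run = re.split("[ ,]+", TEXT)
```

Here per_delimiter is ['foo', '', 'bar', '', 'spam', 'maps'] and per_run is ['foo', 'bar', 'spam', 'maps'].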

Re: how to avoid leading white spaces

2011-06-03 Thread Ethan Furman
Chris Torek wrote: On 2011-06-03, ru...@yahoo.com ru...@yahoo.com wrote: [prefers] re.split ('[ ,]', source) This is probably not what you want in dealing with human-created text: re.split('[ ,]', 'foo bar, spam,maps') ['foo', '', 'bar', '', 'spam', 'maps'] I think you've got

Re: how to avoid leading white spaces

2011-06-03 Thread MRAB
On 03/06/2011 23:11, Ethan Furman wrote: Chris Torek wrote: On 2011-06-03, ru...@yahoo.com ru...@yahoo.com wrote: [prefers] re.split ('[ ,]', source) This is probably not what you want in dealing with human-created text: re.split('[ ,]', 'foo bar, spam,maps') ['foo', '', 'bar', '',

Re: how to avoid leading white spaces

2011-06-03 Thread Gregory Ewing
Chris Torek wrote: Python might be penalized by its use of Unicode here, since a Boyer-Moore table for a full 16-bit Unicode string would need 65536 entries But is there any need for the Boyer-Moore algorithm to operate on characters? Seems to me you could just as well chop the UTF-16 up into
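The suggestion here, searching the encoded bytes instead of the characters, can be sketched as follows. bytes.find stands in for whatever byte-level search algorithm would be used, the function name is made up, and the returned offset counts UTF-16 code units, which only equals the str index when no characters outside the Basic Multilingual Plane precede the match.

```python
def find_via_utf16_bytes(haystack, needle):
    """Return the UTF-16 code-unit offset of needle in haystack, or -1."""
    h = haystack.encode("utf-16-le")
    n = needle.encode("utf-16-le")
    pos = h.find(n)
    # A hit at an odd byte offset straddles two code units and is
    # not a real match, so keep searching from the next byte.
    while pos != -1 and pos % 2:
        pos = h.find(n, pos + 1)
    return -1 if pos == -1 else pos // 2
```

Working on bytes keeps any skip table at 256 entries, at the cost of the alignment check shown in the loop.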

Re: how to avoid leading white spaces

2011-06-03 Thread Steven D'Aprano
On Fri, 03 Jun 2011 12:29:52 -0700, ru...@yahoo.com wrote: I often find myself changing, for example, a startswith() to a RE when I realize that the input can contain mixed case Why wouldn't you just normalise the case? Because some of the text may be case-sensitive. Perhaps you
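Both options discussed in this exchange can be sketched briefly. The keyword "customer " is borrowed from examples elsewhere in the thread, and the function names are made up. Note that lower() is applied only for the comparison, so the original line, including any case-sensitive text, is left untouched.

```python
import re

KEYWORD = "customer "


def starts_with_keyword(line):
    """Normalise a copy of the line for the test; the line itself is unchanged."""
    return line.lower().startswith(KEYWORD)


# The regex alternative: the IGNORECASE flag applies to the whole pattern.
KEYWORD_RE = re.compile(re.escape(KEYWORD), re.IGNORECASE)


def starts_with_keyword_re(line):
    return KEYWORD_RE.match(line) is not None
```

For a plain prefix test the two behave the same on ASCII input; the regex starts to pay off once the rest of the line also has to be validated or captured.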

Re: how to avoid leading white spaces

2011-06-03 Thread MRAB
On 04/06/2011 03:05, Steven D'Aprano wrote: On Fri, 03 Jun 2011 12:29:52 -0700, ru...@yahoo.com wrote: I often find myself changing, for example, a startswith() to a RE when I realize that the input can contain mixed case Why wouldn't you just normalise the case? Because some of the text

Re: how to avoid leading white spaces

2011-06-03 Thread Roy Smith
In article 4de992d7$0$29996$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: Of course, if you include both case-sensitive and insensitive tests in the same calculation, that's a good candidate for a regex... or at least it would be if regexes

Re: how to avoid leading white spaces

2011-06-03 Thread Steven D'Aprano
On Sat, 04 Jun 2011 03:24:50 +0100, MRAB wrote: [snip] Some regex implementations support scoped case sensitivity. :-) Yes, you should link to your regex library :) Have you considered the suggested Perl 6 syntax? Much of it looks good to me. I have at times thought that it would be
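As a point of reference for scoped case sensitivity: the standard-library re module gained scoped inline flags, written (?i:...), in Python 3.6, well after this thread. The sketch below assumes Python 3.6 or later; the pattern and sample strings are invented for illustration.

```python
import re

# The keyword is matched case-insensitively; the code after it
# (two capital letters and digits) stays case-sensitive.
PATTERN = re.compile(r"(?i:customer)\s+([A-Z]{2}\d+)")

mixed_case_keyword = PATTERN.match("CuStOmEr AB12")
lower_case_code = PATTERN.match("customer ab12")
```

The first match succeeds and captures 'AB12'; the second is None because the flag does not extend past the closing parenthesis of the group.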

Re: how to avoid leading white spaces

2011-06-03 Thread Steven D'Aprano
On Fri, 03 Jun 2011 22:30:59 -0400, Roy Smith wrote: In article 4de992d7$0$29996$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: Of course, if you include both case-sensitive and insensitive tests in the same calculation, that's a good

Re: how to avoid leading white spaces

2011-06-02 Thread Neil Cerutti
On 2011-06-01, ru...@yahoo.com ru...@yahoo.com wrote: For some odd reason (perhaps because they are used a lot in Perl), this group seems to have a great aversion to regular expressions. Too bad because this is a typical problem where their use is the best solution. Python's str methods,

Re: how to avoid leading white spaces

2011-06-02 Thread Roy Smith
In article 94ph22frh...@mid.individual.net, Neil Cerutti ne...@norwich.edu wrote: On 2011-06-01, ru...@yahoo.com ru...@yahoo.com wrote: For some odd reason (perhaps because they are used a lot in Perl), this group seems to have a great aversion to regular expressions. Too bad because

Re: how to avoid leading white spaces

2011-06-02 Thread MRAB
On 03/06/2011 02:57, Roy Smith wrote: In article 94ph22frh...@mid.individual.net, Neil Cerutti ne...@norwich.edu wrote: On 2011-06-01, ru...@yahoo.com ru...@yahoo.com wrote: For some odd reason (perhaps because they are used a lot in Perl), this group seems to have a great aversion to

Re: how to avoid leading white spaces

2011-06-02 Thread Chris Torek
In article 94ph22frh...@mid.individual.net Neil Cerutti ne...@norwich.edu wrote: Python's str methods, when they're sufficient, are usually more efficient. In article roy-e2fa6f.21571602062...@news.panix.com Roy Smith r...@panix.com replied: I was all set to say, prove it! when I decided to
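Anyone wanting to repeat this kind of measurement could start from a sketch like the one below. The haystack, needle, and repeat count are arbitrary assumptions, not the values used in the thread, and the numbers will vary by machine and Python version, so no expected timings are given.

```python
import re
import timeit

# Arbitrary test data: the needle sits at the very end of the haystack.
HAYSTACK = "x" * 10_000 + "needle"
NEEDLE = "needle"
NEEDLE_RE = re.compile(re.escape(NEEDLE))  # compiled once, outside the timing


def time_both(number=1000):
    """Return (seconds for 'in', seconds for re.search) over `number` runs."""
    t_in = timeit.timeit(lambda: NEEDLE in HAYSTACK, number=number)
    t_re = timeit.timeit(lambda: NEEDLE_RE.search(HAYSTACK), number=number)
    return t_in, t_re


if __name__ == "__main__":
    t_in, t_re = time_both()
    print(f"in: {t_in:.4f}s  re.search: {t_re:.4f}s")
```

Compiling the pattern up front keeps the one-off compilation cost out of the loop, so the comparison is between the two search operations only.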

Re: how to avoid leading white spaces

2011-06-02 Thread Roy Smith
In article is9ikg0...@news1.newsguy.com, Chris Torek nos...@torek.net wrote: Python might be penalized by its use of Unicode here, since a Boyer-Moore table for a full 16-bit Unicode string would need 65536 entries (one per possible ord() value). I'm not sure what you mean by full 16-bit

Re: how to avoid leading white spaces

2011-06-02 Thread Chris Angelico
On Fri, Jun 3, 2011 at 1:44 PM, Roy Smith r...@panix.com wrote: In article is9ikg0...@news1.newsguy.com,  Chris Torek nos...@torek.net wrote: Python might be penalized by its use of Unicode here, since a Boyer-Moore table for a full 16-bit Unicode string would need 65536 entries (one per

Re: how to avoid leading white spaces

2011-06-02 Thread Chris Angelico
On Fri, Jun 3, 2011 at 1:52 PM, Chris Angelico ros...@gmail.com wrote: However, Unicode planes 0-2 have all the defined printable characters PS. I'm fully aware that there are ranges defined in page 14 / E. They're non-printing characters, and unlikely to be part of a text string, although it

Re: how to avoid leading white spaces

2011-06-02 Thread Chris Torek
In article is9ikg0...@news1.newsguy.com, Chris Torek nos...@torek.net wrote: Python might be penalized by its use of Unicode here, since a Boyer-Moore table for a full 16-bit Unicode string would need 65536 entries (one per possible ord() value). In article

how to avoid leading white spaces

2011-06-01 Thread rakesh kumar
Hi i have a file which contains data //ACCDJ EXEC DB2UNLDC,DFLID=DFLID,PARMLIB=PARMLIB, // UNLDSYST=UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCDJ ' //ACCT EXEC DB2UNLDC,DFLID=DFLID,PARMLIB=PARMLIB, // UNLDSYST=UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCT' //ACCUM

Re: how to avoid leading white spaces

2011-06-01 Thread Chris Rebert
On Wed, Jun 1, 2011 at 12:31 AM, rakesh kumar rakeshkumar.tec...@gmail.com wrote: Hi i have a file which contains data //ACCDJ EXEC DB2UNLDC,DFLID=DFLID,PARMLIB=PARMLIB, // UNLDSYST=UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCDJ   ' //ACCT  EXEC

Re: how to avoid leading white spaces

2011-06-01 Thread ru...@yahoo.com
On Jun 1, 11:11 am, Chris Rebert c...@rebertia.com wrote: On Wed, Jun 1, 2011 at 12:31 AM, rakesh kumar Hi i have a file which contains data //ACCDJ EXEC DB2UNLDC,DFLID=DFLID,PARMLIB=PARMLIB, // UNLDSYST=UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCDJ   ' //ACCT 

Re: how to avoid leading white spaces

2011-06-01 Thread Karim
On 06/01/2011 09:39 PM, ru...@yahoo.com wrote: On Jun 1, 11:11 am, Chris Rebert c...@rebertia.com wrote: On Wed, Jun 1, 2011 at 12:31 AM, rakesh kumar Hi i have a file which contains data //ACCDJ EXEC DB2UNLDC,DFLID=DFLID,PARMLIB=PARMLIB, //