Charles Hartman wrote:
> On Jun 1, 2005, at 10:33 PM, Matthew S-H wrote:
>> list[currentWord:currentWord + 1] = [word[:-1], word[-1]]
>
> You start with a list of strings, but your code replaces one (or more)
> of them, not with a different string or two strings, but with a tuple
> whose elements are two strings. The comma is what does that.

Well, no. It's a 2-element list, not a tuple (the [ ] make it a list),
and he's assigning it to a slice, which should work:

    >>> l = [1,2,3,4]
    >>> l[2:3] = [5,6]
    >>> l
    [1, 2, 5, 6, 4]
    >>>

So what is going on? I wrote a little test, and inserted a print statement:

    import string

    ##Separates words with punctuation into 2 separate words.
    def puncSep(list):
        currentWord = -1
        for word in list:
            currentWord = currentWord + 1
            print currentWord, word
            if word[-1] in string.punctuation:
                ## list = list[:currentWord] + [word[0:-1], word[-1]] + list[currentWord + 1:]
                list[currentWord:currentWord + 1] = [word[:-1], word[-1]]
                currentWord = currentWord + 1
        return list

    #L = "This is a sentence.".split()
    L = ["Word?"]
    print L
    print puncSep(L)

Running it gave me:

    [EMAIL PROTECTED] junk $ ./piglatin.py
    ['Word?']
    0 Word?
    2 ?
    4
    Traceback (most recent call last):
      File "./piglatin.py", line 113, in ?
        print puncSep(L)
      File "./piglatin.py", line 103, in puncSep
        if word[-1] in string.punctuation:
    IndexError: string index out of range

Same error, but I got a hint: the last "word" is an empty string, which
is why you got the IndexError. So I added another print statement:

    print "adding:", [word[:-1], word[-1]]
    list[currentWord:currentWord + 1] = [word[:-1], word[-1]]

    [EMAIL PROTECTED] junk $ ./piglatin.py
    ['Word?']
    0 Word?
    adding: ['Word', '?']
    2 ?
    adding: ['', '?']
    4
    Traceback (most recent call last):
      File "./piglatin.py", line 114, in ?
        print puncSep(L)
      File "./piglatin.py", line 103, in puncSep
        if word[-1] in string.punctuation:
    IndexError: string index out of range

So you are adding an empty string.
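Where the '' itself comes from can be checked in isolation. This is a small sketch in Python 3 syntax (the code in this thread is Python 2), and split_last is a hypothetical helper name used only for illustration. It shows that slicing off the last character of a one-character string leaves an empty string, and that indexing an empty string raises the IndexError seen in the traceback. It does not explain why "?" gets processed as a word in the first place.

```python
# Python 3 sketch; split_last is a hypothetical helper, not from the thread.
def split_last(word):
    """Return [everything but the last character, the last character]."""
    return [word[:-1], word[-1]]

print(split_last("Word?"))  # ['Word', '?']
print(split_last("?"))      # ['', '?']  <- the empty string in the output

# An empty string has no last character, so word[-1] fails.
try:
    split_last("")
except IndexError as exc:
    print("IndexError:", exc)  # IndexError: string index out of range
```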
I'm not totally sure why yet, but I see a common trip-up in this code:

Never alter a list while iterating through it with a for loop!

(Actually, it's not never, but don't do it unless you know what you are
doing.) I'll check if this is the problem by printing the list as we go:

    print "adding:", [word[:-1], word[-1]]
    list[currentWord:currentWord + 1] = [word[:-1], word[-1]]
    print "the list is now:", list

and we get:

    [EMAIL PROTECTED] junk $ ./piglatin.py
    ['Word?']
    0 Word?
    adding: ['Word', '?']
    the list is now: ['Word', '?']
    2 ?
    adding: ['', '?']
    the list is now: ['Word', '?', '', '?']
    4
    Traceback (most recent call last):
      File "./piglatin.py", line 115, in ?
        print puncSep(L)
      File "./piglatin.py", line 103, in puncSep
        if word[-1] in string.punctuation:
    IndexError: string index out of range

So what happened? The iteration started with the one word in the list:
"Word?". Then that was replaced by two words: ["Word", "?"]. Now this
list has two elements, so the iteration continues, and the next word in
the list is "?". It gets replaced by ["", "?"]... whoops, that's not
supposed to happen!

So what's the solution?
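As an aside before the fixes, the mechanism just described can be reproduced in a few lines. This is a sketch in Python 3 syntax (the thread's code is Python 2), and trace_iteration is a hypothetical helper written only for this illustration. It grows the list once inside the loop and records which items the for loop actually visits.

```python
# Python 3 sketch; trace_iteration is a hypothetical helper, not from the thread.
def trace_iteration(words):
    """Return the items a for loop visits while the list grows under it."""
    seen = []
    for i, word in enumerate(words):
        seen.append(word)
        if word == "Word?":
            # Replace one item with two; the list is now one element longer.
            words[i:i + 1] = ["Word", "?"]
    return seen

words = ["Word?"]
visited = trace_iteration(words)
print(visited)  # ['Word?', '?']
print(words)    # ['Word', '?']
```

The list started with one item, yet the loop body ran twice: the for loop looks up the next position in the list as it currently stands, so it also visits the "?" that the loop body itself inserted.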
Two options:

1) Make sure you only iterate through the original number of items in
the list. Replace:

    for word in list:
        currentWord = currentWord + 1

with:

    while currentWord < len(list)-1:
        currentWord = currentWord + 1
        word = list[currentWord]

That's a bit ugly, so I'd rather start the index at 0 and move the
increment to the end of the while block:

    def puncSep(list):
        currentWord = 0
        while currentWord < len(list):
            word = list[currentWord]
            if word[-1] in string.punctuation:
                list[currentWord:currentWord + 1] = [word[:-1], word[-1]]
                currentWord += 1
            currentWord += 1
        return list

2) Another option, and one I'd probably do, is to create a new list,
rather than altering the one you have in place:

    def puncSep(list):
        currentWord = 0
        newList = []
        for currentWord, word in enumerate(list):
            if word[-1] in string.punctuation:
                newList.extend([word[:-1], word[-1]])
            else:
                newList.append(word)
        return newList

But what if there is a punctuation mark by itself? (Which I suppose is
a syntax error in the input, but it's probably best to check for it.)

    before: ['two', 'words', '?']
    after: ['two', 'words', '', '?']

It adds an empty string, which you don't want, so check the length of
the word first:

    if len(word) > 1 and word[-1] in string.punctuation:

    before: ['two', 'words', '?']
    after: ['two', 'words', '?']

There, that's fixed it. Now, a few style issues:

1) Don't use "import *": you can get name clashes, and it's hard to know
   where stuff comes from when you look at your code later.

2) As pointed out, don't use "list" as a variable name.

3) Use enumerate if you need to loop through a list and keep track of
   the index, though you don't need to anymore here.

4) Minor point, but I'm not sure there's much point in using
   list.extend() when you are creating the list in the argument anyway,
   so I've just used two append()s.

Here's my version now:

    import string

    ##Separates words with punctuation into 2 separate words.
    def puncSep(oldList):
        newList = []
        for word in oldList:
            if len(word) > 1 and word[-1] in string.punctuation:
                newList.append(word[:-1])
                newList.append(word[-1])
            else:
                newList.append(word)
        return newList

    L = "This is a sentence. Is this another? Here is one with a lone punctuation mark .".split()
    #L = ["Word?"]
    #L = ["two","words", "?"]
    print "before:", L
    print "after:", puncSep(L)

By the way, this cries out for unit testing of some sort. Read up about
it in "Dive Into Python" in print or on the web.

For an additional challenge, could you do this with list comprehensions?

-Chris

> Then (I think) you're expecting the size of your list to adjust itself,
> so that the index of the last element will be one larger than it was
> before the substitution. But the list's length -- the number of
> elements in the list -- hasn't changed; it's just that one of them has
> been replaced with a tuple, a different kind of object from a string.
>
> One easy (not necessarily efficient) way to revise it would be (off the
> top of my head without testing)
>
>     list[currentWord] = word[:-1]
>     list.insert(currentWord + 1, word[-1])
>
> (though there are more elegant ways to do it without using the
> currentWord indexing variable).
>
> By the way, "list" is an operator in Python (it turns its argument into
> a list), so it's a bad idea to use that as the name of a variable.
>
> Charles Hartman

-- 
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[EMAIL PROTECTED]
_______________________________________________
Pythonmac-SIG maillist  -  Pythonmac-SIG@python.org
http://mail.python.org/mailman/listinfo/pythonmac-sig