On 9/30/10, Steven D'Aprano <[email protected]> wrote:
> On Fri, 1 Oct 2010 08:32:40 am Alex Hall wrote:
>
>> I fully expected to see txt be an array of strings since I figured
>> self.original would have been split on one or more new lines. It
>> turns out, though, that I get this instead:
>> ['l\nvx vy z\nvx vy z']
>
> There's no need to call str() on something that already is a string.
> Admittedly it doesn't do much harm, but it is confusing for the person
> reading, who may be fooled into thinking that perhaps the argument
> wasn't a string in the first place.

Agreed. I was having some (unrelated) trouble and was desperate enough
to start forcing things to the data type I needed, just in case.

> The string split method doesn't interpret its argument as a regular
> expression. r'\n+' has no special meaning here. It's just three literal
> characters: backslash, the letter n, and the plus sign. split() tries to
> split on that substring, and since your data doesn't include that
> combination anywhere, returns a list containing a single item:
>
>>>> "abcde".split("ZZZ")
> ['abcde']

Yes, that makes sense.

>> How is it that txt is not an array of the lines in the file, but
>> instead still holds \n characters? I thought the manual said read()
>> returns a string:
>
> It does return a string. It is a string including the newline
> characters.
>
> [...]
>> I know I can use f.readline(), and I was doing that before and it all
>> worked fine. However, I saw that I was reading the file twice and, in
>> the interest of good practice if I ever have this sort of project
>> with a huge file, I thought I would try to be more efficient and read
>> it once.
>
> You think that keeping a huge file in memory *all the time* is more
> efficient?

Ah, I see what you mean now. I work with the data later, so you are
saying that it would be better to just read the file as necessary, and
then, when I need the file's data later, just read it again.
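For the archives, here is a small sketch of the split behaviour Steven
describes, using a sample string shaped like my data:

```python
import re

text = "l\nvx vy z\nvx vy z"

# str.split treats its argument as a literal substring, not a regex.
# r'\n+' is the four characters backslash, n, plus -- no match, so the
# whole string comes back as a single list item:
print(text.split(r'\n+'))      # ['l\nvx vy z\nvx vy z']

# re.split actually interprets the pattern, splitting on one or more
# newline characters:
print(re.split(r'\n+', text))  # ['l', 'vx vy z', 'vx vy z']

# For the common case, splitlines() is the idiomatic shortcut:
print(text.splitlines())       # ['l', 'vx vy z', 'vx vy z']
```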
> It's the other way around -- when dealing with *small* files
> you can afford to keep it in memory. When dealing with huge files, you
> need to re-write your program to deal with the file a piece at a time.
> (This is often a good strategy for small files as well, but it is
> essential for huge ones.)
>
> Of course, "small" and "huge" are relative to the technology of the day.
> I remember when 1MB was huge. These days, huge would mean gigabytes.
> Small would be anything under a few tens of megabytes.
>
> --
> Steven D'Aprano
> _______________________________________________
> Tutor maillist - [email protected]
> To unsubscribe or change subscription options:
> http://mail.python.org/mailman/listinfo/tutor
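In case it helps anyone searching the archives, here is a minimal
sketch of the "a piece at a time" approach: iterating over the file
object yields one line per loop instead of slurping the whole file with
read(). io.StringIO stands in for a real file so the example is
self-contained:

```python
import io

# Pretend this is a large file on disk; a real open() call would
# behave the same way when iterated.
fake_file = io.StringIO("l\nvx vy z\nvx vy z\n")

# Only one line lives in memory at a time, however big the file is.
lines = []
for line in fake_file:
    lines.append(line.rstrip("\n"))  # process each line as it arrives

print(lines)  # ['l', 'vx vy z', 'vx vy z']
```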
--
Have a great day,
Alex (msg sent from GMail website)
[email protected]; http://www.facebook.com/mehgcap
