[EMAIL PROTECTED] wrote:
> Thomas Liesner wrote:
>> Hi all,
>>
>> I have a text file which contains a single string with names.
>> I want to split this string into its records and put them into a list.
>> In "normal" cases I would do something like:
>>
>>> #!/usr/bin/python
>>> inp = open("file")
>>> data = inp.read()
>>> names = data.split()
>>> inp.close()
>> The problem is that the names contain spaces and the records are also
>> just separated by spaces. The only thing I can rely on is that the
>> record separator is always more than a single whitespace character.
>>
>> I thought of something like defining the separator for split() by using
>> a regex for "more than one whitespace". The regex for whitespace is \s, but
>> what would I use for "more than one"? \s+?
>>
> Can I just use "two spaces" as the separator?
>
> [ x.strip() for x in data.split("  ") ]
>
If you like, but it will create dummy entries if there are more than two spaces:

>>> data = "Guido van Rossum  Tim Peters    Thomas Liesner"
>>> [ x.strip() for x in data.split("  ") ]
['Guido van Rossum', 'Tim Peters', '', 'Thomas Liesner']

You could add a condition to the listcomp:

>>> [name.strip() for name in data.split("  ") if name]
['Guido van Rossum', 'Tim Peters', 'Thomas Liesner']

But what if there is some other whitespace character?

>>> data = "Guido van Rossum  Tim Peters  \t  Thomas Liesner"
>>> [name.strip() for name in data.split("  ") if name]
['Guido van Rossum', 'Tim Peters', '', 'Thomas Liesner']
>>>

Perhaps a smarter condition?

>>> [name.strip() for name in data.split("  ") if name.strip(" \t")]
['Guido van Rossum', 'Tim Peters', 'Thomas Liesner']

But this is beginning to feel like hard work. I think this is a case where
it's not worth the effort to try to avoid the regexp:

>>> import re
>>> re.split(r"\s{2,}", data)
['Guido van Rossum', 'Tim Peters', 'Thomas Liesner']
>>>

Michael

-- 
http://mail.python.org/mailman/listinfo/python-list
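
One caveat with the regexp approach: leading or trailing whitespace in the
data (such as a final newline when the string is read from a file) can leave
an empty entry or a stray newline in the result. A minimal sketch that strips
the string first is below. The helper name split_records is only an
illustration, not something from the thread, and the sample string is an
assumption about what the file contains.

```python
import re

def split_records(text):
    """Split text into records separated by two or more whitespace characters.

    split_records is a hypothetical helper name used for illustration.
    """
    # Strip first so that leading/trailing whitespace (for example the
    # final newline of a file) does not end up in the result.
    stripped = text.strip()
    if not stripped:
        # re.split() on an empty string would return [''], so return
        # an empty list instead.
        return []
    # \s{2,} matches a run of two or more whitespace characters of any
    # kind (spaces, tabs, newlines); single spaces inside names are kept.
    return re.split(r"\s{2,}", stripped)

# Assumed sample input, mimicking a file's contents with a trailing newline.
data = "Guido van Rossum  Tim Peters  \t  Thomas Liesner\n"
print(split_records(data))
```

With real input, data would come from inp.read() as in the original
question; the rest stays the same.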