Zachary Pincus wrote:
> Specifically, on line 115 in LineSplitter, we have:
>     self.delimiter = delimiter.strip() or None
> so if I pass in, say, '\t' as the delimiter, self.delimiter gets set
> to None, which then causes the default behavior of
> any-whitespace-is-delimiter to be used. This makes lines like
> "Gene Name\tPubMed ID\tStarting Position" get split wrong, even when
> I explicitly pass in '\t' as the delimiter!
>
> Similarly, I believe that some of the tests are formulated wrong:
>     def test_nodelimiter(self):
>         "Test LineSplitter w/o delimiter"
>         strg = " 1 2 3 4  5 # test"
>         test = LineSplitter(' ')(strg)
>         assert_equal(test, ['1', '2', '3', '4', '5'])
>
> I think that treating an explicitly-passed-in ' ' delimiter as
> identical to 'no delimiter' is a bad idea. If I say that ' ' is the
> delimiter, or '\t' is the delimiter, this should be treated *just*
> like ',' being the delimiter, where the expected output is:
>     ['1', '2', '3', '4', '', '5']
>
> At least, that's what I would expect. Treating contiguous blocks of
> whitespace as single delimiters is perfectly reasonable when None is
> provided as the delimiter, but when an explicit delimiter has been
> provided, it strikes me that the code shouldn't try to
> further-interpret it...
>
> Does anyone else have any opinion here?
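For what it's worth, the distinction described above is the same one Python's own str.split makes between sep=None and an explicit separator. Here is a minimal sketch of the proposed behaviour. Note that split_line is a hypothetical helper written only for this illustration; it is not numpy's LineSplitter and it ignores comment handling entirely.

```python
def split_line(line, delimiter=None):
    """Hypothetical helper for illustration; not numpy's LineSplitter."""
    line = line.rstrip("\r\n")
    if delimiter is None:
        # No delimiter given: any run of whitespace separates fields.
        return line.split()
    # Explicit delimiter: split on exactly that string and keep empty
    # fields, even when the delimiter itself is whitespace.
    return [field.strip() for field in line.split(delimiter)]


tabbed = "Gene Name\tPubMed ID\tStarting Position"
spaced = "1 2 3 4  5"  # two spaces between 4 and 5

by_default = split_line(tabbed)     # breaks up "Gene Name" etc.
by_tab = split_line(tabbed, "\t")   # three fields, as intended
by_space = split_line(spaced, " ")  # keeps the empty field

print(by_default)
print(by_tab)
print(by_space)
```

The key point is the `delimiter is None` test: calling delimiter.strip() first turns '\t' or ' ' into an empty string, which is what makes an explicit whitespace delimiter indistinguishable from no delimiter at all.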
I agree. If the user explicitly passes something as a delimiter, we should use it and not try to be too smart.

+1

Ryan

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion