On Sat, Nov 12, 2016 at 12:06 PM Steven D'Aprano <st...@pearwood.info> wrote:
> I consider the need for that to indicate a possibly poor design of
> pandas. Unless there is a good reason not to, I believe that any
> function that requires a list of strings should also accept a single
> space-delimited string instead. Especially if the strings are intended
> as names or labels. So that:
>
>     func(['fe', 'fi', 'fo', 'fum'])
>
> and
>
>     func('fe fi fo fum')
>
> should be treated the same way.

They don't because df['Column Name'] is a valid way to get a single column
worth of data when the column name contains spaces (not encouraged, but it
is valid).

> > mydf = df[ ('field1', 'field2', 'field3') ]
>
> Are your field names usually constants known when you write the script?

Yes. All the time. When I'm on the side of creating APIs for data analysts
to use, I think of the columns abstractly. When they're writing scripts to
analyze data, it's all very explicit and in the domain of the data.

Things like:

    df[df.age > 10]
    adf = df.pivot_table(['runid', 'block'])

are common and the "right" way to do things in the problem domain.

> So not only do we have to learn yet another special kind of string:
>
> - unicode strings
> - byte strings
> - raw strings (either unicode or bytes)
> - f-strings
> - and now w-strings

Very valid point. I also was considering (and rejected) a 'wb' for tuple of
bytes.

> I would prefer a simple, straight-forward rule: it unconditionally
> splits on whitespace. If you need to include non-splitting spaces, use a
> proper non-breaking space \u00A0, or split the words into a tuple by
> hand, like you're doing now. I don't think it is worth complicating the
> feature to support non-splitting spaces.

You're right there. If there are spaces in the columns, make it explicit
and don't use the w''. I withdraw the <backspace><space> "feature". And I
think you're right that all the existing escape rules should work in the
same way they do for regular unicode strings (don't go the raw strings
route).
Basically, w'foo bar' == tuple('foo bar'.split())

> The fact that other languages do something like this is a (weak) point
> in its favour. But I see that there are a few questions on Stackoverflow
> asking what %w means, how it is different from %W, etc. For example:
>
> http://stackoverflow.com/questions/1274675/what-does-warray-mean
>
> http://stackoverflow.com/questions/690794/ruby-arrays-w-vs-w

Well, I'd lean towards not having a W'fields' that does something funky
:-). But your point is well taken.

> ...
> I'm rather luke-warm on this proposal, although I might be convinced to
> support it if:
>
> - w'...' unconditionally split on any whitespace (possibly
>   excluding NBSP);
>
> - and normal escapes worked.
>
> Even then I'm not really convinced this needs to be a language feature.

I'm realizing that much of the reason I keep running into this is that it
seems to be a particular issue with using Python for data science. In some
ways, they're pushing the language a bit beyond what it's designed to do
(the df[ (df.age > 10) & (df.gender=="F")] idiom is amazing and troubling).
Since I'm doing a lot of this, these little language issues loom a bit
larger than they would with "normal" programming.

Thanks for responding.
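
For reference, here is a minimal sketch of the semantics discussed above. It
assumes a hypothetical helper w() standing in for the proposed w'...' literal,
which is only an idea in this thread and not Python syntax. The second helper,
w_keep_nbsp(), is likewise hypothetical and illustrates the "possibly
excluding NBSP" variant: str.split() with no arguments treats \u00A0 as
whitespace, so keeping NBSP inside a name needs an ASCII-only split.

```python
import re


def w(text):
    """Hypothetical stand-in for w'...': split on any whitespace."""
    # str.split() with no arguments splits on every Unicode whitespace
    # character, which includes the non-breaking space \u00A0.
    return tuple(text.split())


def w_keep_nbsp(text):
    """Hypothetical variant: split on ASCII whitespace only."""
    # With re.ASCII, \s matches only [ \t\n\r\f\v], so \u00A0 is kept
    # inside a word. Empty pieces from leading/trailing whitespace are
    # dropped.
    return tuple(p for p in re.split(r"\s+", text, flags=re.ASCII) if p)


fields = w("fe fi fo fum")
# Normal escapes are processed by the ordinary string literal before
# either helper sees the text, so "\t" below is a real tab.
tabbed = w("runid\tblock")
nbsp_split = w("Column\u00a0Name age")
nbsp_kept = w_keep_nbsp("Column\u00a0Name age")
```

With the plain w() helper, a column name containing spaces still has to be
written out by hand as an explicit tuple, which matches the conclusion above.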
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/