Re: Using namedtuples field names for column indices in a list of lists

2017-01-12 Thread Michael Torrie

On 01/12/2017 02:26 AM, Deborah Swanson wrote:
> It's true, I've only been on this list a few weeks, although I've seen
> and been on the receiving end of the kind of "help" that feels more like
> being sneered at than help. Not on this list, but on Linux and similar
> lists. There does seem to be a "tough love" approach to "helping"
> people, and I haven't seen that it really helped that much, in other
> places that I've seen it in action over a period of time.  

If you go down a wrong path, people are going to try to warn you.  For
example, you were told several times, no that object really is a list,
yet you argued with them on that point for several posts.  Tough love or
sneering? No, absolutely not.  Communication difficulties?  Yes!  But
not even close to wholly the fault of those who were trying to assist
you. If you haven't been helped, it's not for lack of their trying.

> I'm willing
> though to just see how it works on this list. Since I've been here, I
> haven't seen people come back who get that kind of approach, but a few
> weeks is too short a time to draw conclusions. Still, when people who
> need help don't come right back, that should be a first clue that they
> didn't get it.

Fortunately this list is pretty friendly and open to newbies.  And
long-time posters have no problem telling other long-time posters when
they do cross the line into bullying or trolling territory.

Fortunately I've seen nothing but people wanting to help you with your
ventures in Python programming responding to your queries, which I'm
gratified to see.  And many many newbies have been helped in their
explorations of Python land.
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Using namedtuples field names for column indices in a list of lists

2017-01-12 Thread Chris Angelico

On Thu, Jan 12, 2017 at 9:27 PM, Marko Rauhamaa  wrote:
> An instructive anecdote: somebody I know told me he once needed the
> definitive list of some websites. He posted a question on the relevant
> online forum, but it fell on deaf ears. After some days, he replied to
> his own posting saying he had found the list and included the list in
> his reply. He knew the list was probably not current, but it served its
> purpose: right away he got an angry response pointing out his list was
> completely wrong with a link to the up-to-date list.

There's a reason for that. Inaccurate information is worse than none
at all, because it's indistinguishable from accurate information
without deep analysis. Also, often someone won't know the correct
answer, but will recognize a wrong answer. In correcting the record,
you come a bit closer to the truth. Sometimes all it takes is one
wrong answer, and the discussion kicks off.

ChrisA

Re: Using namedtuples field names for column indices in a list of lists

2017-01-12 Thread Marko Rauhamaa

"Deborah Swanson" :

> I've only been on this list a few weeks, although I've seen and been
> on the receiving end of the kind of "help" that feels more like being
> sneered at than help. Not on this list, but on Linux and similar
> lists. There does seem to be a "tough love" approach to "helping"
> people,

An instructive anecdote: somebody I know told me he once needed the
definitive list of some websites. He posted a question on the relevant
online forum, but it fell on deaf ears. After some days, he replied to
his own posting saying he had found the list and included the list in
his reply. He knew the list was probably not current, but it served its
purpose: right away he got an angry response pointing out his list was
completely wrong with a link to the up-to-date list.


Marko

RE: Using namedtuples field names for column indices in a list of lists

2017-01-12 Thread Deborah Swanson

Antoon Pardon wrote, on January 12, 2017 12:49 AM
> 
> Op 11-01-17 om 23:57 schreef Deborah Swanson:
> >
> >> What are we supposed to do when somebody asks a question based on an
> >> obvious mistake? Assume that they're a quick learner who has probably
> >> already worked out their mistake and doesn't need an answer? That
> >> would certainly make our life easier: we can just ignore everybody's
> >> questions.
> > No, of course not. My advice to people who want to help is to not
> > assume that you know what the question asker does and doesn't know,
> > and just answer the questions without obsessing about what they know.
> 
> With all respect, such an answer betrays not much experience 
> on this list. Half the time answering in this way would mean 
> making very little progress in actually helping the person. 
> There is an important difference in trying to help someone 
> and just answering his questions. And your advice may be the 
> best way to help someone like you but not everyone is like 
> you. A lot of people have been helped by a remark that didn't 
> answer the question.

It's true, I've only been on this list a few weeks, although I've seen
and been on the receiving end of the kind of "help" that feels more like
being sneered at than help. Not on this list, but on Linux and similar
lists. There does seem to be a "tough love" approach to "helping"
people, and I haven't seen that it really helped that much, in other
places that I've seen it in action over a period of time.  I'm willing
though to just see how it works on this list. Since I've been here, I
haven't seen people come back who get that kind of approach, but a few
weeks is too short a time to draw conclusions. Still, when people who
need help don't come right back, that should be a first clue that they
didn't get it.
 
> >  If that's
> > impossible because they have something so wrong that you don't know 
> > what they're asking, that would be a good time to point it out and 
> > give them a chance to correct it.
> 
> It is rarely a question of impossibility. It often enough is 
> a sense that the person asking the question is approaching 
> the problem from the wrong side. Often enough that sense is 
> correct, often enough that sense is wrong. All the 
> participants can do is take a clue from the question and then 
> guess what response would help this person best.

This sounds right to me. Any list or forum that fields questions has the
problem of understanding the questioner who's writing in. Any strategy
that bridges that gap seems like a good one to me.

> Nobody can expect that this list will treat their questions 
> in a way that suits their personal style.
> 
> -- 
> Antoon Pardon
> 
> 
Oh, I'm sure that's true, though I do think more direct question asking
and answering is always helpful. Communication in lists and forums is
somewhat in the dark, because there's so little context in many of the
conversations. Questions (and waiting for the answers before responding)
are an excellent way to fill in some of the dark spaces.


Re: Using namedtuples field names for column indices in a list of lists

2017-01-12 Thread Antoon Pardon

Op 11-01-17 om 23:57 schreef Deborah Swanson:
>
>> What are we supposed to do when somebody asks a question based on an
>> obvious mistake? Assume that they're a quick learner who has probably 
>> already worked out their mistake and doesn't need an answer? That 
>> would certainly make our life easier: we can just ignore everybody's 
>> questions.
> No, of course not. My advice to people who want to help is to not assume
> that you know what the question asker does and doesn't know, and just
> answer the questions without obsessing about what they know.

With all respect, such an answer betrays not much experience on this
list. Half the time answering in this way would mean making very little
progress in actually helping the person. There is an important difference
in trying to help someone and just answering his questions. And your
advice may be the best way to help someone like you but not everyone
is like you. A lot of people have been helped by a remark that didn't
answer the question.

>  If that's
> impossible because they have something so wrong that you don't know what
> they're asking, that would be a good time to point it out and give them
> a chance to correct it.

It is rarely a question of impossibility. It often enough is a sense
that the person asking the question is approaching the problem from
the wrong side. Often enough that sense is correct, often enough that
sense is wrong. All the participants can do is take a clue from the
question and then guess what response would help this person best.

Nobody can expect that this list will treat their questions in a way
that suits their personal style.

-- 
Antoon Pardon



RE: Using namedtuples field names for column indices in a list of lists

2017-01-11 Thread Deborah Swanson

Steven D'Aprano wrote, on January 10, 2017 6:19 PM
> 
> On Tuesday 10 January 2017 18:14, Deborah Swanson wrote:
> 
> > I'm guessing that you (and those who see things like you do) might 
> > not be used to working with quick learners who make mistakes at 
> > first but catch up with them real fast, or you're very judgemental 
> > about people who make mistakes, period. I certainly don't care if 
> > you want to judge me, you're entitled to your opinion.
> 
> Be fair. We're not mind-readers, and we're trying to help you, not 
> attack you.

You aren't, and I've never seen you as attacking me, but there are a few
others here who have attacked me, and they weren't trying to help me
either. But it's only a very few people who've harassed me, and the
person I was writing to above wasn't one of them.

> It is true that we're often extremely pedantic. Sometimes annoyingly
> so, but on the other hand precision is important, especially in
> technical fields. There's a *huge* difference between an MTA and an
> MUA, even though they differ only by one letter and both are related
> to email.
>
> One of the techs I work with has the same habit of correcting me, and
> when I'm invariably forced to agree that he is technically correct, he
> replies "technical correctness is the best kind of correctness".
> Annoying as it is to be on the receiving end, he is right: at least
> half the time I learn something from his pedantry. (The other half of
> the time, it's a difference that makes no difference.)

I'm sorry you have to work with someone like that, he sounds perfectly
awful. (But that doesn't give you license to do the same. You're better
than that.)

> What are we supposed to do when somebody asks a question based on an
> obvious mistake? Assume that they're a quick learner who has probably 
> already worked out their mistake and doesn't need an answer? That 
> would certainly make our life easier: we can just ignore everybody's 
> questions.

No, of course not. My advice to people who want to help is to not assume
that you know what the question asker does and doesn't know, and just
answer the questions without obsessing about what they know. If that's
impossible because they have something so wrong that you don't know what
they're asking, that would be a good time to point it out and give them
a chance to correct it. 

> Sometimes people make errors of terminology that don't affect the
> semantics of what they're asking:
>
> "I have an array [1, 2, 3, 4] and I want to ..."
>
> It's very likely that they have a list and they're merely using a term
> they're familiar with from another language, rather than the array
> module.
>
> But what are we supposed to make of mistakes that *do* affect the
> semantics of the question:
>
> "I have a dict of ints {1, 2, 3, 4} and want to sort the values by
> key so that I get [4, 2, 3, 1], how do I do that?"
>
> How can we answer that without correcting their misuse of terminology
> and asking for clarification of what data structure they *actually*
> have?
>
> We get questions like this very frequently. How would you answer it?

Well, I'd tell them how to reverse sort a dictionary, and point out that
what they've given isn't a dictionary because it doesn't have any keys,
and on this occasion I'd just give them an example of what a dictionary
looks like (as part of showing them how to reverse sort it) with its
keys, including the curly braces, and see what they come back with. They
pretty clearly mean a dictionary and not a list, since they said "dict",
used curly braces, and said they want to "sort the values by key" in the
first clause of their sentence. So they're just a little confused and
maybe absent-mindedly slipping back into the more familiar list notation
and concepts, or they don't exactly know what a dictionary is. I
wouldn't belabor the point at this time, unless they keep coming back
with the same issues. That would be the time to belabor it, in my
opinion.  When some people are learning it's hard to keep all the new
things firmly in mind and not confuse them with more familiar things
they already know. But if they get it all straight in reasonably good
time, that should be ok. 
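
A minimal sketch of the point under discussion, assuming a made-up
dictionary (the values "a" to "d" are invented for illustration and are
not from the thread). It shows that the literal {1, 2, 3, 4} is a set
rather than a dict, and one way to list a dict's values by descending
key:

```python
# The literal {1, 2, 3, 4} has no key/value pairs, so Python builds a set.
maybe_dict = {1, 2, 3, 4}
print(type(maybe_dict).__name__)          # set

# A hypothetical dict with int keys; the values are invented for this sketch.
data = {1: "a", 2: "b", 3: "c", 4: "d"}

# Walk the keys in descending order and collect the matching values.
values_by_key_desc = [data[k] for k in sorted(data, reverse=True)]
print(values_by_key_desc)                 # ['d', 'c', 'b', 'a']
```

Whether "reverse sort by key" is what the hypothetical asker wanted is
itself a guess, which is why asking which data structure they actually
have comes up in the thread.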

> Or:
> 
> "I have a list l = namedtuple('l', 'field1 field2') and can't 
> extract fields by index, l[0] doesn't work..."
> 
> Of course it doesn't work. How would you respond to that if you were
> in our place?
> 
> - ignore the question because the poster is ever so smart and will have
>   worked it out by now?
> 
> - point out that l is not a list, or even a tuple, and of course l[0] 
>   doesn't work because l is actually a class?

I'd go with some variant of option 2, depending on how well I knew the
person asking. If the asker had just dropped in out of the blue and I
knew nothing about them I'd say something like "You can't because 'l'
isn't a list." Then I'd try to gauge how useful it would be to them to
know exactly what 'l' is, but most likely

Re: Using namedtuples field names for column indices in a list of lists

2017-01-11 Thread BartC


On 11/01/2017 02:18, Steven D'Aprano wrote:
> On Tuesday 10 January 2017 18:14, Deborah Swanson wrote:
>
>> I'm guessing that you (and those who
>> see things like you do) might not be used to working with quick learners
>> who make mistakes at first but catch up with them real fast, or you're
>> very judgemental about people who make mistakes, period. I certainly
>> don't care if you want to judge me, you're entitled to your opinion.
>
> Be fair. We're not mind-readers, and we're trying to help you, not attack you.
>
> It is true that we're often extremely pedantic. Sometimes annoyingly so, but on
> the other hand precision is important, especially in technical fields. There's
> a *huge* difference between an MTA and an MUA, even though they differ only by
> one letter and both are related to email.


There's a bigger difference between USB and USA!

--
Bartc



RE: Using namedtuples field names for column indices in a list of lists

2017-01-10 Thread Steven D'Aprano

On Tuesday 10 January 2017 18:14, Deborah Swanson wrote:

> I'm guessing that you (and those who
> see things like you do) might not be used to working with quick learners
> who make mistakes at first but catch up with them real fast, or you're
> very judgemental about people who make mistakes, period. I certainly
> don't care if you want to judge me, you're entitled to your opinion.

Be fair. We're not mind-readers, and we're trying to help you, not attack you.

It is true that we're often extremely pedantic. Sometimes annoyingly so, but on
the other hand precision is important, especially in technical fields. There's
a *huge* difference between an MTA and an MUA, even though they differ only by
one letter and both are related to email.

One of the techs I work with has the same habit of correcting me, and when I'm
invariably forced to agree that he is technically correct, he replies "technical
correctness is the best kind of correctness". Annoying as it is to be on the
receiving end, he is right: at least half the time I learn something from his
pedantry. (The other half of the time, it's a difference that makes no
difference.)

What are we supposed to do when somebody asks a question based on an obvious 
mistake? Assume that they're a quick learner who has probably already worked 
out their mistake and doesn't need an answer? That would certainly make our 
life easier: we can just ignore everybody's questions.

Sometimes people make errors of terminology that don't affect the semantics
of what they're asking:

"I have an array [1, 2, 3, 4] and I want to ..."

It's very likely that they have a list and they're merely using a term they're 
familiar with from another language, rather than the array module.

But what are we supposed to make of mistakes that *do* affect the semantics of 
the question:

"I have a dict of ints {1, 2, 3, 4} and want to sort the values by 
key so that I get [4, 2, 3, 1], how do I do that?"

How can we answer that without correcting their misuse of terminology and 
asking for clarification of what data structure they *actually* have?

We get questions like this very frequently. How would you answer it?

Or:

"I have a list l = namedtuple('l', 'field1 field2') and can't 
extract fields by index, l[0] doesn't work..."

Of course it doesn't work. How would you respond to that if you were in our 
place?

- ignore the question because the poster is ever so smart and will have
  worked it out by now?

- point out that l is not a list, or even a tuple, and of course l[0] 
  doesn't work because l is actually a class?

It's easy to criticise us for answering the questions you ask instead of the
questions you intended, but we're not mind-readers. We don't know what you're 
thinking, we only know what you communicate to us. Telling us off for answering 
your questions is hardly likely to encourage us to answer them in the future.
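
A short sketch of the second example above, assuming nothing beyond the
standard library (the sample values 'spam' and 'eggs' are made up). It
shows that namedtuple() returns a class, and that indexing works on an
instance of that class. What subscripting the class itself does depends
on the Python version, so the sketch does not rely on it:

```python
from collections import namedtuple

# namedtuple() is a factory: it returns a new class, not a list or a tuple.
l = namedtuple('l', 'field1 field2')
print(isinstance(l, type))        # True: l is a class
print(issubclass(l, tuple))       # True: its instances are tuples

# Indexing and field access work on an instance of that class.
row = l('spam', 'eggs')           # sample values invented for this sketch
print(row[0], row.field1)         # spam spam
```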

-- 
Steven
"Ever since I learned about confirmation bias, I've been seeing 
it everywhere." - Jon Ronson


Re: Using namedtuples field names for column indices in a list of lists

2017-01-10 Thread Ethan Furman


On 01/09/2017 11:14 PM, Deborah Swanson wrote:

> So I guess you should just do your thing and I'll do mine.

As you say.

> Takes all kinds, and I think in the end what will count is the quality of my
> finished work (which has always been excellent), and not the messy
> process to get there.

Agreed.

--
~Ethan~

RE: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Deborah Swanson

Ethan Furman wrote, on January 09, 2017 10:06 PM
> 
> On 01/09/2017 08:51 PM, Deborah Swanson wrote:
> > Ethan Furman wrote, on January 09, 2017 8:01 PM
> 
> >> As I said earlier, I admire your persistence -- but take some time
> >> and learn the basic vocabulary as that will make it much easier for
> >> you to ask questions, and for us to give you meaningful answers.
> >
> > As I mentioned, I have completed MIT's 2 introductory Python courses
> > with final grades of 98% and 97%.  What tutorials do you think would
> > significantly add to that introduction?
> 
> The Python version of "Think like a computer scientist" is 
> good.  Otherwise, ask the list for recommendations.  I'm not 
> suggesting more advanced topics, but rather basic topics such 
> as how the REPL works, how to tell what objects you have, how 
> to find the methods those objects have, etc.

I'm working on all of that, and I've been taking physics, chem, computer
science and theoretical mathematics (straight 4.0s my last 2 years,
graduated summa cum laude), but it's been a couple of decades since my
last brick building university course. It's coming back fast, but
there's a lot I'm still pulling out of cobwebs. I basically know the
tools to use for the things you mention, but in the online coursework
I've done in the past year, we didn't need to use them because they told
us everything. So that's another thing I'm catching up on. It really
shouldn't take too long; it's not that complicated or difficult.

> > It's true that I didn't spend much time in the forums while I was
> > taking those courses, so this is the first time I've talked with
> > people about Python this intensively. But I'm a good learner and I'm
> > picking up a lot of it pretty quickly. People on the list also talk
> > and comprehend differently than people in the MIT courses did, so I
> > have to become accustomed to this as well. And the only place to
> > learn that is right here.
> 
> Indeed.
> 
> The issue I (and others) see, though, is more along the lines 
> of basic understanding: you seemed to think that a list of 
> lists should act the same as a list of tuples, even though 
> lists and tuples are not the same thing.  It's like expecting 
> a basket of oranges to behave like a basket of avocados. ;)
> 
> As you say, you're making good progress.
> 
> --
> ~Ethan~

I'm sorry, I didn't think a list of namedtuples would be like a list of
lists when I wrote to Erik just a bit ago today, but I sort of did when
I first started trying to use them a couple days ago. And to the extent
I was pretty sure they weren't the same, I still didn't know in what
ways they were different. So when people ask me questions about why I
did things the way I did, I try to explain that I didn't know certain
things then, but I know them now. I'm guessing that you (and those who
see things like you do) might not be used to working with quick learners
who make mistakes at first but catch up with them real fast, or you're
very judgemental about people who make mistakes, period. I certainly
don't care if you want to judge me, you're entitled to your opinion. One
of my MIT professors was always screwing up little things, even things
he knew inside out. Some people are like that, and I think I might be
one of them, although I'm really not as bad as he was.

So I guess you should just do your thing and I'll do mine. I don't
promise that I'll ever get to the point where I never set a foot wrong.
I know some people can do that, and I hope someday I will. But then I
remember the MIT professor, who I really learned a lot from, despite all
his flub-ups, and that I might be a little bit like him. Takes all
kinds, and I think in the end what will count is the quality of my
finished work (which has always been excellent), and not the messy
process to get there.

It shocked me the first time someone on this list jumped down my throat
for a momentary lapse on my part, as if I was a total idiot and knew
nothing, but I'm sort of getting used to that too. It must be nice to be
so perfect, I guess.

Deborah


Re: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Ethan Furman


On 01/09/2017 08:51 PM, Deborah Swanson wrote:
> Ethan Furman wrote, on January 09, 2017 8:01 PM
>
>> As I said earlier, I admire your persistence -- but take some
>> time and learn the basic vocabulary as that will make it much
>> easier for you to ask questions, and for us to give you
>> meaningful answers.
>
> As I mentioned, I have completed MIT's 2 introductory Python courses
> with final grades of 98% and 97%.  What tutorials do you think would
> significantly add to that introduction?

The Python version of "Think like a computer scientist" is good.  Otherwise,
ask the list for recommendations.  I'm not suggesting more advanced topics,
but rather basic topics such as how the REPL works, how to tell what objects
you have, how to find the methods those objects have, etc.

> It's true that I didn't spend much time in the forums while I was taking
> those courses, so this is the first time I've talked with people about
> Python this intensively. But I'm a good learner and I'm picking up a lot
> of it pretty quickly. People on the list also talk and comprehend
> differently than people in the MIT courses did, so I have to become
> accustomed to this as well. And the only place to learn that is right
> here.


Indeed.

The issue I (and others) see, though, is more along the lines of basic 
understanding: you seemed to think that a list of lists should act the same as 
a list of tuples, even though lists and tuples are not the same thing.  It's 
like expecting a basket of oranges to behave like a basket of avocados. ;)

As you say, you're making good progress.

--
~Ethan~

RE: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Deborah Swanson

Erik wrote, on January 09, 2017 8:06 PM
> 
> On 10/01/17 03:02, Deborah Swanson wrote:
> > Erik wrote, on January 09, 2017 5:47 PM
> >> IIRC, you create it using a list comprehension which creates the 
> >> records. A list comprehension always creates a list.
> >
> > Well no. The list is created with:
> >
> > records.extend(Record._make(row) for row in rows)
> 
> No, the list is _extended_ by that code. The list is _created_ with a
> line that will say something like "records = []" or "records = list()"
> (or "records = ").

The list was created with this code:

infile = open("E:\\Coding projects\\Pycharm\\Moving\\Moving 2017 in.csv")
rows = csv.reader(infile)
fieldnames = next(rows)
Record = namedtuple("Record", fieldnames)
records = [Record._make(fieldnames)]
records.extend(Record._make(row) for row in rows)

I just pulled out the .extend statement to show you because that's what
looks like a list comprehension, but turns out not to be one.  We still
get a list though, on that we agree.  ;)
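
A self-contained sketch of the same steps, assuming an in-memory
stand-in for the CSV file (the column names and rows below are invented,
since the real file's contents are not shown in the thread):

```python
import csv
import io
from collections import namedtuple

# Stand-in for the real CSV file; the columns and rows are invented.
sample = io.StringIO("Location,Price\nTulsa,100\nAustin,250\n")

rows = csv.reader(sample)
fieldnames = next(rows)                      # the header row
Record = namedtuple("Record", fieldnames)

# As in the post, the header row is kept as the first record.
records = [Record._make(fieldnames)]
records.extend(Record._make(row) for row in rows)

print(type(records).__name__)     # list
print(records[1].Location)        # Tulsa
```

When reading from a real file, the csv module documentation recommends
opening it with newline='' and a "with" block closes it afterwards.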
 
> It's nice to see you agree that it's a list though. Oh, hold on ... ;)
> 
> > I'm not exactly
> > sure if this statement is a list comprehension.
> 
> No, it's not. I was remembering an old message where someone suggested
> using the _make() method and that was expressed as a list comprehension.
>
> What you have there is a call to list.extend() passing a _generator_
> comprehension as its only parameter (which in this case you can consider
> to be equivalent to a list comprehension as all of the data are
> generated greedily). You see that I said "list.extend()". That's because
> 'records' is a list.
> 
> >--> type(records)
> > <class 'list'>
>
> Yes, it's an instance of the list class. A list object. A list.
>
>  >>> type(list())
> <class 'list'>
>  >>> type([])
> <class 'list'>
>  >>> class foo: pass
> ...
>  >>> type(foo())
> <class '__main__.foo'>
>  >>>
>
> ... type() will tell you what class your object is an instance of.
> "<class 'list'>" tells you that your object is a list.
> 
> > And it behaves like a list sometimes, but many times
> > not.
> 
> I think that's impossible. I'm 100% sure it's a list. Please give an 
> example of when 'records' does not behave like a list.

I gave an example in one of my last two replies to other people. The
thing is that it's a list, but it's not a list of lists. It's a list of
namedtuples, and the non-listlike behaviors appear when I'm directly
working with the namedtuples.

> > The only thing I don't think you have 100% correct is your assertion
> > that records is a list.
> 
> It's a list.

I agree, now. 
 
> > But that's just a quibble. The important thing in this context is that
> > both .sort() and sorted() treat it like a list and DTRT.
> 
> That's because it's a list :)
> 
> E.

It is!  A list of namedtuples that is, not a list of lists. sorted() and
.sort work because they only interact with the outer data structure,
which is a list.
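
A minimal sketch of that point, assuming a hypothetical Record type with
invented fields and values. Both sorted() and list.sort() operate on the
outer list, and the key function reads a named field from each record:

```python
from collections import namedtuple
from operator import attrgetter

Record = namedtuple("Record", "Location Price")   # hypothetical fields
records = [Record("Tulsa", 300), Record("Austin", 250), Record("Boise", 275)]

# sorted() returns a new list ordered by the Price field.
by_price = sorted(records, key=attrgetter("Price"))
print([r.Location for r in by_price])     # ['Austin', 'Boise', 'Tulsa']

# list.sort() reorders the outer list in place, here by Location.
records.sort(key=attrgetter("Location"))
print(records[0].Location)                # Austin
```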


RE: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Deborah Swanson

Ethan Furman wrote, on January 09, 2017 8:01 PM
> 
> On 01/09/2017 07:02 PM, Deborah Swanson wrote:
> > Erik wrote, on January 09, 2017 5:47 PM
> 
> >> As people keep saying, the object you have called 'records' is a 
> >> *list* of namedtuple objects. It is not a namedtuple.
> >>
> >> IIRC, you create it using a list comprehension which creates the 
> >> records. A list comprehension always creates a list.
> >
> > Well no. The list is created with:
> >
> > records.extend(Record._make(row) for row in rows)
> >
> > I'm new to both namedtuples and list comprehensions, so I'm not
> > exactly sure if this statement is a list comprehension. It looks like
> > it could be. In any case I recreated records in IDLE and got
> >
> >--> type(records)
> > <class 'list'>
> >
> > So it's a class, derived from list? (Not sure what the 'list' means.)
> 
> On the one hand, Deborah, I applaud your perseverance.  On
> the other, it seems as if you are trying to run before you can
> walk.  I know tutorials can be boring, but you really should
> go through one so you have a basic understanding of the fundamentals.

I actually have had a solid foundation of study in 2 terms of MIT's
introductory Python courses. But they can't cover everything in that
short a time.

> Working in the REPL (the python console), we can see:
> 
> Python 3.4.0 (default, Apr 11 2014, 13:05:18)
> ...
> --> type(list)
> <class 'type'>
> -->
> --> type(list())
> <class 'list'>
> --> type([1, 2, 3])
> <class 'list'>
>
> So the `list` type is 'type', and the type of list instances 
> is 'class list'.

I just saw that while replying to MRAB. 'records' has type list, but
it's only the outer data structure that's a list. Inside, all the
records are namedtuples, and I think that accounts for behaviors that
are unlike a list of lists. (And the reason I was reluctant to accept
that it could be sorted until I tried it for myself.) The method calls I
was able to use were from the namedtuples, not the list of namedtuples.
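
A small sketch of how to check this from the interpreter, assuming a
hypothetical Record type with invented fields. The outer object is a
list, while each element is a Record, which is a kind of tuple and not a
list:

```python
from collections import namedtuple

Record = namedtuple("Record", "Location Price")   # hypothetical fields
records = [Record("Tulsa", 300)]

print(type(records))                    # <class 'list'>
print(type(records[0]).__name__)        # Record
print(isinstance(records[0], tuple))    # True
print(isinstance(records[0], list))     # False
print(Record._fields)                   # ('Location', 'Price')
```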

> Your records variable is an instance of a list filled with 
> instances of a namedtuple, 'Record'.  One cannot sort a 
> namedtuple, but one can sort a list of namedtuples -- which 
> is what you are doing.

Yes, I think we've got that straight now.

> As I said earlier, I admire your persistence -- but take some 
> time and learn the basic vocabulary as that will make it much 
> easier for you to ask questions, and for us to give you 
> meaningful answers.
> 
> --
> ~Ethan~

As I mentioned, I have completed MIT's 2 introductory Python courses
with final grades of 98% and 97%.  What tutorials do you think would
significantly add to that introduction?

It's true that I didn't spend much time in the forums while I was taking
those courses, so this is the first time I've talked with people about
Python this intensively. But I'm a good learner and I'm picking up a lot
of it pretty quickly. People on the list also talk and comprehend
differently than people in the MIT courses did, so I have to become
accustomed to this as well. And the only place to learn that is right
here. 


RE: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Deborah Swanson

MRAB wrote, on January 09, 2017 7:37 PM
> 
> On 2017-01-10 03:02, Deborah Swanson wrote:
> > Erik wrote, on January 09, 2017 5:47 PM
> >> As people keep saying, the object you have called 'records' is a 
> >> *list* of namedtuple objects. It is not a namedtuple.
> >>
> >> IIRC, you create it using a list comprehension which creates the 
> >> records. A list comprehension always creates a list.
> >
> > Well no. The list is created with:
> >
> > records.extend(Record._make(row) for row in rows)
> >
> > I'm new to both namedtuples and list comprehensions, so I'm not
> > exactly sure if this statement is a list comprehension. It looks like
> > it could be.
> >
> This is a list comprehension:
> 
>  [Record._make(row) for row in rows]
> 
> and this is a generator expression:
> 
>  (Record._make(row) for row in rows)
> 
> It needs the outer parentheses.
> 
> The .extend method will accept any iterable, including list 
> comprehensions:
> 
>  records.extend([Record._make(row) for row in rows])
> 
> and generator expressions:
> 
>  records.extend((Record._make(row) for row in rows))
> 
> In the latter case, the generator expression is the only argument of
> the .extend method, and Python lets us drop the pair of parentheses:
> 
>  records.extend(Record._make(row) for row in rows)
> 
> If there were another argument, it would be ambiguous and Python would
> complain.

Appreciate your explanation of why this statement looks like a list
comprehension, but it isn't one.
 
> > In any case I recreated records in IDLE and got
> >
> > >>> type(records)
> > <class 'list'>
> >
> > So it's a class, derived from list? (Not sure what the 'list' means.)

>>> [1,2,3]
[1, 2, 3]
>>> type(_)
<class 'list'>

So it is a list, despite not being made by a list comprehension and
despite its non-listlike behaviors. Guess I've never looked at the type
of a list before, probably because lists are so obvious by looking at
them.

> > 'records' is in fact a class, it has an fget method and data members
> > that I've used. And it behaves like a list sometimes, but many times
> > not.
> >
> Its type is 'list', so it's an instance of a list, i.e. it's a list!

As testified by IDLE above!  ;)  A list of namedtuples may be an
instance of a list, but it doesn't always behave like a list of lists.
For example, if you want to modify an element of a record in records,
you can't just say 

record.Location = 'Tulsa'

like you can say 

record[Location] = 'Tulsa'

with a row that is a list, because each record is very much like a tuple,
and tuples are immutable. You have to use the _replace function, which
returns a new record:

record = record._replace(Location='Tulsa')

This is very unlike a list of lists. Only the outer data structure is a
list, and inside it's all namedtuples.

So it's not a list of lists, it's a list of namedtuples.  But .sort and 
sorted() DTRT, and that's valuable.
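
To make the immutability point concrete, here is a minimal sketch. The
Record fields and the sample values are invented for illustration and are
not taken from the real data in this thread.

```python
from collections import namedtuple

# Hypothetical record type; the real field names are assumptions.
Record = namedtuple("Record", "Description Location Date")

records = [Record("cabin", "Reno", "2017-01-05")]
record = records[0]

# Assigning to a field fails, because a namedtuple is a tuple subclass.
try:
    record.Location = "Tulsa"
    assignment_worked = True
except AttributeError:
    assignment_worked = False

# _replace() returns a *new* record and leaves the original unchanged,
# so the result has to be stored back into the (mutable) outer list.
records[0] = record._replace(Location="Tulsa")
```

The outer list is still an ordinary list, which is why the last line can
assign to records[0] even though the record itself cannot be changed.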

> > The only reason I've hedged away from advice to treat records as a
> > list for sorting until I tried it for myself, was because of an awful
> > lot of strange behavior I've seen, while trying to do the same things
> > with namedtuples as I routinely do with scalars and lists. This is all
> > new, and until now, unexplored territory for me. And I generally avoid
> > saying I'm sure or outright agreeing with something unless I really do
> > know it.
> >
> >> The sorted() function and the list.sort() method can be used to sort
> >> a list containing any objects - it's just a case of telling them how
> >> to obtain the key values to compare (which, in the case of simple
> >> attribute access which the namedtuple objects allow,
> >> "operator.attrgetter()" will do that). This is why sorting the list
> >> works for you.
> >>
> >> You could sort objects of different types - but you might need to
> >> supply a function instead of operator.attrgetter() which looks at
> >> the type of each object and returns something that's obtained
> >> differently for each type (but which the sort function can compare).
> >>
> >> When you say 'Foo = namedtuple("Foo", "spam ham")', you are creating
> >> a "factory" which is able to generate "Foo" objects for you.
> >>
> >> When you say "x = Foo(1, 2)" you are using the factory to create an
> >> object for you which has its "spam" and "ham" attributes set to the
> >> values 1 and 2 respectively.
> >>
> >> When you say "records = [Foo(x, y) for x, y in some_iterable()]", you
> >> are creating a list of such objects. This is the thing you are then
> >> sorting.
> >>
> >>
> >>
> >> Does that make sense?
> >>
> >> Regards, E.
> >
> > Perfect sense. And now that I've confirmed in code that both sorted()
> > and .sort() behave as hoped for with namedtuples, I couldn't be
> > happier. ;)
> >
> > The only thing I don't think you have 100% correct is your assertion
> > that records is a list. And I'm really not sure now that
> > 
> > records.extend(Record._make(row) for row in rows)
> >
> > is a list comprehension.
> >
> > That's the last statement

Re: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Erik

On 10/01/17 03:02, Deborah Swanson wrote:
> Erik wrote, on January 09, 2017 5:47 PM
>> IIRC, you create it using a list comprehension which creates the
>> records. A list comprehension always creates a list.
>
> Well no. The list is created with:
>
> records.extend(Record._make(row) for row in rows)

No, the list is _extended_ by that code. The list is _created_ with a 
line that will say something like "records = []" or "records = list()" 
(or "records = ").

It's nice to see you agree that it's a list though. Oh, hold on ... ;)

> I'm not exactly
> sure if this statement is a list comprehension.

No, it's not. I was remembering an old message where someone suggested 
using the _make() method and that was expressed as a list comprehension.

What you have there is a call to list.extend() passing a _generator_ 
expression as its only argument (which in this case you can consider 
to be equivalent to a list comprehension, as extend() consumes all of 
the generated items immediately). You see that I said "list.extend()". 
That's because 'records' is a list.

> type(records)
> <class 'list'>

Yes, it's an instance of the list class. A list object. A list.

>>> type(list())
<class 'list'>
>>> type([])
<class 'list'>
>>> class foo: pass
...
>>> type(foo())
<class '__main__.foo'>
>>>

... type() will tell you what class your object is an instance of. 
"<class 'list'>" tells you that your object is a list.

> And it behaves like a list sometimes, but many times
> not.

I think that's impossible. I'm 100% sure it's a list. Please give an 
example of when 'records' does not behave like a list.

> The only thing I don't think you have 100% correct is your assertion
> that records is a list.

It's a list.

> But that's just a quibble. The important thing in this context is that
> both .sort() and sorted() treat it like a list and DTRT.

That's because it's a list :)

E.

Re: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Ethan Furman


On 01/09/2017 07:02 PM, Deborah Swanson wrote:
> Erik wrote, on January 09, 2017 5:47 PM
>> As people keep saying, the object you have called 'records'
>> is a *list* of namedtuple objects. It is not a namedtuple.
>>
>> IIRC, you create it using a list comprehension which creates the
>> records. A list comprehension always creates a list.
>
> Well no. The list is created with:
>
> records.extend(Record._make(row) for row in rows)
>
> I'm new to both namedtuples and list comprehensions, so I'm not exactly
> sure if this statement is a list comprehension. It looks like it could
> be. In any case I recreated records in IDLE and got
>
> --> type(records)
> <class 'list'>
>
> So it's a class, derived from list? (Not sure what the 'list' means.)


On the one hand, Deborah, I applaud your perseverance.  On the other, it seems 
as if you are trying to run before you can walk.  I know tutorials can be boring, 
but you really should go through one so you have a basic understanding of the 
fundamentals.

Working in the REPL (the python console), we can see:

Python 3.4.0 (default, Apr 11 2014, 13:05:18)
...
--> type(list)
<class 'type'>
-->
--> type(list())
<class 'list'>
--> type([1, 2, 3])
<class 'list'>

So the `list` type is 'type', and the type of list instances is 'class list'.

Your records variable is an instance of a list filled with instances of a 
namedtuple, 'Record'.  One cannot sort a namedtuple, but one can sort a list of 
namedtuples -- which is what you are doing.

As I said earlier, I admire your persistence -- but take some time and learn 
the basic vocabulary as that will make it much easier for you to ask questions, 
and for us to give you meaningful answers.

--
~Ethan~

Re: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread MRAB


On 2017-01-10 03:02, Deborah Swanson wrote:
> Erik wrote, on January 09, 2017 5:47 PM
>> As people keep saying, the object you have called 'records'
>> is a *list* of namedtuple objects. It is not a namedtuple.
>>
>> IIRC, you create it using a list comprehension which creates the
>> records. A list comprehension always creates a list.
>
> Well no. The list is created with:
>
> records.extend(Record._make(row) for row in rows)
>
> I'm new to both namedtuples and list comprehensions, so I'm not exactly
> sure if this statement is a list comprehension. It looks like it could
> be.

This is a list comprehension:

[Record._make(row) for row in rows]

and this is a generator expression:

(Record._make(row) for row in rows)

It needs the outer parentheses.

The .extend method will accept any iterable, including list comprehensions:

records.extend([Record._make(row) for row in rows])

and generator expressions:

records.extend((Record._make(row) for row in rows))

In the latter case, the generator expression is the only argument of the 
.extend method, and Python lets us drop the pair of parentheses:


records.extend(Record._make(row) for row in rows)

If there were another argument, it would be ambiguous and Python would 
complain.
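
The forms above can be tried side by side. This is only a small sketch:
the Record fields and the rows are made-up sample data, not the data
discussed in the thread.

```python
from collections import namedtuple

Record = namedtuple("Record", "Description Date")   # hypothetical fields
rows = [("cabin", "2017-01-05"), ("condo", "2017-01-03")]

from_list_comp = [Record._make(row) for row in rows]   # a list, built at once
gen_expr = (Record._make(row) for row in rows)         # a lazy generator object

records = []                                          # this line creates the list
records.extend(from_list_comp)                        # extend it with a list
records.extend(Record._make(row) for row in rows)     # extend it with a generator
```

Whichever kind of iterable is passed to .extend, records itself stays a
plain list; only its contents grow.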



> In any case I recreated records in IDLE and got
>
> >>> type(records)
> <class 'list'>
>
> So it's a class, derived from list? (Not sure what the 'list' means.)
> 'records' is in fact a class, it has an fget method and data members
> that I've used. And it behaves like a list sometimes, but many times
> not.


Its type is 'list', so it's an instance of a list, i.e. it's a list!


> The only reason I've hedged away from advice to treat records as a list
> for sorting until I tried it for myself, was because of an awful lot of
> strange behavior I've seen, while trying to do the same things with
> namedtuples as I routinely do with scalars and lists. This is all new,
> and until now, unexplored territory for me. And I generally avoid saying
> I'm sure or outright agreeing with something unless I really do know it.
>
>> The sorted() function and the list.sort() method can be used to sort a
>> list containing any objects - it's just a case of telling them how to
>> obtain the key values to compare (which, in the case of simple
>> attribute access which the namedtuple objects allow,
>> "operator.attrgetter()" will do that). This is why sorting the list
>> works for you.
>>
>> You could sort objects of different types - but you might need to
>> supply a function instead of operator.attrgetter() which looks at the
>> type of each object and returns something that's obtained differently
>> for each type (but which the sort function can compare).
>>
>> When you say 'Foo = namedtuple("Foo", "spam ham")', you are creating a
>> "factory" which is able to generate "Foo" objects for you.
>>
>> When you say "x = Foo(1, 2)" you are using the factory to create an
>> object for you which has its "spam" and "ham" attributes set to the
>> values 1 and 2 respectively.
>>
>> When you say "records = [Foo(x, y) for x, y in some_iterable()]", you
>> are creating a list of such objects. This is the thing you are then
>> sorting.
>>
>> Does that make sense?
>>
>> Regards, E.
>
> Perfect sense. And now that I've confirmed in code that both sorted()
> and .sort() behave as hoped for with namedtuples, I couldn't be
> happier.  ;)
>
> The only thing I don't think you have 100% correct is your assertion
> that records is a list. And I'm really not sure now that
>
> records.extend(Record._make(row) for row in rows)
>
> is a list comprehension.
>
> That's the last statement in the creation of 'records', and immediately
> after that statement executes, the type function says the resulting
> 'records' is a class, probably derived from list, but it's not a
> straight up list.
>
> 'records' is enough different that you can't assume across the board
> that namedtuples created this way are equivalent to a list. You do run
> into problems if you assume it behaves like a list, or even like
> standard tuples, because it doesn't always. Believe me, when I first
> started working with namedtuples, I got plenty snarled up debugging code
> that was written assuming list behavior to know that a namedtuple of
> namedtuples is not exactly a list. Or even exactly like a list.
>
> But that's just a quibble. The important thing in this context is that
> both .sort() and sorted() treat it like a list and DTRT.  And that's
> very nice. ;)

The list class has the .sort method, which sorts in-place. The 'sorted' 
function is a simple function that takes an iterable, iterates over it 
to build a list, sorts that list in-place, and then returns the list.


The oft-stated rule is that not every 2- or 3-line function needs to be 
a built-in, but 'sorted' is one of those cases where it's just nice to 
have it, a case of "practicality beats purity".
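
As a rough illustration of that description, sorted() behaves much like
the few lines below. The name my_sorted is made up for this sketch; the
real built-in is implemented in C and this is not its source.

```python
def my_sorted(iterable, key=None, reverse=False):
    """Approximate pure-Python equivalent of the built-in sorted()."""
    result = list(iterable)                 # iterate once to build a new list
    result.sort(key=key, reverse=reverse)   # sort that new list in place
    return result                           # and return it
```

For example, my_sorted((3, 1, 2)) gives [1, 2, 3] and leaves the tuple it
was given untouched.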



RE: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Deborah Swanson

Erik wrote, on January 09, 2017 5:47 PM
> As people keep saying, the object you have called 'records' is a *list*
> of namedtuple objects. It is not a namedtuple.
> 
> IIRC, you create it using a list comprehension which creates the 
> records. A list comprehension always creates a list.

Well no. The list is created with:

records.extend(Record._make(row) for row in rows)

I'm new to both namedtuples and list comprehensions, so I'm not exactly
sure if this statement is a list comprehension. It looks like it could
be. In any case I recreated records in IDLE and got

>>> type(records)
<class 'list'>

So it's a class, derived from list? (Not sure what the 'list' means.)
'records' is in fact a class, it has an fget method and data members
that I've used. And it behaves like a list sometimes, but many times
not.

The only reason I've hedged away from advice to treat records as a list
for sorting until I tried it for myself, was because of an awful lot of
strange behavior I've seen, while trying to do the same things with
namedtuples as I routinely do with scalars and lists. This is all new,
and until now, unexplored territory for me. And I generally avoid saying
I'm sure or outright agreeing with something unless I really do know it.
 
> The sorted() function and the list.sort() method can be used to sort a
> list containing any objects - it's just a case of telling them how to
> obtain the key values to compare (which, in the case of simple
> attribute access which the namedtuple objects allow,
> "operator.attrgetter()" will do that). This is why sorting the list
> works for you.
> 
> You could sort objects of different types - but you might need to
> supply a function instead of operator.attrgetter() which looks at the
> type of each object and returns something that's obtained differently
> for each type (but which the sort function can compare).
> 
> When you say 'Foo = namedtuple("Foo", "spam ham")', you are creating a
> "factory" which is able to generate "Foo" objects for you.
> 
> When you say "x = Foo(1, 2)" you are using the factory to create an
> object for you which has its "spam" and "ham" attributes set to the
> values 1 and 2 respectively.
> 
> When you say "records = [Foo(x, y) for x, y in some_iterable()]", you
> are creating a list of such objects. This is the thing you are then
> sorting.
> 
> 
> 
> Does that make sense?
> 
> Regards, E.

Perfect sense. And now that I've confirmed in code that both sorted()
and .sort() behave as hoped for with namedtuples, I couldn't be happier.  ;)

The only thing I don't think you have 100% correct is your assertion
that records is a list. And I'm really not sure now that

records.extend(Record._make(row) for row in rows) 

is a list comprehension. 

That's the last statement in the creation of 'records', and immediately
after that statement executes, the type function says the resulting
'records' is a class, probably derived from list, but it's not a
straight up list.

'records' is enough different that you can't assume across the board
that namedtuples created this way are equivalent to a list. You do run
into problems if you assume it behaves like a list, or even like
standard tuples, because it doesn't always. Believe me, when I first
started working with namedtuples, I got plenty snarled up debugging code
that was written assuming list behavior to know that a namedtuple of
namedtuples is not exactly a list. Or even exactly like a list.

But that's just a quibble. The important thing in this context is that
both .sort() and sorted() treat it like a list and DTRT.  And that's
very nice. ;)

Deborah


Re: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Erik


On 10/01/17 00:54, Deborah Swanson wrote:
> Since I won't change the order of the records again after the sort, I'm
> using
>
> records.sort(key=operator.attrgetter("Description", "Date"))
>
> once, which also works perfectly.
>
> So both sorted() and sort() can be used to sort namedtuples.  Good to
> know!


As people keep saying, the object you have called 'records' is a *list* 
of namedtuple objects. It is not a namedtuple.


IIRC, you create it using a list comprehension which creates the 
records. A list comprehension always creates a list.


The sorted() function and the list.sort() method can be used to sort a 
list containing any objects - it's just a case of telling them how to 
obtain the key values to compare (which, in the case of simple attribute 
access which the namedtuple objects allow, "operator.attrgetter()" will 
do that). This is why sorting the list works for you.


You could sort objects of different types - but you might need to supply 
a function instead of operator.attrgetter() which looks at the type of 
each object and returns something that's obtained differently for each 
type (but which the sort function can compare).
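
A small sketch of such a key function follows. The mix of object types
and the name sort_key are invented here purely to illustrate the idea.

```python
from collections import namedtuple

Foo = namedtuple("Foo", "spam ham")

def sort_key(obj):
    # Hypothetical key function: get a comparable number out of each
    # kind of object, in a different way for each type.
    if isinstance(obj, Foo):
        return obj.spam
    if isinstance(obj, dict):
        return obj["spam"]
    return obj  # assume plain numbers compare as themselves

mixed = [Foo(3, "a"), {"spam": 1}, 2]
ordered = sorted(mixed, key=sort_key)
```

The sort never compares the objects themselves, only the values that the
key function returns, so all that matters is that those values can be
compared with each other.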





When you say 'Foo = namedtuple("Foo", "spam ham")', you are creating a 
"factory" which is able to generate "Foo" objects for you.


When you say "x = Foo(1, 2)" you are using the factory to create an 
object for you which has its "spam" and "ham" attributes set to the 
values 1 and 2 respectively.


When you say "records = [Foo(x, y) for x, y in some_iterable()]", you 
are creating a list of such objects. This is the thing you are then sorting.
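
Put together as runnable code, the three steps look roughly like this.
some_iterable() is a stand-in with invented data, since the real source
of the rows is not shown here.

```python
from collections import namedtuple
from operator import attrgetter

Foo = namedtuple("Foo", "spam ham")     # Foo is a new class (a tuple subclass)
x = Foo(1, 2)                           # x is one instance of that class

def some_iterable():
    # Placeholder data source, made up for this example.
    return [(3, "c"), (1, "a"), (2, "b")]

records = [Foo(a, b) for a, b in some_iterable()]   # a plain list of Foo objects
records.sort(key=attrgetter("spam"))                 # works because records is a list
```

Here type(x) is Foo, while type(records) is list.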




Does that make sense?

Regards, E.

RE: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Deborah Swanson

Peter Otten wrote, on January 09, 2017 3:27 PM
> 
> While stable sort is nice in this case you can just say
> 
> key=operator.attrgetter("Description", "Date")
> 
> Personally I'd only use sorted() once and then switch to the 
> sort() method.

This works perfectly, thank you.

As I read the docs, the main (only?) difference between sorted() and
.sort() is that .sort() sorts the list in place.
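
A quick sketch of the differences, using throwaway numbers rather than
the records from this thread:

```python
data = [3, 1, 2]

new_list = sorted(data)      # returns a new sorted list
unchanged = list(data)       # copy taken now: data is still in its old order

result = data.sort()         # sorts data in place and returns None

from_tuple = sorted((3, 1, 2))   # sorted() takes any iterable, returns a list
```

Besides sorting in place, .sort() exists only on lists, whereas sorted()
accepts any iterable and always hands back a new list.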

Since I won't change the order of the records again after the sort, I'm
using 

records.sort(key=operator.attrgetter("Description", "Date"))

once, which also works perfectly.

So both sorted() and sort() can be used to sort namedtuples.  Good to
know!


Re: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Peter Otten

breamore...@gmail.com wrote:

> On Monday, January 9, 2017 at 5:34:12 PM UTC, Tim Chase wrote:
>> On 2017-01-09 08:31, breamoreboy wrote:
>> > On Monday, January 9, 2017 at 2:22:19 PM UTC, Tim Chase wrote:
>> > > I usually wrap the iterable in something like
>> > > 
>> > >   def pairwise(it):
>> > >     it = iter(it)
>> > >     prev = next(it)
>> > >     for thing in it:
>> > >       yield prev, thing
>> > >       prev = thing
>> > 
>> > Or from
>> > https://docs.python.org/3/library/itertools.html#itertools-recipes:-
>> > 
>> > def pairwise(iterable):
>> >     "s -> (s0,s1), (s1,s2), (s2, s3), ..."
>> >     a, b = tee(iterable)
>> >     next(b, None)
>> >     return zip(a, b)
>> > 
>> > This and many other recipes are available in the more-itertools
>> > module which is on pypi.
>> 
>> Ah, helpful to not have to do it from scratch each time.  Also, I see
>> several others that I've coded up from scratch (particularly the
>> partition() and first_true() functions).
>> 
>> I usually want to make sure it's tailored for my use cases. The above
>> pairwise() is my most common use case, but I occasionally want N-wise
>> pairing
> 
> def ntuplewise(iterable, n=2):
>     args = tee(iterable, n)
>     loops = n - 1
>     while loops:
>         for _ in range(loops):
>             next(args[loops], None)
>         loops -= 1
>     return zip(*args)
> 
>> 
>>   s -> (s0,s1,…sN), (s1,s2,…S{N+1}), (s2,s3,…s{N+2}), …
>> 
>> or to pad them out so either the leader/follower gets *all* of the
>> values, with subsequent values being a padding value:
>> 
>>   # lst = [s0, s1, s2]
>>   (s0,s1), (s1, s2), (s2, PADDING)
> 
> Use zip_longest instead of zip in the example code above.
> 
>>   # or
>>   (PADDING, s0), (s0, s1), (s1, s2)
> 
> Haven't a clue off of the top of my head and I'm too darn tired to think
> about it :)

In both cases modify the iterable before feeding it to ntuplewise():

>>> from itertools import chain, repeat
>>> PADDING = None
>>> N = 3
>>> items = range(5)
>>> list(ntuplewise(chain(repeat(PADDING, N-1), items), N))
[(None, None, 0), (None, 0, 1), (0, 1, 2), (1, 2, 3), (2, 3, 4)]
>>> list(ntuplewise(chain(items, repeat(PADDING, N-1)), N))
[(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, None), (4, None, None)]



Re: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread breamoreboy

On Monday, January 9, 2017 at 5:34:12 PM UTC, Tim Chase wrote:
> On 2017-01-09 08:31, breamoreboy wrote:
> > On Monday, January 9, 2017 at 2:22:19 PM UTC, Tim Chase wrote:
> > > I usually wrap the iterable in something like
> > > 
> > >   def pairwise(it):
> > >     it = iter(it)
> > >     prev = next(it)
> > >     for thing in it:
> > >       yield prev, thing
> > >       prev = thing
> > 
> > Or from
> > https://docs.python.org/3/library/itertools.html#itertools-recipes:-
> > 
> > def pairwise(iterable):
> >     "s -> (s0,s1), (s1,s2), (s2, s3), ..."
> >     a, b = tee(iterable)
> >     next(b, None)
> >     return zip(a, b)
> > 
> > This and many other recipes are available in the more-itertools
> > module which is on pypi. 
> 
> Ah, helpful to not have to do it from scratch each time.  Also, I see
> several others that I've coded up from scratch (particularly the
> partition() and first_true() functions).
> 
> I usually want to make sure it's tailored for my use cases. The above
> pairwise() is my most common use case, but I occasionally want N-wise
> pairing

from itertools import tee

def ntuplewise(iterable, n=2):
    args = tee(iterable, n)
    loops = n - 1
    while loops:
        # advance the iterator at index `loops` by `loops` items
        for _ in range(loops):
            next(args[loops], None)
        loops -= 1
    return zip(*args)

> 
>   s -> (s0,s1,…sN), (s1,s2,…S{N+1}), (s2,s3,…s{N+2}), …
> 
> or to pad them out so either the leader/follower gets *all* of the
> values, with subsequent values being a padding value:
> 
>   # lst = [s0, s1, s2]
>   (s0,s1), (s1, s2), (s2, PADDING)

Use zip_longest instead of zip in the example code above.
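
For the pairwise case that suggestion might look like the sketch below.
The name pairwise_padded and its padding parameter are invented for this
example and are not part of itertools or more-itertools.

```python
from itertools import tee, zip_longest

def pairwise_padded(iterable, padding=None):
    "s -> (s0,s1), (s1,s2), (s2, padding)"
    a, b = tee(iterable)
    next(b, None)                      # b now runs one item ahead of a
    # zip_longest keeps going until a is exhausted, filling in for b
    return zip_longest(a, b, fillvalue=padding)
```

With this, list(pairwise_padded("abc", "-")) gives
[('a', 'b'), ('b', 'c'), ('c', '-')].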

>   # or
>   (PADDING, s0), (s0, s1), (s1, s2)

Haven't a clue off of the top of my head and I'm too darn tired to think about 
it :)

> 
> but it's good to have my common cases already coded & tested.
> 
> -tkc

Kindest regards.

Mark Lawrence.

Re: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Peter Otten

Rhodri James wrote:

> On 09/01/17 21:40, Deborah Swanson wrote:
>> Peter Otten wrote, on January 09, 2017 6:51 AM
>>>
>>> records = sorted(
>>>     set(records),
>>>     key=operator.attrgetter("Description")
>>> )
>>
>> Good, this is confirmation that 'sorted()' is the way to go. I want a 2
>> key sort, Description and Date, but I think I can figure out how to do
>> that.
> 
> There's a handy trick that you can use because the sorting algorithm is
> stable.  First, sort on your secondary key.  This will leave the data in
> the wrong order, but objects with the same primary key will be in the
> right order by secondary key relative to each other.
> 
> Then sort on your primary key.  Because the sorting algorithm is stable,
> it won't disturb the relative order of objects with the same primary
> key, giving you the sort that you want!
> 
> So assuming you want your data sorted by date, and then by description
> within the same date, it's just:
> 
> records = sorted(
>     sorted(
>         set(records),
>         key=operator.attrgetter("Description")
>     ),
>     key=operator.attrgetter("Date")
> )

While stable sort is nice in this case you can just say

key=operator.attrgetter("Description", "Date")

Personally I'd only use sorted() once and then switch to the sort() method.
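
The two approaches can be compared on a few made-up records. The field
names match the thread, but the values are invented, and the dates are
assumed to be ISO-style strings so that they sort chronologically. Note
that this sketch uses Description as the primary key, as in the
attrgetter call above.

```python
from collections import namedtuple
from operator import attrgetter

Record = namedtuple("Record", "Description Date")
records = [
    Record("condo", "2017-01-05"),
    Record("cabin", "2017-01-07"),
    Record("condo", "2017-01-03"),
]

# One pass: the key is a (Description, Date) tuple.
one_pass = sorted(records, key=attrgetter("Description", "Date"))

# Two passes relying on stability: secondary key first, primary key last.
two_pass = sorted(sorted(records, key=attrgetter("Date")),
                  key=attrgetter("Description"))
```

Both give the same order; the two-pass form is mainly useful when the
keys need different directions, for example one ascending and one
descending.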



Re: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Rhodri James


On 09/01/17 21:40, Deborah Swanson wrote:
> Peter Otten wrote, on January 09, 2017 6:51 AM
>>
>> records = sorted(
>>     set(records),
>>     key=operator.attrgetter("Description")
>> )
>
> Good, this is confirmation that 'sorted()' is the way to go. I want a 2
> key sort, Description and Date, but I think I can figure out how to do
> that.


There's a handy trick that you can use because the sorting algorithm is 
stable.  First, sort on your secondary key.  This will leave the data in 
the wrong order, but objects with the same primary key will be in the 
right order by secondary key relative to each other.


Then sort on your primary key.  Because the sorting algorithm is stable, 
it won't disturb the relative order of objects with the same primary 
key, giving you the sort that you want!


So assuming you want your data sorted by date, and then by description 
within the same date, it's just:


records = sorted(
    sorted(
        set(records),
        key=operator.attrgetter("Description")
    ),
    key=operator.attrgetter("Date")
)

--
Rhodri James *-* Kynesim Ltd

RE: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Deborah Swanson

breamore...@gmail.com wrote, on January 09, 2017 8:32 AM
> 
> On Monday, January 9, 2017 at 2:22:19 PM UTC, Tim Chase wrote:
> > On 2017-01-08 22:58, Deborah Swanson wrote:
> > > 1) I have a section that loops through the sorted data, compares two
> > > adjacent rows at a time, and marks one of them for deletion if the
> > > rows are identical.
> > > and my question is whether there's a way to work with two adjacent
> > > rows without using subscripts?
> > 
> > I usually wrap the iterable in something like
> > 
> >   def pairwise(it):
> >     it = iter(it)
> >     prev = next(it)
> >     for thing in it:
> >       yield prev, thing
> >       prev = thing
> > 
> >   for prev, cur in pairwise(records):
> >     compare(prev, cur)
> > 
> > which I find makes it more readable.
> > 
> > -tkc
> 
> Or from 
> https://docs.python.org/3/library/itertools.html#itertools-recipes:-
> 
> def pairwise(iterable):
>     "s -> (s0,s1), (s1,s2), (s2, s3), ..."
>     a, b = tee(iterable)
>     next(b, None)
>     return zip(a, b)
> 
> This and many other recipes are available in the 
> more-itertools module which is on pypi.

Thanks, I'll keep this since I seem to do pairwise comparisons a lot.
I'm going to try using set or OrderedDict for the current problem, per
Peter's suggestion, but if I can't make that work, this surely will.


RE: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Deborah Swanson

Tim Chase wrote, on January 09, 2017 6:22 AM
> 
> On 2017-01-08 22:58, Deborah Swanson wrote:
> > 1) I have a section that loops through the sorted data, compares two
> > adjacent rows at a time, and marks one of them for deletion if the
> > rows are identical.
> > and my question is whether there's a way to work with two adjacent
> > rows without using subscripts?
> 
> I usually wrap the iterable in something like
> 
>   def pairwise(it):
>     it = iter(it)
>     prev = next(it)
>     for thing in it:
>       yield prev, thing
>       prev = thing
> 
>   for prev, cur in pairwise(records):
>     compare(prev, cur)
> 
> which I find makes it more readable.
> 
> -tkc

This looks very useful, and comparing two adjacent rows is something I
do often. Thanks Tim!

Deborah


RE: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Deborah Swanson

Peter Otten wrote, on January 09, 2017 6:51 AM
> 
> Deborah Swanson wrote:
> 
> > Even better, to get hold of all the records with the same Description
> > as the current row, compare them all, mark all but the different ones
> > for deletion, and then resume processing the records after the last
> > one?
> 
> When you look at all fields for deduplication anyway there's no need to
> treat one field (Description) specially. Just
> 
> records = set(records)

I haven't worked with sets before, so this would be a good time to
start.

> should be fine. As the initial order is lost* you probably want to sort
> afterwards. The code then becomes
> 
> records = sorted(
>     set(records), 
>     key=operator.attrgetter("Description")
> )

Good, this is confirmation that 'sorted()' is the way to go. I want a 2
key sort, Description and Date, but I think I can figure out how to do
that.

> Now if you want to fill in missing values, you should probably do this
> before deduplication 

That's how my original code was written, to fill in missing values as
the very last thing before saving to csv.

> -- and the complete() function introduced in
> https://mail.python.org/pipermail/python-list/2016-December/717847.html
> can be adapted to work with namedtuples instead of dicts.

Ah, your defaultdict suggestion. Since my original comprows() function
to fill in missing values is now broken after the rest of the code was
rewritten for namedtuples (I just commented it out to test the
namedtuples version), this would be a good time to look at defaultdict.

> (*) If you want to preserve the initial order you can use a 
> collections.OrderedDict instead of the set.

OrderedDict is another thing I haven't used, but would love to, so I
think I'll try both the set and the OrderedDict, and see which one is
best here.

Thanks again Peter, all your help is very much appreciated.

Deborah


Re: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread breamoreboy

On Monday, January 9, 2017 at 2:22:19 PM UTC, Tim Chase wrote:
> On 2017-01-08 22:58, Deborah Swanson wrote:
> > 1) I have a section that loops through the sorted data, compares two
> > adjacent rows at a time, and marks one of them for deletion if the
> > rows are identical.
> > 
> > I'm using 
> > 
> > for i in range(len(records)-1):
> >     r1 = records[i]
> >     r2 = records[i+1]
> >     if r1.xx == r2.xx:
> >         .
> >         .
> > and my question is whether there's a way to work with two adjacent
> > rows without using subscripts?  
> 
> I usually wrap the iterable in something like
> 
>   def pairwise(it):
>     it = iter(it)
>     prev = next(it)
>     for thing in it:
>       yield prev, thing
>       prev = thing
> 
>   for prev, cur in pairwise(records):
>     compare(prev, cur)
> 
> which I find makes it more readable.
> 
> -tkc

Or from https://docs.python.org/3/library/itertools.html#itertools-recipes:-

from itertools import tee

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

This and many other recipes are available in the more-itertools module which is 
on pypi.

RE: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Peter Otten

Deborah Swanson wrote:

> Even better, to get hold of all the records with the same Description as
> the current row, compare them all, mark all but the different ones for
> deletion, and then resume processing the records after the last one?

When you look at all fields for deduplication anyway there's no need to 
treat one field (Description) specially. Just

records = set(records)

should be fine. As the initial order is lost* you probably want to sort 
afterwards. The code then becomes

records = sorted(
    set(records),
    key=operator.attrgetter("Description")
)

Now if you want to fill in missing values, you should probably do this 
before deduplication -- and the complete() function introduced in

https://mail.python.org/pipermail/python-list/2016-December/717847.html

can be adapted to work with namedtuples instead of dicts.

(*) If you want to preserve the initial order you can use a 
collections.OrderedDict instead of the set.
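
A short sketch of the OrderedDict variant, assuming every field value is hashable (true for strings read from a CSV). The Record fields and rows are hypothetical.

```python
from collections import OrderedDict, namedtuple

Record = namedtuple("Record", "Description Date")  # hypothetical fields
records = [
    Record("flat B", "1/2/17"),
    Record("flat A", "1/1/17"),
    Record("flat B", "1/2/17"),
]

# Dictionary keys are unique, and an OrderedDict keeps first-seen order.
deduped = list(OrderedDict.fromkeys(records))
```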


Re: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Tim Chase

On 2017-01-08 22:58, Deborah Swanson wrote:
> 1) I have a section that loops through the sorted data, compares two
> adjacent rows at a time, and marks one of them for deletion if the
> rows are identical.
> 
> I'm using 
> 
>   for i in range(len(records)-1):
>       r1 = records[i]
>       r2 = records[i+1]
>       if r1.xx == r2.xx:
>           .
>           .
> and my question is whether there's a way to work with two adjacent
> rows without using subscripts?  

I usually wrap the iterable in something like

  def pairwise(it):
      it = iter(it)  # so that lists work too, not only iterators
      prev = next(it)
      for thing in it:
          yield prev, thing
          prev = thing

  for prev, cur in pairwise(records):
      compare(prev, cur)

which I find makes it more readable.

-tkc



RE: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Deborah Swanson

Steve D'Aprano wrote, on January 09, 2017 3:40 AM
> 
> On Mon, 9 Jan 2017 09:57 pm, Deborah Swanson wrote:
> 
> [...]
> > I think you are replying to my question about sorting a namedtuple, in
> > this case it's called 'records'.
> > 
> > I think your suggestion works for lists and tuples, and probably 
> > dictionaries. But namedtuples doesn't have a sort function.
> 
> Tuples in general (whether named or not) represent structs or 
> records, where the position of the item is significant. It 
> doesn't usually make sense to sort individual elements of a 
> record or tuple:
> 
> Before sorting: Record(name='George', spouse='', position='Accountant')
> After sorting:  Record(name='', spouse='Accountant', position='George')
> 
> 
> I think what you want to do is sort a list of records, not 
> each record itself. Or possibly you want to reorder the 
> columns, in which case the easiest way to do that is by 
> editing the CSV file in LibreOffice or Excel or another 
> spreadsheet application.
> If you have a list of records, call .sort() on the list, not 
> the individual records. 
> 

I want to sort a namedtuple of records.

I could convert it to a list easy enough, with:

recs = list(records)

and then use the column copying and deleting method I described in my
previous post and use mergesort. This would give me exactly what I had
in my original code, and it would be ok. There's only one step after the
mergesort, and I could do it without the namedtuple, although I'd have
to count column indices for that section, and rewrite them whenever the
columns changed, which was what my original 2-letter codes and the
conversion to namedtuples were both meant to avoid.

So all in all, the best thing would be if there's a way to sort records
as a namedtuple. 

> But if I am wrong, and you absolutely must sort the fields of 
> each record, call the sorted() function, which will copy the 
> fields into a list and sort the list. That is:
> 
> alist = sorted(record)
> 
> -- 
> Steve
> "Cheer up," they said, "things could be worse." So I cheered 
> up, and sure enough, things got worse.

I'm not sure what you mean by sorting the fields of each record. I want
all the rows in records sorted by the Description and Date in each
record.

alist = sorted(record)

looks like it sorts one record, by what? An alphanumeric ordering of all
the fields in record?  That would be beyond useless for my purposes.

It's possible sorted() will work on namedtuples. Stackoverflow has an
example:

from operator import attrgetter
from collections import namedtuple

Person = namedtuple('Person', 'name age score')
seq = [Person(name='nick', age=23, score=100),
   Person(name='bob', age=25, score=200)]
# Sort list by name
sorted(seq, key=attrgetter('name'))
# Sort list by age
sorted(seq, key=attrgetter('age'))

So apparently it's done, and it's a keyed sort too. Although what
they're sorting is a list of namedtuples, which may or may not work on a
single namedtuple made of row (record) namedtuples. I'll definitely try
it tomorrow.

Know any way to convert a list back into a namedtuple? I suppose I could
go through all the steps used at the beginning to make records, but that
seems a waste if there's any way at all to sort the namedtuple without
converting it into a list.
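
A hedged sketch of the keyed sort, assuming records is a list whose elements are namedtuples, as Record._make produces. In that case there is nothing to convert back: sorted() returns a new list holding the same namedtuple objects. The field names and the ISO-style dates are illustrative only.

```python
from collections import namedtuple
from operator import attrgetter

Record = namedtuple("Record", "Description Date")  # hypothetical fields
records = [
    Record("flat B", "2017-01-02"),
    Record("flat A", "2017-01-05"),
    Record("flat A", "2017-01-01"),
]

# attrgetter with two names builds a (Description, Date) sort key.
by_desc_date = sorted(records, key=attrgetter("Description", "Date"))

# The result is a new list; its elements are still Record namedtuples.
first = by_desc_date[0]
```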

Thanks Steven! Maybe sorted() is my friend here.  (haha, and maybe not.)

Deborah


RE: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Steve D'Aprano

On Mon, 9 Jan 2017 09:57 pm, Deborah Swanson wrote:

[...]
> I think you are replying to my question about sorting a namedtuple, in
> this case it's called 'records'.
> 
> I think your suggestion works for lists and tuples, and probably
> dictionaries. But namedtuples doesn't have a sort function.

Tuples in general (whether named or not) represent structs or records, where
the position of the item is significant. It doesn't usually make sense to
sort individual elements of a record or tuple:

Before sorting: Record(name='George', spouse='', position='Accountant')
After sorting:  Record(name='', spouse='Accountant', position='George')

I think what you want to do is sort a list of records, not each record
itself. Or possibly you want to reorder the columns, in which case the
easiest way to do that is by editing the CSV file in LibreOffice or Excel
or another spreadsheet application.

If you have a list of records, call .sort() on the list, not the individual
records. 

But if I am wrong, and you absolutely must sort the fields of each record,
call the sorted() function, which will copy the fields into a list and sort
the list. That is:

alist = sorted(record)
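
To make the distinction concrete, here is a small sketch using the Record example from above. It sorts the fields of one record, which is rarely what is wanted.

```python
from collections import namedtuple

Record = namedtuple("Record", "name spouse position")
record = Record(name="George", spouse="", position="Accountant")

# sorted() iterates over the field values and returns a plain list.
alist = sorted(record)

# Rebuilding a Record from that list puts the values under the wrong names.
scrambled = Record._make(alist)
```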

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.


RE: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Deborah Swanson

Antoon Pardon wrote, on January 09, 2017 2:35 AM

> If I understand correctly you want something like:
> 
> records.sort(key = lambda rec: rec.xx)
> 
> AKA 
> 
> from operator import attrgetter
> records.sort(key = attrgetter('xx'))
> 
> or maybe:
> 
> records.sort(key = lambda rec: (rec.xx,) + tuple(rec))
> 
> -- 
> Antoon Pardon

I think you are replying to my question about sorting a namedtuple, in
this case it's called 'records'.

I think your suggestion works for lists and tuples, and probably
dictionaries. But namedtuples doesn't have a sort function.

>>> from collections import namedtuple
>>> dir(namedtuple)
['__annotations__', '__call__', '__class__', '__closure__', '__code__',
'__defaults__', '__delattr__', '__dict__', '__dir__', '__doc__',
'__eq__', '__format__', '__ge__', '__get__', '__getattribute__',
'__globals__', '__gt__', '__hash__', '__init__', '__kwdefaults__',
'__le__', '__lt__', '__module__', '__name__', '__ne__', '__new__',
'__qualname__', '__reduce__', '__reduce_ex__', '__repr__',
'__setattr__', '__sizeof__', '__str__', '__subclasshook__']

so nothing with records.sort will work.  :(
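
For comparison, a small sketch of what the two objects are, assuming records was built as a list of Record instances with Record._make as suggested earlier in the thread. dir(namedtuple) inspects the factory function, not the records container. The field names are hypothetical.

```python
from collections import namedtuple

Record = namedtuple("Record", "Description Date")  # hypothetical fields
records = [Record("flat B", "1/2/17"), Record("flat A", "1/1/17")]

factory_is_class = isinstance(namedtuple, type)  # namedtuple is a function
container_type = type(records)                   # the container is a list

# A list has .sort(), so the keyed sort applies to it directly.
records.sort(key=lambda rec: rec.Description)
```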

Deborah


RE: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Deborah Swanson

Antoon Pardon wrote, on January 09, 2017 2:14 AM

> > 1) I have a section that loops through the sorted data, compares two
> > adjacent rows at a time, and marks one of them for deletion if the
> > rows are identical.
> >
> > I'm using
> >
> >   for i in range(len(records)-1):
> >       r1 = records[i]
> >       r2 = records[i+1]
> >       if r1.xx == r2.xx:
> >           .
> >           .
> > and my question is whether there's a way to work with two adjacent 
> > rows without using subscripts?
> 
> for r1, r2 in zip(records[i], records[i+1]):
>     if r1.xx == r2.xx:
>         .
>         .

Ok, I've seen the zip function before and it might do the job, but here
I think you're suggesting:

for i in range(len(records)-1):
    for r1, r2 in zip(records[i], records[i+1]):
        if r1.xx == r2.xx:
            .
            .

The hope was to do the loop without subscripts, and this may or may not
have gotchas because records is a namedtuple:

for r in records[1:]:
    .
    .

Deborah


Re: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Antoon Pardon

Op 09-01-17 om 07:58 schreef Deborah Swanson:
> Peter Otten wrote, on January 08, 2017 5:21 AM
>> Deborah Swanson wrote:
>>
>>> Peter Otten wrote, on January 08, 2017 3:01 AM
>>  
>> Personally I would recommend against mixing data (an actual location) and
>> metadata (the column name, "Location"), but if you wish my code can be
>> adapted as follows:
>>
>> infile = open("dictreader_demo.csv")
>> rows = csv.reader(infile)
>> fieldnames = next(rows)
>> Record = namedtuple("Record", fieldnames)
>> records = [Record._make(fieldnames)]
>> records.extend(Record._make(row) for row in rows)
> Works like a charm. I stumbled a bit changing all my subscripted
> variables to namedtuples and rewriting the inevitable places my code
> that didn't work the same. But actually it was fun, especially deleting
> all the sections and variables I no longer needed. And it executes
> correctly now too - with recognizable fieldnames instead of my quirky
> 2-letter code subscripts.  All in all a huge win!
>
> I do have two more questions.
>
> 1) I have a section that loops through the sorted data, compares two
> adjacent rows at a time, and marks one of them for deletion if the rows
> are identical.
>
> I'm using 
>
>   for i in range(len(records)-1):
>       r1 = records[i]
>       r2 = records[i+1]
>       if r1.xx == r2.xx:
>           .
>           .
> and my question is whether there's a way to work with two adjacent rows
> without using subscripts?  
>
> Even better, to get hold of all the records with the same Description as
> the current row, compare them all, mark all but the different ones for
> deletion, and then resume processing the records after the last one?

If I understand correctly you want something like:

records.sort(key = lambda rec: rec.xx)

AKA 

from operator import attrgetter
records.sort(key = attrgetter('xx'))

or maybe:

records.sort(key = lambda rec: (rec.xx,) + tuple(rec))
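
Once the list is sorted, itertools.groupby can hand over all rows that share a Description in one go. This is only a sketch: the Record fields and data are hypothetical, and "duplicate" is assumed to mean equal on every field.

```python
from collections import namedtuple
from itertools import groupby
from operator import attrgetter

Record = namedtuple("Record", "Description Date")  # hypothetical fields
records = [
    Record("flat B", "1/2/17"),
    Record("flat A", "1/1/17"),
    Record("flat A", "1/1/17"),
    Record("flat A", "1/3/17"),
]

get_description = attrgetter("Description")
records.sort(key=get_description)  # groupby only merges adjacent rows

deduped = []
for description, group in groupby(records, key=get_description):
    seen = []
    for rec in group:
        if rec not in seen:  # keep one copy of each distinct row
            seen.append(rec)
    deduped.extend(seen)
```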

-- 
Antoon Pardon


Re: Using namedtuples field names for column indices in a list of lists

2017-01-09 Thread Antoon Pardon

Op 09-01-17 om 07:58 schreef Deborah Swanson:
> Peter Otten wrote, on January 08, 2017 5:21 AM
>> Deborah Swanson wrote:
>>
>>> Peter Otten wrote, on January 08, 2017 3:01 AM
>>  
>> Personally I would recommend against mixing data (an actual location) and
>> metadata (the column name, "Location"), but if you wish my code can be
>> adapted as follows:
>>
>> infile = open("dictreader_demo.csv")
>> rows = csv.reader(infile)
>> fieldnames = next(rows)
>> Record = namedtuple("Record", fieldnames)
>> records = [Record._make(fieldnames)]
>> records.extend(Record._make(row) for row in rows)
> Works like a charm. I stumbled a bit changing all my subscripted
> variables to namedtuples and rewriting the inevitable places my code
> that didn't work the same. But actually it was fun, especially deleting
> all the sections and variables I no longer needed. And it executes
> correctly now too - with recognizable fieldnames instead of my quirky
> 2-letter code subscripts.  All in all a huge win!
>
> I do have two more questions.
>
> 1) I have a section that loops through the sorted data, compares two
> adjacent rows at a time, and marks one of them for deletion if the rows
> are identical.
>
> I'm using 
>
>   for i in range(len(records)-1):
>       r1 = records[i]
>       r2 = records[i+1]
>       if r1.xx == r2.xx:
>           .
>           .
> and my question is whether there's a way to work with two adjacent rows
> without using subscripts?  

for r1, r2 in zip(records[i], records[i+1]):
    if r1.xx == r2.xx:
        .
        .
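
Note that zip(records[i], records[i+1]) would pair up the fields of two rows rather than the rows themselves, and it still needs i. A sketch of what was probably intended, assuming records is a list, zips the list with a copy of itself shifted by one. The field name xx is a placeholder, as in the question.

```python
from collections import namedtuple

Record = namedtuple("Record", "xx Date")  # "xx" stands in for a real field name
records = [
    Record("a", "1/1/17"),
    Record("a", "1/2/17"),
    Record("b", "1/3/17"),
]

# zip pairs row 0 with row 1, row 1 with row 2, and so on.
matches = [
    (r1, r2)
    for r1, r2 in zip(records, records[1:])
    if r1.xx == r2.xx
]
```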



RE: Using namedtuples field names for column indices in a list of lists

2017-01-08 Thread Deborah Swanson

Steven D'Aprano wrote, on January 08, 2017 7:30 PM
> 
> On Sunday 08 January 2017 20:53, Deborah Swanson wrote:
> 
> > Steven D'Aprano wrote, on January 07, 2017 10:43 PM
> 
> No, I'm pretty sure that's not the case. I don't have access 
> to your CSV file, 
> but I can simulate it:
> 
> ls = [['Location', 'Date', 'Price'],
>   ['here', '1/1/17', '1234'],
>   ['there', '1/1/17', '5678'],
>   ['everywhere', '1/1/17', '9821']
>   ]
> 
> from collections import namedtuple
> lst = namedtuple('lst', ls[0])
> 
> print(type(lst))
> print(lst)
> 
> 
> 
> If you run that code, you should see:
> 
> <class 'type'>
> <class '__main__.lst'>
> 
> 
> which contradicts your statement:
> 
> 'lst' is a namedtuple instance with each of the column 
> titles as field names.

Yes, yes. In a careless moment I called a class an instance.

> and explains why you had to access the individual property 
> method `fget`: you 
> were accessing the *class object* rather than an actual named 
> tuple instance.

That code is deleted and long gone now, so I can't look at it in the
debugger, but yes, I'm pretty sure 'fget' is a class member.

> The code you gave was:
> 
> lst.Location.fget(l)
> 
> where l was not given, but I can guess it was a row of the 
> CSV file, i.e. an 
> individual record. So:
> 
> - lst was the named tuple class, a subclass of tuple
> 
> - lst.Location returns a property object
> 
> - lst.Location.fget is the internal fget method of the 
> property object.

And your point is?  Perhaps I didn't express myself in a way that you
could recognize, but I understood all of that before I wrote to you, and
attempted to convey that understanding to you. Obviously I failed, if
you now think I need a lesson in what's going on here.

> I think Peter Otten has the right idea: create a list of 
> records with something 
> like this:
> 
> 
> Record = namedtuple('Record', ls[0])
> data = [Record(*row) for row in ls[1:]]
> 
> 
> or if you prefer Peter's version:
> 
> data = [Record._make(row) for row in ls[1:]]
> 
> 
> Half the battle is coming up with the right data structures :-)

Can't and wouldn't disagree with any part of that!

> -- 
> Steven
> "Ever since I learned about confirmation bias, I've been seeing 
> it everywhere." - Jon Ronson


RE: Using namedtuples field names for column indices in a list of lists

2017-01-08 Thread Deborah Swanson

Peter Otten wrote, on January 08, 2017 5:21 AM
> 
> Deborah Swanson wrote:
> 
> > Peter Otten wrote, on January 08, 2017 3:01 AM
>  
> Personally I would recommend against mixing data (an actual location) and
> metadata (the column name, "Location"), but if you wish my code can be
> adapted as follows:
> 
> infile = open("dictreader_demo.csv")
> rows = csv.reader(infile)
> fieldnames = next(rows)
> Record = namedtuple("Record", fieldnames)
> records = [Record._make(fieldnames)]
> records.extend(Record._make(row) for row in rows)

Works like a charm. I stumbled a bit changing all my subscripted
variables to namedtuples and rewriting the inevitable places my code
that didn't work the same. But actually it was fun, especially deleting
all the sections and variables I no longer needed. And it executes
correctly now too - with recognizable fieldnames instead of my quirky
2-letter code subscripts.  All in all a huge win!

I do have two more questions.

1) I have a section that loops through the sorted data, compares two
adjacent rows at a time, and marks one of them for deletion if the rows
are identical.

I'm using 

for i in range(len(records)-1):
    r1 = records[i]
    r2 = records[i+1]
    if r1.xx == r2.xx:
        .
        .
and my question is whether there's a way to work with two adjacent rows
without using subscripts?  

Even better, to get hold of all the records with the same Description as
the current row, compare them all, mark all but the different ones for
deletion, and then resume processing the records after the last one?

2) I'm using mergesort. (I didn't see any way to sort a namedtuple in
the docs.) In the list version of my code I copied and inserted the 2
columns I wanted to sort by into the beginning of the list, and then
deleted them after the list was sorted. But just looking at records, I'm
not so sure that can easily be done. I remember your code to work with
columns of the data:

columnA = [record.A for record in records]

and I can see how that would get me columnA and columnB, but then is
there any better way to insert and delete columns in an existing
namedtuple than slicing? And I don't think you can insert or delete a
whole column while slicing.

Or maybe my entire approach is not the best. I know it's possible to do
keyed sorts, but I haven't actually written or used any. So I just
pulled a mergesort off the shelf and got what I wanted by inserting
copies of those 2 columns at the front, and then deleting them when the
sort was complete. Not exactly elegant, but it works.
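
A keyed sort can stand in for the copy-columns-then-delete approach: the key function computes the sort columns on the fly and the rows themselves are left alone. The sketch below assumes the two columns are named Description and Date, and that dates look like month/day/two-digit-year; both are guesses, not something the original code confirms.

```python
from collections import namedtuple
from datetime import datetime

Record = namedtuple("Record", "Description Date")  # hypothetical fields
records = [
    Record("flat A", "10/1/17"),
    Record("flat B", "1/1/17"),
    Record("flat A", "2/1/17"),
]

def sort_key(rec):
    # Assumed date format: month/day/two-digit year.
    return rec.Description, datetime.strptime(rec.Date, "%m/%d/%y")

# Sorts the list in place by Description, then by the parsed date.
records.sort(key=sort_key)
```

Parsing the date matters here because as plain strings "10/1/17" would sort before "2/1/17".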

Any suggestions would be most welcome. 


RE: Using namedtuples field names for column indices in a list of lists

2017-01-08 Thread Deborah Swanson

Steven D'Aprano wrote, on January 07, 2017 10:43 PM
> 
> On Sunday 08 January 2017 16:39, Deborah Swanson wrote:
> 
> The recommended way is with the _replace method:
> 
> py> instance._replace(A=999)
> Record(A=999, B=20, C=30)
> py> instance._replace(A=999, C=888)
> Record(A=999, B=20, C=888)
> 
> -- 
> Steven
> "Ever since I learned about confirmation bias, I've been seeing 
> it everywhere." - Jon Ronson

instance._replace(A=999) works perfectly, and editing my existing
assignment statements was really easy.  Thanks - a lot.
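
One detail worth keeping in mind, sketched here with the same A, B, C fields: _replace does not change the tuple it is called on. It returns a modified copy, so the copy has to be stored back if the record lives in a list.

```python
from collections import namedtuple

Record = namedtuple("Record", "A B C")
records = [Record(10, 20, 30), Record(1, 2, 3)]

original = records[0]
# _replace leaves the original untouched and returns a modified copy ...
changed = original._replace(A=999)
# ... so the copy is stored back to update the list.
records[0] = changed
```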


RE: Using namedtuples field names for column indices in a list of lists

2017-01-08 Thread Steven D'Aprano

On Sunday 08 January 2017 20:53, Deborah Swanson wrote:

> Steven D'Aprano wrote, on January 07, 2017 10:43 PM
>> 
>> On Sunday 08 January 2017 16:39, Deborah Swanson wrote:
>> 
>> > What I've done so far:
>> > 
>> > with open('E:\\Coding projects\\Pycharm\\Moving\\Moving 2017 in.csv',
>> >           'r') as infile:
>> >     ls = list(csv.reader(infile))
>> > lst = namedtuple('lst', ls[0])
>> > 
>> > where 'ls[0]' is the header row of the csv, and it works perfectly
>> > well. 'lst' is a namedtuple instance with each of the column titles as
>> > field names.
>> 
>> Are you sure? namedtuple() returns a class, not a list:
> 
> Yes. 'ls' is defined as 'list(csv.reader(infile))', so ls[0] is the
> first row from the csv, the header row. 'lst' is the namedtuple.
> 
> Perhaps what's puzzling you is that the way I've written it, the list of
> data and the namedtuple are disjoint, and that's the problem.

No, I'm pretty sure that's not the case. I don't have access to your CSV file, 
but I can simulate it:

ls = [['Location', 'Date', 'Price'],
  ['here', '1/1/17', '1234'],
  ['there', '1/1/17', '5678'],
  ['everywhere', '1/1/17', '9821']
  ]

from collections import namedtuple
lst = namedtuple('lst', ls[0])

print(type(lst))
print(lst)



If you run that code, you should see:

<class 'type'>
<class '__main__.lst'>

which contradicts your statement:

'lst' is a namedtuple instance with each of the column 
titles as field names.


and explains why you had to access the individual property method `fget`: you 
were accessing the *class object* rather than an actual named tuple instance.

The code you gave was:

lst.Location.fget(l)

where l was not given, but I can guess it was a row of the CSV file, i.e. an 
individual record. So:

- lst was the named tuple class, a subclass of tuple

- lst.Location returns a property object

- lst.Location.fget is the internal fget method of the property object.


I think Peter Otten has the right idea: create a list of records with something 
like this:


Record = namedtuple('Record', ls[0])
data = [Record(*row) for row in ls[1:]]


or if you prefer Peter's version:

data = [Record._make(row) for row in ls[1:]]
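
Put together with the simulated data, a runnable sketch of the class/instance distinction might look like this. It only uses the sample rows from above.

```python
from collections import namedtuple

ls = [['Location', 'Date', 'Price'],
      ['here', '1/1/17', '1234'],
      ['there', '1/1/17', '5678']]

Record = namedtuple('Record', ls[0])          # a class, a subclass of tuple
data = [Record._make(row) for row in ls[1:]]  # a list of instances

is_class = isinstance(Record, type) and issubclass(Record, tuple)
first_location = data[0].Location  # plain attribute access on an instance
```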


Half the battle is coming up with the right data structures :-)


-- 
Steven
"Ever since I learned about confirmation bias, I've been seeing 
it everywhere." - Jon Ronson


RE: Using namedtuples field names for column indices in a list of lists

2017-01-08 Thread Deborah Swanson

Paul Rudin wrote, on January 08, 2017 6:49 AM
> 
> "Deborah Swanson"  writes:
> 
> > Peter Otten wrote, on January 08, 2017 3:01 AM
> >> 
> >> columnA = [record.A for record in records]
> >
> > This is very neat. Something like a list comprehension for named 
> > tuples?
> 
> Not something like - this *is* a list comprehension - it 
> creates a list of the A values, one from each named tuple.
> 
> The thing you iterate over within the comprehension can be 
> any iterator. (Of course you're going to run into problems if 
> you try to construct a list from an infinite iterator.)

Thanks Paul. I've been meaning to spend some time getting to thoroughly
know list comprehensions for awhile now, but I keep running into so many
new things I just haven't gotten to it. I thought it looked like one,
but I hedged my wording because I wasn't sure.  Infinite iterators
definitely sound like something to remember!


Re: Using namedtuples field names for column indices in a list of lists

2017-01-08 Thread Paul Rudin

"Deborah Swanson"  writes:

> Peter Otten wrote, on January 08, 2017 3:01 AM
>> 
>> columnA = [record.A for record in records]
>
> This is very neat. Something like a list comprehension for named tuples?

Not something like - this *is* a list comprehension - it creates a list
of the A values, one from each named tuple.

The thing you iterate over within the comprehension can be any
iterator. (Of course you're going to run into problems if you try to
construct a list from an infinite iterator.)
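
A small illustration of the infinite-iterator caveat, using itertools.count as the endless source and itertools.islice to cut it off. The choice of squares is arbitrary.

```python
from itertools import count, islice

# count() never ends, so [n * n for n in count()] would never finish.
# islice() stops the iteration after five items.
squares = [n * n for n in islice(count(), 5)]
```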

RE: Using namedtuples field names for column indices in a list of lists

2017-01-08 Thread Deborah Swanson

Peter Otten wrote, on January 08, 2017 5:21 AM
> 
> Deborah Swanson wrote:
> 
> > Peter Otten wrote, on January 08, 2017 3:01 AM
> >> 
> >> Deborah Swanson wrote:
> >> 
> >> > to do that is with .fget(). Believe me, I tried every possible way to
> >> > use instance.A or instance[1] and no way could I get ls[instance.A].
> >> Sorry, no.
> > 
> > I quite agree, I was describing the dead end I was in from 
> peeling the 
> > list of data and the namedtuple from the header row off the csv 
> > separately. That was quite obviously the wrong path to take, but I 
> > didn't know what a good way would be.
> > 
> >> To get a list of namedtuple instances use:
> >> 
> >> rows = csv.reader(infile)
> >> Record = namedtuple("Record", next(rows))
> >> records = [Record._make(row) for row in rows]
> > 
> > This is slightly different from Steven's suggestion, and it makes a 
> > block of records that I think would be iterable. At any 
> rate all the 
> > data from the csv would belong to a single data structure, and that 
> > seems inherently a good thing.
> > 
> > a = records[i].A , for example
> > 
> > And I think that this would produce recognizable field names in my 
> > code (which was the original goal) if the following works:
> > 
> > records[0] is the header row == ('Description', 'Location', etc.)
> 
> Personally I would recommend against mixing data (an actual 
> location) and 
> metadata (the column name,"Location"), but if you wish my code can be 
> adapted as follows:
> 
> infile = open("dictreader_demo.csv")
> rows = csv.reader(infile)
> fieldnames = next(rows)
> Record = namedtuple("Record", fieldnames)
> records = [Record._make(fieldnames)]
> records.extend(Record._make(row) for row in rows)

Peter, this looks really good, and yes, I didn't feel so good about 
records[i].Location either, but it was the only way I could see to get
the recognizable variable names I want. By extending records from a
namedtuple of field names, I think it can be done cleanly. I'll try it
and see.

> If you want a lot of flexibility without doing the legwork 
> yourself you 
> might also have a look at pandas. Example session:
> 
> $ cat places.csv
> Location,Description,Size
> here,something,17
> there,something else,10
> $ python3
> Python 3.4.3 (default, Nov 17 2016, 01:08:31) 
> [GCC 4.8.4] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import pandas
> >>> places = pandas.read_csv("places.csv")
> >>> places
>   Location     Description  Size
> 0     here       something    17
> 1    there  something else    10
> 
> [2 rows x 3 columns]
> >>> places.Location
> 0     here
> 1    there
> Name: Location, dtype: object
> >>> places.sort(columns="Size")
>   Location     Description  Size
> 1    there  something else    10
> 0     here       something    17
> 
> [2 rows x 3 columns]
> >>> places.Size.mean()
> 13.5
> 
> Be aware that there is a learning curve...

Yes, and I'm sure the learning curve is steep. I watched a webinar on
pandas about a year ago, not to actually learn it, but just to take in
the big picture and see something people were really accomplishing with
python.  I won't take this on any time right away, but I'll definitely
keep it and work with it sometime. Maybe as just an intro to pandas,
using my data from the real estate project.


> > If I can use records[i].Location for the Location column 
> data in row 
> > 'i', then I've got my recognizable-field-name variables.
> > 
> >> If you want a column from a list of records you need to extract it 
> >> manually:
> >> 
> >> columnA = [record.A for record in records]
> > 
> > This is very neat. Something like a list comprehension for named 
> > tuples?
> > 
> > Thanks Peter, I'll try it all tomorrow and see how it goes.
> > 
> > PS. I haven't forgotten your defaultdict suggestion, I'm 
> just taking 
> > the suggestions I got in the "Cleaning up Conditionals" 
> thread one at 
> > a time, and I will get to defaultdict. Then I'll look at 
> all of them 
> > and see what final version of the code will work best with all the 
> > factors to consider.


RE: Using namedtuples field names for column indices in a list of lists

2017-01-08 Thread Peter Otten

Deborah Swanson wrote:

> Peter Otten wrote, on January 08, 2017 3:01 AM
>> 
>> Deborah Swanson wrote:
>> 
>> > to do that is with .fget(). Believe me, I tried every possible way to
>> > use instance.A or instance[1] and no way could I get ls[instance.A].
>> 
>> Sorry, no.
> 
> I quite agree, I was describing the dead end I was in from peeling the
> list of data and the namedtuple from the header row off the csv
> separately. That was quite obviously the wrong path to take, but I
> didn't know what a good way would be.
> 
>> To get a list of namedtuple instances use:
>> 
>> rows = csv.reader(infile)
>> Record = namedtuple("Record", next(rows))
>> records = [Record._make(row) for row in rows]
> 
> This is slightly different from Steven's suggestion, and it makes a
> block of records that I think would be iterable. At any rate all the
> data from the csv would belong to a single data structure, and that
> seems inherently a good thing.
> 
> a = records[i].A , for example
> 
> And I think that this would produce recognizable field names in my code
> (which was the original goal) if the following works:
> 
> records[0] is the header row == ('Description', 'Location', etc.)

Personally I would recommend against mixing data (an actual location) and 
metadata (the column name,"Location"), but if you wish my code can be 
adapted as follows:

infile = open("dictreader_demo.csv")
rows = csv.reader(infile)
fieldnames = next(rows)
Record = namedtuple("Record", fieldnames)
records = [Record._make(fieldnames)]
records.extend(Record._make(row) for row in rows)

If you want a lot of flexibility without doing the legwork yourself you 
might also have a look at pandas. Example session:

$ cat places.csv
Location,Description,Size
here,something,17
there,something else,10
$ python3
Python 3.4.3 (default, Nov 17 2016, 01:08:31) 
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> places = pandas.read_csv("places.csv")
>>> places
  Location     Description  Size
0     here       something    17
1    there  something else    10

[2 rows x 3 columns]
>>> places.Location
0     here
1    there
Name: Location, dtype: object
>>> places.sort(columns="Size")
  Location     Description  Size
1    there  something else    10
0     here       something    17

[2 rows x 3 columns]
>>> places.Size.mean()
13.5

Be aware that there is a learning curve...
 
> If I can use records[i].Location for the Location column data in row
> 'i', then I've got my recognizable-field-name variables.
> 
>> If you want a column from a list of records you need to
>> extract it manually:
>> 
>> columnA = [record.A for record in records]
> 
> This is very neat. Something like a list comprehension for named tuples?
> 
> Thanks Peter, I'll try it all tomorrow and see how it goes.
> 
> PS. I haven't forgotten your defaultdict suggestion, I'm just taking the
> suggestions I got in the "Cleaning up Conditionals" thread one at a
> time, and I will get to defaultdict. Then I'll look at all of them and
> see what final version of the code will work best with all the factors
> to consider.



RE: Using namedtuples field names for column indices in a list of lists

2017-01-08 Thread Deborah Swanson

Peter Otten wrote, on January 08, 2017 3:01 AM
> 
> Deborah Swanson wrote:
> 
> > to do that is with .fget(). Believe me, I tried every possible way to
> > use instance.A or instance[1] and no way could I get ls[instance.A].
> 
> Sorry, no.

I quite agree, I was describing the dead end I was in from peeling the
list of data and the namedtuple from the header row off the csv
separately. That was quite obviously the wrong path to take, but I
didn't know what a good way would be.

> To get a list of namedtuple instances use:
> 
> rows = csv.reader(infile)
> Record = namedtuple("Record", next(rows))
> records = [Record._make(row) for row in rows]

This is slightly different from Steven's suggestion, and it makes a
block of records that I think would be iterable. At any rate all the
data from the csv would belong to a single data structure, and that
seems inherently a good thing.

a = records[i].A , for example

And I think that this would produce recognizable field names in my code
(which was the original goal) if the following works:

records[0] is the header row == ('Description', 'Location', etc.)

If I can use records[i].Location for the Location column data in row
'i', then I've got my recognizable-field-name variables.  

> If you want a column from a list of records you need to 
> extract it manually:
> 
> columnA = [record.A for record in records]

This is very neat. Something like a list comprehension for named tuples?

Thanks Peter, I'll try it all tomorrow and see how it goes.

PS. I haven't forgotten your defaultdict suggestion, I'm just taking the
suggestions I got in the "Cleaning up Conditionals" thread one at a
time, and I will get to defaultdict. Then I'll look at all of them and
see what final version of the code will work best with all the factors
to consider. 


RE: Using namedtuples field names for column indices in a list of lists

2017-01-08 Thread Peter Otten

Deborah Swanson wrote:

> to do that is with .fget(). Believe me, I tried every possible way to
> use instance.A or instance[1] and no way could I get ls[instance.A].

Sorry, no.

To get a list of namedtuple instances use:

rows = csv.reader(infile)
Record = namedtuple("Record", next(rows))
records = [Record._make(row) for row in rows]

If you want a column from a list of records you need to extract it manually:

columnA = [record.A for record in records]


-- 
https://mail.python.org/mailman/listinfo/python-list

RE: Using namedtuples field names for column indices in a list of lists

2017-01-08 Thread Deborah Swanson

Steven D'Aprano wrote, on January 07, 2017 10:43 PM
> 
> On Sunday 08 January 2017 16:39, Deborah Swanson wrote:
> 
> > What I've done so far:
> > 
> > with open('E:\\Coding projects\\Pycharm\\Moving\\Moving 2017 in.csv',
> >           'r') as infile:
> >     ls = list(csv.reader(infile))
> >     lst = namedtuple('lst', ls[0])
> > 
> > where 'ls[0]' is the header row of the csv, and it works perfectly
> > well. 'lst' is a namedtuple instance with each of the column titles as
> > field names.
> 
> Are you sure? namedtuple() returns a class, not a list:

Yes. 'ls' is defined as 'list(csv.reader(infile))', so ls[0] is the
first row from the csv, the header row. 'lst' is the namedtuple.

Perhaps what's puzzling you is that the way I've written it, the list of
data and the namedtuple are disjoint, and that's the problem.

> py> from collections import namedtuple
> py> names = ['A', 'B', 'C']
> py> namedtuple('lst', names)
> <class '__main__.lst'>
> 
> The way namedtuple() is intended to be used is like this:
> 
> 
> py> from collections import namedtuple
> py> names = ['A', 'B', 'C']
> py> Record = namedtuple('Record', names)
> py> instance = Record(10, 20, 30)
> py> print(instance)
> Record(A=10, B=20, C=30)
> 
> 
> There is no need to call fget directly to access the individual fields:
> 
> py> instance.A
> 10
> py> instance.B
> 20
> py> instance[1]  # indexing works too
> 20
> 
> 
> which is *much* simpler than:
> 
> py> Record.A.fget(instance)
> 10

I don't disagree with anything you've said and shown here. But I want to
use the 'instance.A' as a subscript for the list 'ls', and the only way
to do that is with .fget(). Believe me, I tried every possible way to
use instance.A or instance[1] and no way could I get ls[instance.A]. 

The problem I'm having here is one of linkage between the named tuple
for the column titles and the list that holds the data in the columns.
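
If the data has to stay in a plain list of lists, one possible linkage is to turn a field name into a column position with _fields.index(). This is only a sketch under that assumption, with made-up rows; `loc` is a name introduced here for illustration.

```python
from collections import namedtuple

# Hypothetical list of lists, header row first, as list(csv.reader(...)) gives it.
ls = [
    ["Description", "Location"],
    ["Sofa", "Garage"],
    ["Lamp", "Attic"],
]

lst = namedtuple("lst", ls[0])      # a class whose fields are the column titles

# _fields is a tuple of the field names, so .index() gives the column number.
loc = lst._fields.index("Location")

print(ls[1][loc])                   # Garage
ls[2][loc] = "Basement"             # the inner lists are mutable, so assignment works
```

The namedtuple class is used here only as a holder of column names; no instance is created, which is why neither instance.A nor fget is needed.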

> I think you should be doing something like this:
> 
> pathname = 'E:\\Coding projects\\Pycharm\\Moving\\Moving 2017 in.csv'
> with open(pathname, 'r') as infile:
>     rows = list(csv.reader(infile))
>     Record = namedtuple("Record", rows[0])
>     for row in rows[1:]:  # skip the first row, the header
>         row = Record(row)
>         # process this row...
>         if row.location == 0:
>             ...

Now here you have something I didn't think of: 'row = Record(row)' in a
loop through the rows. 
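
One caveat, as far as I can tell: Record(row) passes the whole list as a single positional argument, so with more than one field it raises TypeError. Unpacking with Record(*row), or calling Record._make(row), builds the instance from the list instead. A sketch with made-up rows (the flag `raised` is only there to show the failure):

```python
from collections import namedtuple

rows = [["Description", "Location"], ["Sofa", "Garage"]]  # illustrative data
Record = namedtuple("Record", rows[0])

raised = False
try:
    Record(rows[1])              # one argument supplied for two fields
except TypeError:
    raised = True

for row in rows[1:]:             # skip the first row, the header
    row = Record._make(row)      # equivalent to Record(*row)
    print(row.Location)          # Garage
```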

> [...]
> > But I haven't found a way to assign new values to a list element
> > using namedtuple.fieldname. I think a basic problem is that
> > namedtuples have the properties of tuples, and you can't assign to an
> > existing tuple because they're immutable.
> 
> Indeed. Being tuples, you have to create a new one. You can do it with
> slicing, like ordinary tuples, but that's rather clunky:
> 
> py> print(instance)
> Record(A=10, B=20, C=30)
> py> Record(999, *instance[1:])
> Record(A=999, B=20, C=30)

Very clunky. I don't like modifying standard tuples with slicing, and
this is even worse.

> The recommended way is with the _replace method:
> 
> py> instance._replace(A=999)
> Record(A=999, B=20, C=30)
> py> instance._replace(A=999, C=888)
> Record(A=999, B=20, C=888)
> 
> 
> Note that despite the leading underscore, _replace is *not* a private
> method of the class. It is intentionally documented as public. The
> leading underscore is so that it won't clash with any field names.
> 
> 
> 
> 
> -- 
> Steven
> "Ever since I learned about confirmation bias, I've been seeing 
> it everywhere." - Jon Ronson

I will have to work with this. It's entirely possible it will do what I
want it to do. The key problem I was having was getting a linkage
between the namedtuple and the list of data from the csv.

I want to implement a suggestion I got to use a namedtuple made from the
header row as subscripts for elements in the list of data, and the
example given in the docs: 

EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade')

import csv
for emp in map(EmployeeRecord._make, csv.reader(open("employees.csv", "rb"))):
    print(emp.name, emp.title)
assumes the field names will be hardcoded. Reading the csv into a list
and then trying to use the namedtuple made from the header row as
subscripts is how I ended up resorting to 'Record.A.fget(instance)' to
read values, and wasn't able to assign them. 

But assigning the rows of data into namedtuple instances with: 

Record = namedtuple("Record", rows[0])
for row in rows[1:]:
    row = Record(row)

does look like the linkage I need and wasn't finding the way I was doing
it. If 'Record(row)' is the list data and the columns are the same as
defined in 'namedtuple("Record", rows[0])', it really should work. And I
didn't get it that _replace could be used to assign new values to
namedtuples (duh. Pretty clear now that I reread it, and all the row
data is in namedtuple instances.) 
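
Since _replace returns a new tuple rather than changing the old one, the result has to be stored back into the list for the change to stick. A sketch of that, with made-up data and a made-up rule (fill in an empty Location):

```python
from collections import namedtuple

Record = namedtuple("Record", ["Description", "Location"])
records = [Record("Sofa", "Garage"), Record("Lamp", "")]  # illustrative data

# Rebind the list element to the new tuple that _replace returns.
for i, record in enumerate(records):
    if not record.Location:
        records[i] = record._replace(Location="Attic")

print(records[1])  # Record(Description='Lamp', Location='Attic')
```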

The big question is whether the namedtuple instances can be used as
something recognizable as

Re: Using namedtuples field names for column indices in a list of lists

2017-01-07 Thread Steven D'Aprano

On Sunday 08 January 2017 16:39, Deborah Swanson wrote:

> What I've done so far:
> 
> with open('E:\\Coding projects\\Pycharm\\Moving\\Moving 2017 in.csv',
>           'r') as infile:
>     ls = list(csv.reader(infile))
>     lst = namedtuple('lst', ls[0])
> 
> where 'ls[0]' is the header row of the csv, and it works perfectly well.
> 'lst' is a namedtuple instance with each of the column titles as field
> names.

Are you sure? namedtuple() returns a class, not a list:

py> from collections import namedtuple
py> names = ['A', 'B', 'C']
py> namedtuple('lst', names)
<class '__main__.lst'>

The way namedtuple() is intended to be used is like this:


py> from collections import namedtuple
py> names = ['A', 'B', 'C']
py> Record = namedtuple('Record', names)
py> instance = Record(10, 20, 30)
py> print(instance)
Record(A=10, B=20, C=30)


There is no need to call fget directly to access the individual fields:

py> instance.A
10
py> instance.B
20
py> instance[1]  # indexing works too
20


which is *much* simpler than:

py> Record.A.fget(instance)
10



I think you should be doing something like this:

pathname = 'E:\\Coding projects\\Pycharm\\Moving\\Moving 2017 in.csv'
with open(pathname, 'r') as infile:
    rows = list(csv.reader(infile))
    Record = namedtuple("Record", rows[0])
    for row in rows[1:]:  # skip the first row, the header
        row = Record(row)
        # process this row...
        if row.location == 0:
            ...

[...]
> But I haven't found a way to assign new values to a list element using
> namedtuple.fieldname. I think a basic problem is that namedtuples have
> the properties of tuples, and you can't assign to an existing tuple
> because they're immutable.

Indeed. Being tuples, you have to create a new one. You can do it with slicing, 
like ordinary tuples, but that's rather clunky:

py> print(instance)
Record(A=10, B=20, C=30)
py> Record(999, *instance[1:])
Record(A=999, B=20, C=30)


The recommended way is with the _replace method:

py> instance._replace(A=999)
Record(A=999, B=20, C=30)
py> instance._replace(A=999, C=888)
Record(A=999, B=20, C=888)


Note that despite the leading underscore, _replace is *not* a private method of 
the class. It is intentionally documented as public. The leading underscore is 
so that it won't clash with any field names.




-- 
Steven
"Ever since I learned about confirmation bias, I've been seeing 
it everywhere." - Jon Ronson

-- 
https://mail.python.org/mailman/listinfo/python-list
