RE: Namedtuples problem

2017-02-23 Thread Deborah Swanson
Gregory Ewing wrote, on February 23, 2017 9:07 PM
> 
> Deborah Swanson wrote:
> > I've run into this kind of problem with namedtuples before, 
> trying to 
> > access field values with variable names, like:
> > 
> > label = 'Location'
> > records.label
> 
> If you need to access an attribute whose name is computed
> at run time, you want getattr() and setattr():
> 
> value = getattr(record, label)
> 
> setattr(record, label, value)
> 
> Using these, I suspect you'll be able to eliminate all
> the stuff you've got that messes around with indexes to
> access field values, making the code a lot simpler and
> easier to follow.
> 
> -- 
> Greg

Steve D'Aprano mentioned using getattr(record, label) and I thought that
might be the way to untangle a lot of the mess I made. Been futzing
around with different ways to use it, but I'll look at setattr(record,
label, value) first thing in the morning. 

Thanks bunches!

Deborah

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Namedtuples problem

2017-02-23 Thread Gregory Ewing

Deborah Swanson wrote:

I've run into this kind of problem with namedtuples before, trying to
access field values with variable names, like:

label = 'Location'
records.label


If you need to access an attribute whose name is computed
at run time, you want getattr() and setattr():

   value = getattr(record, label)

   setattr(record, label, value)

Using these, I suspect you'll be able to eliminate all
the stuff you've got that messes around with indexes to
access field values, making the code a lot simpler and
easier to follow.
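
A minimal sketch of the getattr() approach, assuming a hypothetical
two-field Record namedtuple (not the actual spreadsheet fields). One
caveat: namedtuple instances are immutable, so setattr() on a field
raises AttributeError. _replace(), which returns a new tuple, is the
usual substitute for it.

```python
from collections import namedtuple

# Hypothetical record type and values, for illustration only.
Record = namedtuple("Record", ["title", "Location"])
record = Record(title="foo", Location="Longview")

label = "Location"  # field name known only at run time

# Read the field whose name is stored in a variable.
value = getattr(record, label)
print(value)

# namedtuples are immutable, so setattr() is rejected.
try:
    setattr(record, label, "Kelso")
    setattr_worked = True
except AttributeError:
    setattr_worked = False

# _replace() builds a new record with the named field changed.
updated = record._replace(**{label: "Kelso"})
print(updated)
```

If the records really do need in-place updates, rebinding the list slot
(records[i] = records[i]._replace(...)) keeps the rest of the code
unchanged.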

--
Greg


RE: Namedtuples problem

2017-02-23 Thread Deborah Swanson
Erik wrote, on February 23, 2017 4:03 PM
> 
> On 23/02/17 23:00, Deborah Swanson wrote:
> >> It looks to me like you are indexing into a single-element 
> list that 
> >> you are creating using the literal list syntax in the middle of
> >> the expression.
> >
> > Actually, group is essentially a 2-element list. Each group 
> has a list 
> > of rows, and each row has a set of fields. group has to be 
> indexed by 
> > row index and field index. (This is a namedtuple configuration.)
> >
> > The weirdness is that
> >
> > group[0][4]
> >
> > gets the right answer, but
> >
> > group[[idx][records_idx[label]]],
> > where idx = 0 and records_idx[label]] = 4
> >
> > gets the IndexError.
> 
> So remove the outermost square brackets then so the two expressions are
> the same (what I - and also Steven - mentioned is correct: you are
> creating a single-element list and indexing it (and then using the 
> result of that, should it work, to index 'group')).

I see that clearly now, I just couldn't understand it from what you and
Steven were saying. 

> The same thing as "group[0][4]" in your example is:
> 
> group[idx][records_idx[label]]
> 
> (assuming you have all those things correct - I haven't studied your 
> code in a lot of detail).

You have it exactly correct, so no code study required. ;)

> Your new example expressed using the original construct you posted is:
> 
> group[[0][4]]
> 
> ... see the extra enclosing square brackets?
> 
> E.

Yes, I see them now, and deleting them makes everything work like it
should.



RE: Namedtuples problem

2017-02-23 Thread Deborah Swanson
Irv Kalb wrote, on February 23, 2017 3:51 PM
> 
> > On Feb 23, 2017, at 3:00 PM, Deborah Swanson wrote:
> > 
> > The weirdness is that
> > 
> > group[0][4]
> > 
> > gets the right answer, but
> > 
> > group[[idx][records_idx[label]]],
> > where idx = 0 and records_idx[label]] = 4
> 
> 
> If that's the case, then I think you need this instead:
> 
> group[idx][records_idx[label]]
> 
> Irv

Yes, that's exactly right and it works in my running code. Three of you
have now spotted the problem (and explained it in terms I can
recognize). Thank you all!



RE: Namedtuples problem

2017-02-23 Thread Deborah Swanson
Ben Bacarisse wrote, on February 23, 2017 4:24 PM
> 
> "Deborah Swanson" <pyt...@deborahswanson.net> writes:
> 
> >> -Original Message-
> >> From: Erik [mailto:pyt...@lucidity.plus.com]
> >> Sent: Thursday, February 23, 2017 2:09 AM
> >> To: pyt...@deborahswanson.net; python-list@python.org
> >> Subject: Re: Namedtuples problem
> >> 
> >> 
> >> Hi,
> >> 
> >> On 23/02/17 09:38, Deborah Swanson wrote:
> >> > group[[idx][records_idx[label]]]
> >> > gets an IndexError: list index out of range
> >> 
> >> [snip]
> >> 
> >> > Can anyone see why I'm getting this Index error? and how 
> to fix it?
> >> 
> >> It looks to me like you are indexing into a single-element
> >> list that you 
> >> are creating using the literal list syntax in the middle of 
> >> the expression.
> >
> > Actually, group is essentially a 2-element list. Each group 
> has a list 
> > of rows, and each row has a set of fields. group has to be 
> indexed by 
> > row index and field index. (This is a namedtuple configuration.)
> >
> > The weirdness is that
> >
> > group[0][4]
> 
> Sure.  No trouble there.
> 
> > gets the right answer, but
> >
> > group[[idx][records_idx[label]]],
> > where idx = 0 and records_idx[label]] = 4
> >
> > gets the IndexError.
> 
> group is not involved in the index error.  Just write
> 
>   [idx][records_idx[label]]
> 
> and you'll get the error.  Wrapping that up in group[...] 
> won't make a difference.  In fact, since idx is 0 and 
> records_idx[label]] is 4, you get the same error (for the 
> same reason) by typing
> 
>   [0][4]
> 
> into a Python REPL.  Does it still seem weird?

No, and you also put your finger on (part of) the real problem. See the
reply I just sent to MRAB.

 
> This is just another shot at explaining the issue because, 
> sometimes, different words can help.  I thought the message 
> you are replying to (as well as others) explained it well, 
> but obviously it did not hit the spot.
> 
> 
> -- 
> Ben

No, the spot wasn't getting hit, but it has been now. Thanks for
replying.



RE: Namedtuples problem

2017-02-23 Thread Deborah Swanson
MRAB wrote, on Thursday, February 23, 2017 4:06 PM
> 
> On 2017-02-23 23:00, Deborah Swanson wrote:
> >
> >
> >> -Original Message-
> >> From: Erik [mailto:pyt...@lucidity.plus.com]
> >> Sent: Thursday, February 23, 2017 2:09 AM
> >> To: pyt...@deborahswanson.net; python-list@python.org
> >> Subject: Re: Namedtuples problem
> >>
> >>
> >> Hi,
> >>
> >> On 23/02/17 09:38, Deborah Swanson wrote:
> >> > group[[idx][records_idx[label]]]
> >> > gets an IndexError: list index out of range
> >>
> >> [snip]
> >>
> >> > Can anyone see why I'm getting this Index error? and how 
> to fix it?
> >>
> >> It looks to me like you are indexing into a single-element 
> list that 
> >> you are creating using the literal list syntax in the middle of
> >> the expression.
> >
> > Actually, group is essentially a 2-element list. Each group 
> has a list 
> > of rows, and each row has a set of fields. group has to be 
> indexed by 
> > row index and field index. (This is a namedtuple configuration.)
> >
> > The weirdness is that
> >
> > group[0][4]
> >
> > gets the right answer, but
> >
> > group[[idx][records_idx[label]]],
> > where idx = 0 and records_idx[label]] = 4
> >
> > gets the IndexError.
> >
> It's not weird. As Erik says below, and I'll repeat here, you have:
> 
>  group[[idx][records_idx[label]]]
> 
> That consists of the expression:
> 
>  [idx][records_idx[label]]
> 
> within:
> 
>  group[...]
> 
> The [idx] creates a one-element list.

Yes, group[idx] gives you a row from group
 
> You then try to subscript it with records_idx[label]. If 
> records_idx[label] is anything other than 0 (or -1), it'll raise 
> IndexError because there's only one element in that list.

No, there are currently 17 elements in a record that records_idx[label]
could be referring to in my running code, depending on which field
'label' refers to. Please see the breakdown of what a typical group
looks like that I gave Steve. If you can't find it, let me know and I'll
repeat it.
 
> If you substitute idx == 0 and records_idx[label]] == 4 into:
> 
>  group[[idx][records_idx[label]]]
> 
> you'll get:
> 
>  group[[0][4]]
> 
> which is not the same thing as group[0][4]!

No it isn't! And that's probably where the mysterious IndexError is
coming from! Sickening how crosseyed I can get working with these things
for too long at a stretch.

And in fact, deleting that outer set of square brackets not only gets
rid of the IndexError, but group[idx][records_idx[label]] now refers to
the correct set of field values I was trying to access.

Thank you, thank you!





Re: Namedtuples problem

2017-02-23 Thread Irv Kalb

> On Feb 23, 2017, at 3:00 PM, Deborah Swanson wrote:
> 
> The weirdness is that 
> 
> group[0][4] 
> 
> gets the right answer, but
> 
> group[[idx][records_idx[label]]], 
> where idx = 0 and records_idx[label]] = 4


If that's the case, then I think you need this instead:

group[idx][records_idx[label]]

Irv




Re: Namedtuples problem

2017-02-23 Thread Ben Bacarisse
"Deborah Swanson" <pyt...@deborahswanson.net> writes:

>> -Original Message-
>> From: Erik [mailto:pyt...@lucidity.plus.com] 
>> Sent: Thursday, February 23, 2017 2:09 AM
>> To: pyt...@deborahswanson.net; python-list@python.org
>> Subject: Re: Namedtuples problem
>> 
>> 
>> Hi,
>> 
>> On 23/02/17 09:38, Deborah Swanson wrote:
>> > group[[idx][records_idx[label]]]
>> > gets an IndexError: list index out of range
>> 
>> [snip]
>> 
>> > Can anyone see why I'm getting this Index error? and how to fix it?
>> 
>> It looks to me like you are indexing into a single-element 
>> list that you 
>> are creating using the literal list syntax in the middle of 
>> the expression.
>
> Actually, group is essentially a 2-element list. Each group has a list
> of rows, and each row has a set of fields. group has to be indexed by
> row index and field index. (This is a namedtuple configuration.)
>
> The weirdness is that 
>
> group[0][4]

Sure.  No trouble there.

> gets the right answer, but
>
> group[[idx][records_idx[label]]], 
> where idx = 0 and records_idx[label]] = 4
>
> gets the IndexError.

group is not involved in the index error.  Just write

  [idx][records_idx[label]]

and you'll get the error.  Wrapping that up in group[...] won't make a
difference.  In fact, since idx is 0 and records_idx[label]] is 4, you
get the same error (for the same reason) by typing

  [0][4]

into a Python REPL.  Does it still seem weird?

This is just another shot at explaining the issue because, sometimes,
different words can help.  I thought the message you are replying to (as
well as others) explained it well, but obviously it did not hit the
spot.


-- 
Ben.


Re: Namedtuples problem

2017-02-23 Thread MRAB

On 2017-02-23 23:00, Deborah Swanson wrote:




>> -Original Message-
>> From: Erik [mailto:pyt...@lucidity.plus.com]
>> Sent: Thursday, February 23, 2017 2:09 AM
>> To: pyt...@deborahswanson.net; python-list@python.org
>> Subject: Re: Namedtuples problem
>>
>>
>> Hi,
>>
>> On 23/02/17 09:38, Deborah Swanson wrote:
>> > group[[idx][records_idx[label]]]
>> > gets an IndexError: list index out of range
>>
>> [snip]
>>
>> > Can anyone see why I'm getting this Index error? and how to fix it?
>>
>> It looks to me like you are indexing into a single-element list that
>> you are creating using the literal list syntax in the middle of
>> the expression.
>
> Actually, group is essentially a 2-element list. Each group has a list
> of rows, and each row has a set of fields. group has to be indexed by
> row index and field index. (This is a namedtuple configuration.)
>
> The weirdness is that
>
> group[0][4]
>
> gets the right answer, but
>
> group[[idx][records_idx[label]]],
> where idx = 0 and records_idx[label]] = 4
>
> gets the IndexError.


It's not weird. As Erik says below, and I'll repeat here, you have:

group[[idx][records_idx[label]]]

That consists of the expression:

[idx][records_idx[label]]

within:

group[...]

The [idx] creates a one-element list.

You then try to subscript it with records_idx[label]. If 
records_idx[label] is anything other than 0 (or -1), it'll raise 
IndexError because there's only one element in that list.
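
To make that concrete, here is a small sketch (the helper name
try_index is made up for this illustration) that subscripts a
one-element list with a few different values:

```python
def try_index(seq, i):
    """Return seq[i], or the string 'IndexError' if i is out of range."""
    try:
        return seq[i]
    except IndexError:
        return "IndexError"

idx = 0
one_element = [idx]  # list display: a brand-new list holding only idx

# Only 0 and -1 are valid subscripts for a list of length one.
results = {i: try_index(one_element, i) for i in (0, -1, 1, 4)}
print(results)
```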


If you substitute idx == 0 and records_idx[label]] == 4 into:

group[[idx][records_idx[label]]]

you'll get:

group[[0][4]]

which is not the same thing as group[0][4]!


>> If we were to break the expression into parts to make it a bit simpler
>> to refer to discuss:
>>
>> ridx = records_idx[label]
>> group[[idx][ridx]]
>>
>> You can now more easily see that 'group' is being indexed by the
>> expression "[idx][ridx]". What does that mean?
>>
>> [idx] is creating a single-element list using literal list syntax. This
>> is then indexed using 'ridx' (using, perhaps confusingly, the exact same
>> syntax to do a different thing).
>>
>> The result of *that* expression is then being used to index 'group', but
>> it won't get that far because you'll get the exception if 'ridx' is
>> anything but zero.
>>
>> So the initial problem at least is the extra [] around 'idx' which is
>> creating a list on the fly for you.





Re: Namedtuples problem

2017-02-23 Thread Erik

On 23/02/17 23:00, Deborah Swanson wrote:

>> It looks to me like you are indexing into a single-element list that
>> you are creating using the literal list syntax in the middle of
>> the expression.
>
> Actually, group is essentially a 2-element list. Each group has a list
> of rows, and each row has a set of fields. group has to be indexed by
> row index and field index. (This is a namedtuple configuration.)
>
> The weirdness is that
>
> group[0][4]
>
> gets the right answer, but
>
> group[[idx][records_idx[label]]],
> where idx = 0 and records_idx[label]] = 4
>
> gets the IndexError.


So remove the outermost square brackets then so the two expressions are 
the same (what I - and also Steven - mentioned is correct: you are 
creating a single-element list and indexing it (and then using the 
result of that, should it work, to index 'group')).


The same thing as "group[0][4]" in your example is:

group[idx][records_idx[label]]

(assuming you have all those things correct - I haven't studied your 
code in a lot of detail).


Your new example expressed using the original construct you posted is:

group[[0][4]]

... see the extra enclosing square brackets?

E.


Re: Namedtuples problem

2017-02-23 Thread MRAB

On 2017-02-23 22:51, Deborah Swanson wrote:

> Peter Otten wrote, on February 23, 2017 2:34 AM

[snip]

>> #untested
>>
>> def split_into_groups(records, key):
>>     groups = defaultdict(list)
>>     for record in records:
>>         # no need to check if a group already exists
>>         # an empty list will automatically be added for every
>>         # missing key
>>         groups[key(record)].append(record)
>>     return groups
>
> I used this approach the first time I tried this for both defaultdict
> and OrderedDict, and for both of them I immediately got a KeyError for
> the first record. groups is empty, so the title for the first record
> wouldn't already be in groups.
>
If the key isn't already in the dictionary, a defaultdict will create
the entry whereas a normal dict will raise KeyError.

> Just to check, I commented out the extra lines that I added to handle
> new keys in my code and immediately got the same KeyError.
>
> My guess is that while standard dictionaries will automatically make a
> new key if it isn't found in the dict, defaultdict and OrderedDict will
> not. So it seems you need to handle new keys yourself. Unless you think
> I'm doing something wrong and dicts from collections should also
> automatically make new keys.


defaultdict will, dict and OrderedDict won't.
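
A small sketch of the difference, using a made-up key and helper name.
It also covers one case that may explain the KeyError reported above
(this is an assumption, based on the "groups = defaultdict()" line in
the code posted at the start of the thread): a defaultdict created
without a factory has no default to insert, so it raises KeyError just
like a plain dict.

```python
from collections import OrderedDict, defaultdict

def append_to_missing_key(mapping):
    """Try mapping['new'].append(1); report whether KeyError was raised."""
    try:
        mapping["new"].append(1)
        return "ok"
    except KeyError:
        return "KeyError"

outcomes = {
    "dict": append_to_missing_key({}),
    "OrderedDict": append_to_missing_key(OrderedDict()),
    "defaultdict(list)": append_to_missing_key(defaultdict(list)),
    # No factory given: default_factory is None, so missing keys still fail.
    "defaultdict()": append_to_missing_key(defaultdict()),
}
for name, outcome in outcomes.items():
    print(f"{name}: {outcome}")
```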

[snip]



RE: Namedtuples problem

2017-02-23 Thread Deborah Swanson


> -Original Message-
> From: Erik [mailto:pyt...@lucidity.plus.com] 
> Sent: Thursday, February 23, 2017 2:09 AM
> To: pyt...@deborahswanson.net; python-list@python.org
> Subject: Re: Namedtuples problem
> 
> 
> Hi,
> 
> On 23/02/17 09:38, Deborah Swanson wrote:
> > group[[idx][records_idx[label]]]
> > gets an IndexError: list index out of range
> 
> [snip]
> 
> > Can anyone see why I'm getting this Index error? and how to fix it?
> 
> It looks to me like you are indexing into a single-element 
> list that you 
> are creating using the literal list syntax in the middle of 
> the expression.

Actually, group is essentially a 2-element list. Each group has a list
of rows, and each row has a set of fields. group has to be indexed by
row index and field index. (This is a namedtuple configuration.)

The weirdness is that 

group[0][4] 

gets the right answer, but

group[[idx][records_idx[label]]], 
where idx = 0 and records_idx[label]] = 4

gets the IndexError.

> If we were to break the expression into parts to make it a 
> bit simpler 
> to refer to discuss:
> 
> ridx = records_idx[label]
> group[[idx][ridx]]
> 
> You can now more easily see that 'group' is being indexed by the 
> expression "[idx][ridx]". What does that mean?
> 
> [idx] is creating a single-element list using literal list 
> syntax. This 
> is then indexed using 'ridx' (using, perhaps confusingly, the 
> exact same 
> syntax to do a different thing).
> 
> The result of *that* expression is then being used to index 
> 'group', but 
> it won't get that far because you'll get the exception if 'ridx' is 
> anything but zero.
> 
> So the initial problem at least is the extra [] around 'idx' which is 
> creating a list on the fly for you.
> 
> E.
> 



RE: Namedtuples problem

2017-02-23 Thread Deborah Swanson
Peter Otten wrote, on February 23, 2017 2:34 AM
> 
> Deborah Swanson wrote:
> 

>
> > Can anyone see why I'm getting this Index error? and how to fix it?
> 
> I'm not completely sure I can follow you, but you seem to be 
> mixing two problems
> 
> (1) split a list into groups
> (2) convert a list of rows into a list of columns

Actually, I was trying to duplicate your original intentions, which I
thought were quite excellent, but turned out to cause problems when I
tried to use the code you gave.

Your original intention was to make a dictionary with the keys being
each of the unique titles in all the records, and the values are the
complete records that contain the unique title. (Each rental listing can
have many records, each with the same title.)

> and making a kind of mess in the process. Functions to the rescue:

I'm sorry you think I made a mess of it and I agree that the code I
wrote is clumsy, although it does work and gives the correct results up
to the last line I gave. I was hoping, among other things, that you
would help me clean it up, so let's look at what you said.

> #untested
> 
> def split_into_groups(records, key):
>     groups = defaultdict(list)
>     for record in records:
>         # no need to check if a group already exists
>         # an empty list will automatically be added for every
>         # missing key
>         groups[key(record)].append(record)
>     return groups

I used this approach the first time I tried this for both defaultdict
and OrderedDict, and for both of them I immediately got a KeyError for
the first record. groups is empty, so the title for the first record
wouldn't already be in groups.

Just to check, I commented out the extra lines that I added to handle
new keys in my code and immediately got the same KeyError.

My guess is that while standard dictionaries will automatically make a
new key if it isn't found in the dict, defaultdict and OrderedDict will
not. So it seems you need to handle new keys yourself. Unless you think
I'm doing something wrong and dicts from collections should also
automatically make new keys.

Rightly or wrongly, I chose not to use functions in my attempt, just to
keep the steps sequential and understandable. Probably I should have
factored out the functions before I posted it.

> def extract_column(records, name):
>     # you will agree that extracting one column is easy :)
>     return [getattr(record, name) for record in records]
> 
> def extract_columns(records, names):
>     # we can build on that to make a list of columns
>     return [extract_column(records, name) for name in names]
> 
> wanted_columns = ['Location', ...]
> records = ...
> groups = split_into_groups(records, operator.attrgetter("title"))
> 
> Columns = namedtuple("Columns", wanted_columns)
> for title, group in groups.items():
>     # for easier access we turn the list of columns
>     # into a namedtuple of columns
>     groups[title] = Columns._make(extract_columns(group, wanted_columns))

This approach essentially reverses the order of the steps from what I
did, making the columns first and then grouping the records by title.
Either order should work in principle.

> If all worked well you should now be able to get a group with
> 
> group["whatever"]
> 
> and all locations for that group with
> 
> group["whatever"].Locations
> 
> If there is a bug you can pinpoint the function that doesn't work and ask
> for specific help on that one.

I'll play with your code and see if it works better than what I had. I
can see right off that the line

group["whatever"].Locations

will fail because group only has a 'Location' field and doesn't have a
'Locations' field.

Running it in the watch window confirms, and it gets:

AttributeError: 'Record' object has no attribute 'Locations'

Earlier on I tried several methods to get an analog to your line

group["whatever"].Locations, 

and failing to do it straightforwardly is part of why my code is so
convoluted.

Many thanks for your reply. Quite possibly getting the columns first and
grouping the records second will be an improvement.



RE: Namedtuples problem

2017-02-23 Thread Deborah Swanson
Thanks all of you for replying.

I'm going to have to study your responses a bit before I can respond. I
wrote this code the way I did because I was getting errors that I
stopped getting with these fixes and I got the correct results from this
code, up until the last line. All of you introduced concepts that I
haven't seen or thought of before and I'll need to look at them
carefully.

Peter's right about pandas and we talked about this the first time we
visited this problem. pandas is a steep learning curve, and it makes
more sense to stick with straight Python this first time.  This
spreadsheet problem is fairly typical of several I need to do, and I
will be using pandas for future problems. But this time I'll stick with
learning and understanding what I've got.

Thanks again all, and I'll write back when I have a handle on what
you've said, and probably I'll have more questions to ask.





Re: Namedtuples problem

2017-02-23 Thread Peter Otten
Pavol Lisy wrote:

> If we are talking about less code than necessary then probably we
> could use something from python's ecosystem...
> 
> >>> import pandas as pd  # really useful package

Yes, pandas looks like the perfect fit here.

There is a learning curve, though, so that it will probably pay off only 
after the second project ;) 

For example here's what my session might have looked like:

>>> df = pd.read_csv("what_location.csv")
>>> df
  what    location
0  foo        here
1  foo       there
2  bar  eberywhere

[3 rows x 2 columns]
>>> df.groupby("what")

>>> print(df.groupby("what"))

>>> df.groupby("what")["foo"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/pandas/core/groupby.py", line 2477, in __getitem__
    raise KeyError(str(key))
KeyError: 'foo'
>>> df.groupby("what").keys()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object is not callable
>>> df.groupby("what").keys
'what'

Frustrating.

PS:
>>> g.groups
{'foo': [0, 1], 'bar': [2]}




Re: Namedtuples problem

2017-02-23 Thread Pavol Lisy
On 2/23/17, Peter Otten <__pete...@web.de> wrote:
> Peter Otten wrote:
>
>> Functions to the rescue:
>
> On second thought this was still more code than necessary.

If we are talking about less code than necessary then probably we
could use something from python's ecosystem...

>>> import pandas as pd  # really useful package

>>> import io  # I simulate a csv file with a header (like Deborah has)
>>> data = """\
what,location
foo,here
foo,there
bar,eberywhere"""

>>> df = pd.read_csv(io.StringIO(data))  # reading csv into DataFrame is very simple

>>> print(df[df.what == 'foo'])  # data where what == foo
  what location
0  foo     here
1  foo    there

>>> list(df[df.what=='foo'].location.values)  # list of values
['here', 'there']

>>> list(df.itertuples())  # if we want to use namedtuples
[Pandas(Index=0, what='foo', location='here'), Pandas(Index=1,
what='foo', location='there'), Pandas(Index=2, what='bar',
location='eberywhere')]

etc.


Re: Namedtuples problem

2017-02-23 Thread Steve D'Aprano
On Thu, 23 Feb 2017 08:38 pm, Deborah Swanson wrote:

> However,
> 
> group[[idx][records_idx[label]]]
> gets an Index Error: list index out of range

That's not very helpful: judging from that line alone, there could be as
many as THREE places in that line of code that might generate IndexError.

When you have code complex enough that you can't understand the error it
gives, break it up into separate steps:

# group[[idx][records_idx[label]]]
i = records_idx[label]
j = [idx][i]
group[j]

Now at least you will find out which step is failing!

Since records_idx is a dict, if that fails it should give KeyError, not
IndexError. So it won't be that line, at least not if my reading of your
code is correct.

The most likely suspect is the second indexing operation, which looks mighty
suspicious:

[idx][i]


What a strange thing to do: you create a list with a single item, idx, and
then index into that list with some value i. Either i is 0, in which case
you get idx, or else you get an error.

So effectively, you are looking up:

group[idx]

or failing.
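
A runnable sketch of that step-by-step approach, with made-up stand-ins
for records_idx and group (the real ones come from the CSV data).
Tracking the current step inside one try block shows which of the three
lookups raises:

```python
# Made-up stand-ins for the real data structures.
records_idx = {"title": 0, "Location": 4}
group = [("foo", "a", "b", "c", "Longview")]  # one row with five fields
idx, label = 0, "Location"

failed_step = None
try:
    step = "records_idx[label]"
    i = records_idx[label]   # dict lookup: a bad key gives KeyError
    step = "[idx][i]"
    j = [idx][i]             # one-element list indexed with i
    step = "group[j]"
    row = group[j]
except (KeyError, IndexError) as exc:
    failed_step = step
    print(f"{step} raised {type(exc).__name__}")

# The intended lookup: row first, then field.
print(group[idx][records_idx[label]])
```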



> I've run into this kind of problem with namedtuples before,

This is nothing to do with namedtuples. You happen to be using namedtuples
elsewhere in the code, but that's not what is failing. namedtuple is just
an innocent bystander.


> trying to 
> access field values with variable names, like:
> 
> label = 'Location'
> records.label
> 
> and I get something like "'records' has no attribute 'label'".

Which has nothing to do with this error.

When the attribute name is known only in a variable, the simplest way to
look it up is:

getattr(records, label)




-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.



Re: Namedtuples problem

2017-02-23 Thread Peter Otten
Peter Otten wrote:

> Functions to the rescue:

On second thought this was still more code than necessary.
Why not calculate a column on demand? Here's a self-contained example:

$ cat swanson_grouped_columns_virtual.py
import operator
from collections import defaultdict, namedtuple

def split_into_groups(records, key, GroupClass=list):
    groups = defaultdict(GroupClass)
    for record in records:
        groups[key(record)].append(record)
    return groups

class Group(list):
    def __getattr__(self, name):
        return [getattr(item, name) for item in self]

Record = namedtuple("Record", "title location")

records = [
    Record("foo", "here"),
    Record("foo", "there"),
    Record("bar", "everywhere"),
]

groups = split_into_groups(
    records,
    operator.attrgetter("title"),
    Group
)

print(groups["foo"].location)
$ python3 swanson_grouped_columns_virtual.py
['here', 'there']





Re: Namedtuples problem

2017-02-23 Thread Peter Otten
Deborah Swanson wrote:

> This is how the list of namedtuples is originally created from a csv:
> 
> infile = open("E:\\Coding projects\\Pycharm\\Moving\\Moving 2017 in -
> test.csv")
> rows = csv.reader(infile)
> fieldnames = next(rows)
> Record = namedtuple("Record", fieldnames)
> records = [Record._make(fieldnames)]
> records.extend(Record._make(row) for row in rows)
> 
> Thanks to Peter Otten for this succinct code, and to Greg Ewing for
> suggesting namedtuples for this type of problem to begin with.
> 
> Namedtuples worked beautifully for the first two thirds of this code,
> but I've run into a snag attempting to proceed.
> 
> Here's my code up to the snag, and I'll explain afterwards what I'm
> trying to do:
> 
> import operator
> records[1:] = sorted(records[1:], key=operator.attrgetter("title",
> "Date"))
> 
> groups = defaultdict()
> for r in records[1:]:
>     # if the key doesn't exist, make a new group
>     if r.title not in groups.keys():
>         groups[r.title] = [r]
>     # if key (group) exists, append this record
>     else:
>         groups[r.title].append(r)
> 
> # make lookup table: indices for field names
> records_idx = {}
> for idx, label in enumerate(records[0]):
>     records_idx[label] = idx
> 
> LABELS = ['Location', 'ST', 'co', 'miles', 'first', 'Kind', 'Notes']
> # look at field values for each label on group
> for group in groups.values():
>     values = []
>     for idx, row in enumerate(group):
>         for label in LABELS:
>             values.append(group[[idx][records_idx[label]]])   <-snag
> 
> I want to get lists of field values from the list of namedtuples, one
> list of field values for each row in each group (groups are defined in
> the section beginning with "groups = defaultdict()".
> 
> LABELS defines the field names for the columns of field values of
> interest. So all the locations in this group would be in one list, all
> the states in another list, etc. (Jussi, I'm looking at your suggestion
> for the next part.)
> 
> (I'm quite sure this bit of code could be written with list and dict
> comprehensions, but here I was just trying to get it to work, and
> comprehensions still confuse me a little.)
> 
> Using the debugger's watch window, from
> group[[idx][records_idx[label]]], I get:
> 
> idx = {int}: 0
> records_idx[label] = {int}: 4
> 
> which is the correct indices for the first row of the current group (idx
> = 0) and the first field label in LABELS, 'Location' (records_idx[label]
> = 4).
> 
> And if I look at
> 
> group[0][4] = 'Longview'
> 
> this is also correct. Longview is the Location field value for the first
> row of this group.
> 
> However,
> 
> group[[idx][records_idx[label]]]
> gets an Index Error: list index out of range
> 
> I've run into this kind of problem with namedtuples before, trying to
> access field values with variable names, like:
> 
> label = 'Location'
> records.label
> 
> and I get something like "'records' has no attribute 'label'". This can
> be fixed by using the subscript form and an index, like:
> 
> for idx, r in enumerate(records):
>     ...
>     records[idx] = r
> 
> But here, I get the Index Error and I'm a bit baffled why. Both
> subscripts evaluate to valid indices and give the correct value when
> explicitly used.
> 
> Can anyone see why I'm getting this Index error? and how to fix it?

I'm not completely sure I can follow you, but you seem to be mixing two 
problems

(1) split a list into groups
(2) convert a list of rows into a list of columns

and making a kind of mess in the process. Functions to the rescue:

#untested

def split_into_groups(records, key):
    groups = defaultdict(list)
    for record in records:
        # no need to check if a group already exists
        # an empty list will automatically be added for every
        # missing key
        groups[key(record)].append(record)
    return groups

def extract_column(records, name):
    # you will agree that extracting one column is easy :)
    return [getattr(record, name) for record in records]

def extract_columns(records, names):
    # we can build on that to make a list of columns
    return [extract_column(records, name) for name in names]

wanted_columns = ['Location', ...]
records = ...
groups = split_into_groups(records, operator.attrgetter("title"))

Columns = namedtuple("Columns", wanted_columns)
for title, group in groups.items():
    # for easier access we turn the list of columns
    # into a namedtuple of columns
    groups[title] = Columns._make(extract_columns(group, wanted_columns))

If all worked well you should now be able to get a group with

group["whatever"]

and all locations for that group with

group["whatever"].Locations

If there is a bug you can pinpoint the function that doesn't work and ask 
for specific help on that one.
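
For reference, a self-contained variant of the sketch above that can be
run as-is. The Record fields and sample rows are invented for the
example, and the per-group columns are collected in a separate dict
(columns_by_title, a name chosen here) instead of overwriting groups:

```python
import operator
from collections import defaultdict, namedtuple

def split_into_groups(records, key):
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)
    return groups

def extract_columns(records, names):
    # One list per requested field name, in the order given.
    return [[getattr(record, name) for record in records] for name in names]

# Invented sample data standing in for the CSV rows.
Record = namedtuple("Record", ["title", "Location", "ST"])
records = [
    Record("foo", "Longview", "WA"),
    Record("foo", "Kelso", "WA"),
    Record("bar", "Astoria", "OR"),
]

wanted_columns = ["Location", "ST"]
Columns = namedtuple("Columns", wanted_columns)

groups = split_into_groups(records, operator.attrgetter("title"))
columns_by_title = {
    title: Columns._make(extract_columns(group, wanted_columns))
    for title, group in groups.items()
}

print(columns_by_title["foo"].Location)
```

Note that the attribute is named after the field, so with a 'Location'
column the lookup is .Location.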



Re: Namedtuples problem

2017-02-23 Thread Erik

Hi,

On 23/02/17 09:38, Deborah Swanson wrote:

> group[[idx][records_idx[label]]]
> gets an Index Error: list index out of range


[snip]


> Can anyone see why I'm getting this Index error? and how to fix it?


It looks to me like you are indexing into a single-element list that you 
are creating using the literal list syntax in the middle of the expression.


If we were to break the expression into parts to make it a bit simpler 
to refer to discuss:


ridx = records_idx[label]
group[[idx][ridx]]

You can now more easily see that 'group' is being indexed by the 
expression "[idx][ridx]". What does that mean?


[idx] is creating a single-element list using literal list syntax. This 
is then indexed using 'ridx' (using, perhaps confusingly, the exact same 
syntax to do a different thing).


The result of *that* expression is then being used to index 'group', but 
it won't get that far because you'll get the exception if 'ridx' is 
anything but zero.


So the initial problem at least is the extra [] around 'idx' which is 
creating a list on the fly for you.
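
A short sketch of the corrected expression, assuming made-up data shaped
like the description in the original post (a group is a list of rows,
and records_idx maps field names to positions). Because the rows are
namedtuples, getattr() with the field name should give the same value
without the lookup table:

```python
from collections import namedtuple

# Made-up fields and rows; the real ones come from the spreadsheet.
Record = namedtuple("Record", ["title", "Date", "Location"])
group = [
    Record("foo", "2017-02-01", "Longview"),
    Record("foo", "2017-02-02", "Kelso"),
]
records_idx = {name: pos for pos, name in enumerate(Record._fields)}

idx, label = 0, "Location"
ridx = records_idx[label]

by_position = group[idx][ridx]        # row first, then field position
by_name = getattr(group[idx], label)  # same field, looked up by name

print(by_position, by_name)
```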


E.
