[Tutor] python clusters

2011-04-13 Thread nookasree ponamala
Hi 

I have 30 variables in a text file. I want to read this file and create 
10 clusters based on 18 of the variables.

I then want to read another text file and find the closest match using these 
clusters.

Could you please help me with this?

Thanks,
Sree.
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Need help with dates in Python

2011-03-11 Thread nookasree ponamala
Thanks for your help Francesco. This works.

Sree.

--- On Fri, 3/11/11, Francesco Loffredo f...@libero.it wrote:

 From: Francesco Loffredo f...@libero.it
 Subject: Re: [Tutor] Need help with dates in Python
 To: tutor@python.org
 Date: Friday, March 11, 2011, 1:05 AM
 On 09/03/2011 9.21, nookasree ponamala wrote:
  Hi,
 
  I need help in finding the minimum date and maximum date in a file.
  Here is my test file:
  s.no:   dt1          amt       id1      id2
  452     2010-02-20   $23.26    059542   06107
  452     2010-02-05   $20.78    059542   06107
  451     2010-02-24   $5.99     059542   20151
  452     2010-02-12   $114.25   839745   98101
  452     2010-02-06   $28.00    839745   06032
  451     2010-02-12   $57.00    839745   06269
 
  I want to get the minimum and maximum dt1 for each id1
 
  Required result:
 
  id1     mindate      maxdate
  059542  2010-02-24   2010-02-20
  839745  2010-02-06   2010-02-12
 
  Code: The code I tried. It doesn't work though.
 
 I noticed that your dates are formatted in a way that makes
 it easy to compare them as strings.
 This allows you not only to do without splitting dates into
 year, month and day, but also to do without the datetime
 module:
 I'm also, AFAIK, the first one to address your need for the
 min and max date FOR EACH ID1, not in the whole file.
 
     ids = {}  # create an empty dictionary to store results
     for L in open("test.txt", "r"):
       S = L.split()  # allow direct access to fields
       if S[3] in ids:
         mindate, maxdate = ids[S[3]]  # current stored minimum and maximum date
         if S[1] < mindate:
           mindate = S[1]
         if S[1] > maxdate:
           maxdate = S[1]
         ids[S[3]] = (mindate, maxdate)  # new stored min and max
       else:
         ids[S[3]] = (S[1], S[1])  # initialize storage for the current id1, with min and max in a tuple
     # leave print formatting as an exercise to the reader (but you can do without it!)
     print ids
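
The same idea can be sketched for Python 3, with the header line skipped and the print formatting filled in. This is only an illustration under a few assumptions: the function name min_max_by_id and the SAMPLE string are made up here, the header is assumed to be the first line, and the rows are the ones from the question rather than a real file.

```python
# Sample rows copied from the question; in practice the lines would come
# from iterating over open("test.txt").
SAMPLE = """\
s.no:   dt1          amt       id1      id2
452     2010-02-20   $23.26    059542   06107
452     2010-02-05   $20.78    059542   06107
451     2010-02-24   $5.99     059542   20151
452     2010-02-12   $114.25   839745   98101
452     2010-02-06   $28.00    839745   06032
451     2010-02-12   $57.00    839745   06269
"""


def min_max_by_id(lines):
    """Return {id1: (mindate, maxdate)}, comparing ISO dates as strings."""
    ids = {}
    rows = iter(lines)
    next(rows, None)  # skip the header line (assumed to be the first line)
    for line in rows:
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or short lines
        date, id1 = fields[1], fields[3]
        if id1 in ids:
            lo, hi = ids[id1]
            ids[id1] = (min(lo, date), max(hi, date))
        else:
            ids[id1] = (date, date)  # first date seen is both min and max
    return ids


result = min_max_by_id(SAMPLE.splitlines())
for id1, (lo, hi) in sorted(result.items()):
    print(f"{id1:<8} {lo:<12} {hi}")
```

String comparison is enough here only because the dates are zero-padded and written year-month-day; with any other layout the dates would need to be parsed first.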
 
 Hope this helps...
 Francesco
 
 
 


  


Re: [Tutor] Need help with dates in Python

2011-03-10 Thread nookasree ponamala
Hi All:
 
Thanks for all of your answers. I could solve this problem using strptime 
(datetime.datetime.strptime(dt1, '%Y-%m-%d')) and correct indentation.
 
As I am new to python programming, I don't know how to use classes yet, still 
learning.
 
Thanks,
Sree.
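
For readers following along, here is a small sketch of the strptime approach mentioned above, written for Python 3. The list raw_dates is made-up sample data taken from the dt1 column in the question; it is not the poster's actual code.

```python
from datetime import datetime

# Date strings as they appear in the dt1 column of the sample file.
raw_dates = ["2010-02-20", "2010-02-05", "2010-02-24"]

# strptime parses text into datetime objects; .date() drops the time part.
parsed = [datetime.strptime(d, "%Y-%m-%d").date() for d in raw_dates]

# date objects compare chronologically, so min() and max() work directly.
earliest = min(parsed)
latest = max(parsed)
span_days = (latest - earliest).days  # subtracting dates gives a timedelta

print(earliest.isoformat(), latest.isoformat(), span_days)
```

Parsing costs a little more than comparing the raw strings, but it also rejects malformed dates and allows date arithmetic such as the day count above.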
 
 

 
--- On Thu, 3/10/11, James Reynolds eire1...@gmail.com wrote:


From: James Reynolds eire1...@gmail.com
Subject: Re: [Tutor] Need help with dates in Python
To: nookasree ponamala nookas...@yahoo.com
Cc: Andre Engels andreeng...@gmail.com, tutor@python.org
Date: Thursday, March 10, 2011, 2:26 AM





On Wed, Mar 9, 2011 at 1:34 PM, nookasree ponamala nookas...@yahoo.com wrote:

Hi,
I'm new to Python programming. I've changed the code to below, but still it is 
not working, Could you pls. make the corrections in my code.

import datetime
t = ()
tot = []

min = datetime.date(2008, 1, 1)
max = datetime.date(2012, 12, 31)

for line in open ('test2.txt','r'):
       data = line.rstrip().split()
       a = data[9]
       b = data[4]

       (year, month, day) = b.split('-')
       year = int(year)
       month = int(month)
       day = int(day)
       t = (year,month,day)
               if t > max:
               maxyr = max
               if t < min:
               minyr = min
               t = (a,b,maxyr,minyr)

               tot.append(t)
               print t

Thanks
Sree.

[Tutor] Need help with dates in Python

2011-03-09 Thread nookasree ponamala
Hi,

I need help in finding the minimum date and maximum date in a file. 
Here is my test file:
s.no:   dt1          amt       id1      id2
452     2010-02-20   $23.26    059542   06107
452     2010-02-05   $20.78    059542   06107
451     2010-02-24   $5.99     059542   20151
452     2010-02-12   $114.25   839745   98101
452     2010-02-06   $28.00    839745   06032
451     2010-02-12   $57.00    839745   06269

I want to get the minimum and maximum dt1 for each id1

Required result:

id1 mindate maxdate
059542  2010-02-24  2010-02-20  
839745  2010-02-06  2010-02-12

Code: The code I tried. It doesn't work though.

import sys
import os
t = ()
tot = []
maxyr = 2012
minyr = 2008
maxday = 31
minday = 1
maxmon = 12
minmon = 1

for line in open ('test2.txt','r'):
       data = line.rstrip().split()
       a = data[3]
       b = data[1]
       (year, month, day) = b.split('-')
       year = int(year)
       month = int(month)
       day = int(day)
if year > maxyr:
       maxyr = year
elif year < minyr:
       minyr = year
if month > maxmon:
       maxmon = month
       elif month < minmon:
       minmon = month
       if day > maxday:
       maxday = day
       elif day < minday:
       minday = day
       max = (maxyr,maxmon,maxday)
       min = (minyr,minmon,minday)
       t = (a,b,max,min)
       tot.append(t)
       print t

Could you pls. help me with this.

Thanks
Sree.



  


Re: [Tutor] Need help with dates in Python

2011-03-09 Thread nookasree ponamala
Hi,
I'm new to Python programming. I've changed the code to below, but still it is 
not working, Could you pls. make the corrections in my code.

import datetime
t = ()
tot = []
min = datetime.date(2008, 1, 1)
max = datetime.date(2012, 12, 31)
for line in open ('test2.txt','r'):
       data = line.rstrip().split()
       a = data[9]
       b = data[4]
       (year, month, day) = b.split('-')
       year = int(year)
       month = int(month)
       day = int(day)
       t = (year,month,day)
               if t > max:
               maxyr = max
               if t < min:
               minyr = min
               t = (a,b,maxyr,minyr)
               tot.append(t)
               print t

Thanks
Sree.

--- On Wed, 3/9/11, Andre Engels andreeng...@gmail.com wrote:

 From: Andre Engels andreeng...@gmail.com
 Subject: Re: [Tutor] Need help with dates in Python
 To: nookasree ponamala nookas...@yahoo.com
 Cc: tutor@python.org
 Date: Wednesday, March 9, 2011, 2:16 PM
 On Wed, Mar 9, 2011 at 9:21 AM, nookasree ponamala nookas...@yahoo.com wrote:
  Hi,
 
  I need help in finding the minimum date and maximum date in a file.
  Here is my test file:
  s.no:   dt1          amt       id1      id2
  452     2010-02-20   $23.26    059542   06107
  452     2010-02-05   $20.78    059542   06107
  451     2010-02-24   $5.99     059542   20151
  452     2010-02-12   $114.25   839745   98101
  452     2010-02-06   $28.00    839745   06032
  451     2010-02-12   $57.00    839745   06269
 
  I want to get the minimum and maximum dt1 for each id1
 
  Required result:
 
  id1     mindate      maxdate
  059542  2010-02-24   2010-02-20
  839745  2010-02-06   2010-02-12
 
  Code: The code I tried. It doesn't work though.
 
  import sys
  import os
  t = ()
  tot = []
  maxyr = 2012
  minyr = 2008
  maxday = 31
  minday = 1
  maxmon = 12
  minmon = 1
 
  for line in open ('test2.txt','r'):
         data = line.rstrip().split()
         a = data[3]
         b = data[1]
         (year, month, day) = b.split('-')
         year = int(year)
         month = int(month)
         day = int(day)
  if year > maxyr:
         maxyr = year
  elif year < minyr:
         minyr = year
  if month > maxmon:
         maxmon = month
         elif month < minmon:
         minmon = month
         if day > maxday:
         maxday = day
         elif day < minday:
         minday = day
         max = (maxyr,maxmon,maxday)
         min = (minyr,minmon,minday)
         t = (a,b,max,min)
         tot.append(t)
         print t
 
  Could you pls. help me with this.
 
 I see several things going wrong. Here is a list, which may well not be
 complete:
 
 * You want the mindate and maxdate for each id1, but you remember only
   a single minyr, maxyr etcetera. There's no way that that is going to
   work.
 * You initialize minyr etcetera to a date before the first date you
   will see, and maxyr etcetera to a date after the last date. This
   means that you will never find an earlier or later one, respectively,
   so they would never be changed. You should do it exactly the other
   way around - minyr etcetera should be _later_ than any date that may
   occur, maxyr etcetera _earlier_.
 * You move "if year > maxyr" back to the left. This means that it is
   not part of the loop, but is executed (only) once _after_ the loop
   has been gone through.
 * "elif year < minyr" should be "if", not "elif": it is possible that
   the new date is both the first _and_ the last date that has been
   found (this will be the case with the first date).
 * You change maxyr, maxmon and maxday independently. That is not what
   you are trying to do - you want the last date, not the highest year,
   highest month and highest day (if the dates were 2001-12-01,
   2011-11-03 and 2005-05-30, you want the maximum date to be
   2011-11-03, not 2011-12-30). You should thus find a way to compare
   the *complete date* and then, if it is later than the maxdate or
   earlier than the mindate, change the *complete date*.
 * At the end you show (well, in this case you don't because it is
   under "if month > maxmon") a quadruple consisting of id1, current
   date, lowest date and highest date - EACH time. You want only the
   triple and only after the last date of some value of id1 has been
   parsed (best to do that after all lines have been parsed).
 * The code as shown will lead to a syntax error anyway because you did
   not indent extra after "elif month < minmon:", "if day > maxday:"
   and "elif day < minday:".
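
The point about comparing the complete date can be illustrated with the three dates used in the list above. This is a sketch for Python 3; the variable names are chosen only for the example.

```python
# The three example dates from the list, as (year, month, day) tuples.
dates = [(2001, 12, 1), (2011, 11, 3), (2005, 5, 30)]

# Tracking each component on its own mixes parts of different dates
# and produces a date that never occurred in the data.
componentwise_max = (
    max(y for y, m, d in dates),
    max(m for y, m, d in dates),
    max(d for y, m, d in dates),
)

# Tuples compare element by element from left to right, so comparing
# whole (year, month, day) tuples orders them chronologically.
latest = max(dates)
earliest = min(dates)

print(componentwise_max)  # (2011, 12, 30)
print(latest)             # (2011, 11, 3)
print(earliest)           # (2001, 12, 1)
```

Starting from the first date seen, rather than from fixed sentinel dates such as 2008-01-01 and 2012-12-31, also avoids the initialization problem described in the second item.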
 
 
 
 
 -- 
 André Engels, andreeng...@gmail.com
 


  


[Tutor] calculate the sum of a variable - python

2011-03-06 Thread nookasree ponamala
Hi :

I'm a Senior SAS Analyst. I'm trying to learn Python. I would appreciate if 
anybody could help me with this. It works fine if I give input  instead of 
reading a text file. I don't understand where I'm going wrong.

I'm trying to read a text file and find out the following:
1. Sum of amt for each id
2. Count of id 
3. minimum of date1
4. maximum of date1

Here is the sample text file:

test.txt file:

bin1 cd1 date1       amt      cd  id          cd2
452  2   2010-02-20  $23.26   0   8100059542  06107
452  2   2010-02-20  $20.78   0   8100059542  06107
452  2   2010-02-24  $5.99    2   8100839745  20151
452  2   2010-02-12  $114.25  7   8100839745  98101
452  2   2010-02-06  $28.00   0   8101142362  06032
452  2   2010-02-09  $15.01   0   8100274453  06040
452  18  2010-02-13  $113.24  0   8100274453  06040
452  2   2010-02-13  $31.80   0   8100274453  06040


Here is the code I've tried out to calculate sum of amt by id:

import sys
from itertools import groupby
from operator import itemgetter
t = ()
tot = []
for line in open ('test.txt','r'):
       aline = line.rstrip().split()
       a = aline[5]
       b = (aline[3].strip('$'))
       t = (a,b)
       t1 = str(t)
       tot.append(t1)
       print tot
def summary(data, key=itemgetter(0), value=itemgetter(1)):
       for k, group in groupby(data, key):
               yield (k, sum(value(row) for row in group))

if __name__ == "__main__":
       for id, tot_spend in summary(tot, key=itemgetter(0), value=itemgetter(1)):
           print id, tot_spend


Error:
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "<stdin>", line 3, in summary
TypeError: unsupported operand type(s) for +: 'int' and 'str'


Thanks,
Sree.


  


Re: [Tutor] calculate the sum of a variable - python

2011-03-06 Thread nookasree ponamala
Thanks for the reply, Wayne, but it is still not working.
 
When I used int, it throws the error below:
  File "<stdin>", line 2, in <module>
  File "<stdin>", line 3, in summary
  File "<stdin>", line 3, in <genexpr>
ValueError: invalid literal for int() with base 10: '
 
I tried using float and the error is:
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "<stdin>", line 3, in summary
  File "<stdin>", line 3, in <genexpr>
ValueError: invalid literal for float(): '
 
Thanks,
Sree.


--- On Mon, 3/7/11, Wayne Werner waynejwer...@gmail.com wrote:


From: Wayne Werner waynejwer...@gmail.com
Subject: Re: [Tutor] calculate the sum of a variable - python
To: nookasree ponamala nookas...@yahoo.com
Cc: tutor@python.org
Date: Monday, March 7, 2011, 9:14 AM





On Sun, Mar 6, 2011 at 9:31 PM, nookasree ponamala nookas...@yahoo.com wrote:

Hi :

I'm a Senior SAS Analyst. I'm trying to learn Python. I would appreciate if 
anybody could help me with this. It works fine if I give input  instead of 
reading a text file. I don't understand where I'm going wrong.

I'm trying to read a text file and find out the following:
1. Sum of amt for each id
2. Count of id
3. minimum of date1
4. maximum of date1

Here is the sample text file:

test.txt file:

bin1    cd1     date1   amt     cd    id cd2
452  2       2010-02-20      $23.26  0    8100059542        06107
452  2       2010-02-20      $20.78  0          8100059542        06107
452  2       2010-02-24      $5.99   2          8100839745        20151
452  2       2010-02-12      $114.25 7          8100839745        98101
452  2       2010-02-06      $28.00  0          8101142362        06032
452  2       2010-02-09      $15.01  0          8100274453        06040
452  18      2010-02-13      $113.24 0          8100274453        06040
452  2       2010-02-13      $31.80  0          8100274453        06040


Here is the code I've tried out to calculate sum of amt by id:

import sys
from itertools import groupby
from operator import itemgetter
t = ()
tot = []
for line in open ('test.txt','r'):
       aline = line.rstrip().split()
       a = aline[5]
       b = (aline[3].strip('$'))
       t = (a,b)
       t1 = str(t)
       tot.append(t1)
       print tot
def summary(data, key=itemgetter(0), value=itemgetter(1)):
       for k, group in groupby(data, key):
               yield (k, sum(value(row) for row in group))

if __name__ == "__main__":
       for id, tot_spend in summary(tot, key=itemgetter(0), 
value=itemgetter(1)):
           print id, tot_spend


Error:
Traceback (most recent call last):
 File "<stdin>", line 2, in <module>
 File "<stdin>", line 3, in summary
TypeError: unsupported operand type(s) for +: 'int' and 'str'



Of course I first have to commend you for including the full traceback with the 
code because it makes this entirely easy to answer.


In general, the traceback tells you the most important stuff last, so I'll 
start with this line: 

 TypeError: unsupported operand type(s) for +: 'int' and 'str'


That tells us that the problem is you are trying to use + (addition) on an 
integer and a string - which you can't do because of the type mismatch 
(TypeError).


The next line


 File "<stdin>", line 3, in summary


tells us that the error occurred on line 3 in summary:


1 | def summary(data, key=itemgetter(0), value=itemgetter(1)):
2 |        for k, group in groupby(data, key):
3 |                yield (k, sum(value(row) for row in group))


Well, there's no '+', but you do have 'sum', which uses addition under the 
hood. So how do you go about fixing it? Well, you change the value getting 
passed to sum to an integer (or other number):


sum(int(value(row)) for row in group)


That should either fix your problem, or throw a different error if you try to 
convert a string like 'Hello' to an integer. (Alternatively, use float if 
you're interested in decimals.)


HTH,
Wayne




Re: [Tutor] calculate the sum of a variable - python

2011-03-06 Thread nookasree ponamala
Thanks a lot Marc. This works now.
 
Sree.

--- On Mon, 3/7/11, Marc Tompkins marc.tompk...@gmail.com wrote:


From: Marc Tompkins marc.tompk...@gmail.com
Subject: Re: [Tutor] calculate the sum of a variable - python
To: nookasree ponamala nookas...@yahoo.com
Cc: Wayne Werner waynejwer...@gmail.com, tutor@python.org
Date: Monday, March 7, 2011, 10:54 AM



On Sun, Mar 6, 2011 at 8:46 PM, nookasree ponamala nookas...@yahoo.com wrote:






Thanks for the reply Wayne, but still it is not working,
 
when I used int It throws the below error:

  File "<stdin>", line 2, in <module>
  File "<stdin>", line 3, in summary
  File "<stdin>", line 3, in <genexpr>
ValueError: invalid literal for int() with base 10: '
 
I tried using float and the error is:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "<stdin>", line 3, in summary
  File "<stdin>", line 3, in <genexpr>
ValueError: invalid literal for float(): '
 
Thanks,
Sree.



I played with it a bit and simplified things a (little) bit:

   b = (aline[3].strip('$'))
   t = (a, float(b))
   tot.append(t)
   print tot

You were converting the tuple to a string before adding it to the list; you 
don't need to do that, and it was concealing the real cause of your problem, 
which is that you either need to skip/get rid of the top line of your file, or 
write some error-handling code to deal with it.  Currently, you're trying to 
convert the string 'amt' into a number, and you just can't do that.
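
One possible way to put this advice together, sketched for Python 3: keep the amounts as numbers, treat a row whose amt field is not numeric (the header) as something to skip, and collect all four figures from the original question per id. The function name summarise, the dictionary keys and the SAMPLE string are invented for this example; the rows are the sample data from the first post.

```python
SAMPLE = """\
bin1 cd1 date1       amt      cd  id          cd2
452  2   2010-02-20  $23.26   0   8100059542  06107
452  2   2010-02-20  $20.78   0   8100059542  06107
452  2   2010-02-24  $5.99    2   8100839745  20151
452  2   2010-02-12  $114.25  7   8100839745  98101
452  2   2010-02-06  $28.00   0   8101142362  06032
452  2   2010-02-09  $15.01   0   8100274453  06040
452  18  2010-02-13  $113.24  0   8100274453  06040
452  2   2010-02-13  $31.80   0   8100274453  06040
"""


def summarise(lines):
    """Return {id: {'total', 'count', 'first', 'last'}} for the rows."""
    stats = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # blank or incomplete line
        try:
            amount = float(fields[3].lstrip("$"))
        except ValueError:
            continue  # header row: 'amt' cannot be converted to a number
        date, id_ = fields[2], fields[5]
        if id_ not in stats:
            stats[id_] = {"total": 0.0, "count": 0, "first": date, "last": date}
        entry = stats[id_]
        entry["total"] += amount
        entry["count"] += 1
        # ISO-formatted dates sort correctly as plain strings.
        entry["first"] = min(entry["first"], date)
        entry["last"] = max(entry["last"], date)
    return stats


stats = summarise(SAMPLE.splitlines())
for id_, s in stats.items():
    print(id_, round(s["total"], 2), s["count"], s["first"], s["last"])
```

A plain dictionary is used instead of groupby so that rows for the same id do not have to be adjacent in the file. For money amounts, decimal.Decimal may be a better fit than float if exact cents matter.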


