Re: [Tutor] Collating date data from a csv file

2019-05-14 Thread Dave Hill
Something else to have a look at, but it might have to wait for the 
longer winter nights :-)


On 14/05/2019 06:20, Ben Hancock via Tutor wrote:

On Sun, May 12, 2019 at 07:17:53PM +0100, Alan Gauld via Tutor wrote:

On 12/05/2019 11:27, Dave Hill wrote:


I found out by accident that the Megger PAT 420 data backup is actually
an SQLite database, so that is my route for access. Having played with
Python on Raspberry Pi's, I thought I would explore Python for data
processing, and now, I have a set of programs which extract SQLite, to
'csv' then collate/process this data and produce a multi-page ODS
spreadsheet document, which lists tests by location. I also have an
application which extracts appliances requiring testing within a  
+/- 30

day window, so I have a target for testing.


You could do it all in native SQLite SQL of course.
You can tell  sqlite to output its results in Excel
CSV format and to a file rather than (or in addition to)
stdout.

So you can write a standard query and have it generate
your Excel readable file directly.

You can then automate that as a batch job in the OS...
Assuming you run the same reports regularly.

Much as I love Python sometimes the native tools are
even better...



I wonder if `pandas` would also be a helpful library for what you're 
describing? It has a `pandas.read_sql` function to read data from a 
SQL database into a pandas dataframe, which you then can manipulate:


https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_sql.html 




Good luck!

--
Ben Hancock
https://elwha1.github.io
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Collating date data from a csv file

2019-05-13 Thread Ben Hancock via Tutor

On Sun, May 12, 2019 at 07:17:53PM +0100, Alan Gauld via Tutor wrote:

On 12/05/2019 11:27, Dave Hill wrote:


I found out by accident that the Megger PAT 420 data backup is actually
an SQLite database, so that is my route for access. Having played with
Python on Raspberry Pi's, I thought I would explore Python for data
processing, and now, I have a set of programs which extract SQLite, to
'csv' then collate/process this data and produce a multi-page ODS
spreadsheet document, which lists tests by location. I also have an
application which extracts appliances requiring testing within a  +/- 30
day window, so I have a target for testing.


You could do it all in native SQLite SQL of course.
You can tell  sqlite to output its results in Excel
CSV format and to a file rather than (or in addition to)
stdout.

So you can write a standard query and have it generate
your Excel readable file directly.

You can then automate that as a batch job in the OS...
Assuming you run the same reports regularly.

Much as I love Python sometimes the native tools are
even better...



I wonder if `pandas` would also be a helpful library for what you're 
describing? It has a `pandas.read_sql` function to read data from a SQL 
database into a pandas dataframe, which you then can manipulate:


https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_sql.html


Good luck!

--
Ben Hancock
https://elwha1.github.io
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Collating date data from a csv file

2019-05-12 Thread Alan Gauld via Tutor
On 12/05/2019 11:27, Dave Hill wrote:

> I found out by accident that the Megger PAT 420 data backup is actually 
> an SQLite database, so that is my route for access. Having played with 
> Python on Raspberry Pi's, I thought I would explore Python for data 
> processing, and now, I have a set of programs which extract SQLite, to 
> 'csv' then collate/process this data and produce a multi-page ODS 
> spreadsheet document, which lists tests by location. I also have an 
> application which extracts appliances requiring testing within a  +/- 30 
> day window, so I have a target for testing.

You could do it all in native SQLite SQL of course.
You can tell  sqlite to output its results in Excel
CSV format and to a file rather than (or in addition to)
stdout.

So you can write a standard query and have it generate
your Excel readable file directly.

You can then automate that as a batch job in the OS...
Assuming you run the same reports regularly.

Much as I love Python sometimes the native tools are
even better...

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Collating date data from a csv file

2019-05-12 Thread Dave Hill

Hi David,

Firstly, thank you for your reply.

One condition of my agreeing to undertake PAT was that I got a PAT 
machine that stored the data, as I am better at programming than paperwork!


I have a Megger PAT 420 which provides a data backup to a USB stick, and 
thence transfer to my laptop, which is network connected a dual HDD 
system (Buffalo), which automatically copies the main HDD to the slave HDD.


I found out by accident that the Megger PAT 420 data backup is actually 
an SQLite database, so that is my route for access. Having played with 
Python on Raspberry Pi's, I thought I would explore Python for data 
processing, and now, I have a set of programs which extract SQLite, to 
'csv' then collate/process this data and produce a multi-page ODS 
spreadsheet document, which lists tests by location. I also have an 
application which extracts appliances requiring testing within a  +/- 30 
day window, so I have a target for testing.


My biggest problem is locating kit, and keeping up with 
removals/disposals and new acquisitions, but the guys are getting a bit 
better at communication!


I thought it would be useful to the 'management' to have a handle on 
progress, and stats on tested/new/disposed, etc, hence the latest question.


I convert the datetimestamp to Gregorian ordinal date, as this is easier 
to use in accessing the resulting 'defaultdict', do the counting and 
then convert the date back to something comprehensible for writing to an 
ODS spreadsheet.


Having seen todays posts I am going to look at wxPython, as a front-end 
(and possibly display?)


Thank you for your consideration

Dave

On 12/05/2019 04:20, David L Neil wrote:

Hi Dave,

I also volunteer to do PAT safety testing during my "20% time". 
Clambering around Snowdonia as a boy, I eschewed* the Rheilffordd yr 
Wyddfa/SMR in favor of shanks' pony...


* OK, I was made to...! For the good of my soul???


On 9/05/19 8:04 AM, Dave Hill wrote:
I have a csv file which details the results of equipment tests, I 
carry out PAT testing as a volunteer at a heriatge railway in N. 
Wales. I want to extract how many items were tested on each test day. 
So far I have generated a List of test dates, but I am now stalled at 
how to efficiently count numbers tested on each date.


Can I have a list of tuples, where one item is the date and the 
second the count?


or is there a better construct?

Thanks in advance,

Dave

For completeness, I have listed below an extract from a target file, 
where the 10 digit number is the UNIX timestamp


182 1515001232
 Toaster 13 2000 1
183 1515001259    Contact Grill 13 2000 1
245 1515001367
 3G Cube Adaptor 13 0 1
246 1515001396     13A IEC Lead 5 0 1
248 1515001415
 Worktop Light 3 30 1
420 1515001440
 Fly killer 0 0 1
424 1515001461
 Dairy fridge 13 0 1
427 1513277293    Fire 13 0 1
429 1515001489
 Toaster Avanti 13 0 1



When you say "target file", is this coming off the tester via a link 
cable to your PC, or are you capturing by hand to a spreadsheet?


A tactic which many people 'miss' is that a workbook may contain 
multiple spreadsheets, and that the data on one spreadsheet may be 
auto-magically 'copied' onto another. Thus if the above is data coming 
off the PAT into one spreadsheet, I would immediately create a more 
meaningful sheet, 'for human consumption', which has column headings 
and converts (re-formats) the timestamp into a readable date (as 
suggested elsewhere), but is otherwise pretty-much a direct copy. We 
now have a sheet used for data capture/computer processing and 
something separate (and prettier) as a report/presentation for people.


From the spec, above, we are only interested in the date. Remember 
that considering the whole timestamp only makes life confusing. So 
convert them (only) to dates. These can be strings because Python 
compares strings as easily as dates!  The time component could be 
retained if sequence (of testing) might be important.


The sad reality is that a daily count could be accomplished in either 
LO-Writer or MS-Excel. No reason why you shouldn't use Python though.


(Assuming that the data appears in (forward or reverse) date sequence) 
Read-in the data sheet/CSV file, row-by-row, taking note of the date 
of the first data-entry, and starting to count from one. Then 
increment for each row where the date matches. When the dates don't 
match, report, reset the counter, and note the new date.


How will you lay-out and present this report? Another spreadsheet? 
Screen? Paper?


When you say "count numbers tested on each date", the above method 
will let you know a (single) daily total of tests-performed.


Did you (also) mean that you want to track how many of tests were 
performed within categories of devices, eg how many toasters on the 
one day? In which case, furth

Re: [Tutor] Collating date data from a csv file

2019-05-12 Thread David L Neil

Hi Dave,

I also volunteer to do PAT safety testing during my "20% time". 
Clambering around Snowdonia as a boy, I eschewed* the Rheilffordd yr 
Wyddfa/SMR in favor of shanks' pony...


* OK, I was made to...! For the good of my soul???


On 9/05/19 8:04 AM, Dave Hill wrote:
I have a csv file which details the results of equipment tests, I carry 
out PAT testing as a volunteer at a heriatge railway in N. Wales. I want 
to extract how many items were tested on each test day. So far I have 
generated a List of test dates, but I am now stalled at how to 
efficiently count numbers tested on each date.


Can I have a list of tuples, where one item is the date and the second 
the count?


or is there a better construct?

Thanks in advance,

Dave

For completeness, I have listed below an extract from a target file, 
where the 10 digit number is the UNIX timestamp


182 1515001232
 Toaster 13 2000 1
183 1515001259    Contact Grill 13 2000 1
245 1515001367
 3G Cube Adaptor 13 0 1
246 1515001396     13A IEC Lead 5 0 1
248 1515001415
 Worktop Light 3 30 1
420 1515001440
 Fly killer 0 0 1
424 1515001461
 Dairy fridge 13 0 1
427 1513277293    Fire 13 0 1
429 1515001489
 Toaster Avanti 13 0 1



When you say "target file", is this coming off the tester via a link 
cable to your PC, or are you capturing by hand to a spreadsheet?


A tactic which many people 'miss' is that a workbook may contain 
multiple spreadsheets, and that the data on one spreadsheet may be 
auto-magically 'copied' onto another. Thus if the above is data coming 
off the PAT into one spreadsheet, I would immediately create a more 
meaningful sheet, 'for human consumption', which has column headings and 
converts (re-formats) the timestamp into a readable date (as suggested 
elsewhere), but is otherwise pretty-much a direct copy. We now have a 
sheet used for data capture/computer processing and something separate 
(and prettier) as a report/presentation for people.


From the spec, above, we are only interested in the date. Remember that 
considering the whole timestamp only makes life confusing. So convert 
them (only) to dates. These can be strings because Python compares 
strings as easily as dates!  The time component could be retained if 
sequence (of testing) might be important.


The sad reality is that a daily count could be accomplished in either 
LO-Writer or MS-Excel. No reason why you shouldn't use Python though.


(Assuming that the data appears in (forward or reverse) date sequence) 
Read-in the data sheet/CSV file, row-by-row, taking note of the date of 
the first data-entry, and starting to count from one. Then increment for 
each row where the date matches. When the dates don't match, report, 
reset the counter, and note the new date.


How will you lay-out and present this report? Another spreadsheet? 
Screen? Paper?


When you say "count numbers tested on each date", the above method will 
let you know a (single) daily total of tests-performed.


Did you (also) mean that you want to track how many of tests were 
performed within categories of devices, eg how many toasters on the one 
day? In which case, further design consideration is required, eg which 
devices fit into which category and how to match "Toaster" with "Toaster 
Avanti"...


--
Regards =dn
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Collating date data from a csv file

2019-05-09 Thread Alan Gauld via Tutor
On 09/05/2019 14:06, Dave Hill wrote:
> Thank you, I now have
> 
> defaultdict(, {736697: 10, 736677: 14, 736980: 9, 737109: 
> 50, 736919: 15, 736652: 19, 736502: 14, 736710: 2, 736674: 6, 736672: 5, 
> 736933: 2, 736932: 6, 736658: 7, 736671: 5, 736499: 6, 736707: 4, 
> 737181: 4, 736686: 2, ...
> 
> where the first number is the proleptic Gregorian ordinal of the date 
> 8-) ( I had to look up what proleptic meant)
> 
> This means I can access the elements by the ordinal of the date, for 
> later processing, and extraction to a spreadsheet

It might be easier to use a DateTime object instead of the ordinal.
It might make printing the results easier later on. You'll find
it in the datetime module...

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Collating date data from a csv file

2019-05-09 Thread Dave Hill

Thank you, I now have

defaultdict(, {736697: 10, 736677: 14, 736980: 9, 737109: 
50, 736919: 15, 736652: 19, 736502: 14, 736710: 2, 736674: 6, 736672: 5, 
736933: 2, 736932: 6, 736658: 7, 736671: 5, 736499: 6, 736707: 4, 
737181: 4, 736686: 2, ...


where the first number is the proleptic Gregorian ordinal of the date 
8-) ( I had to look up what proleptic meant)


This means I can access the elements by the ordinal of the date, for 
later processing, and extraction to a spreadsheet


Dave


On 09/05/2019 04:08, Cameron Simpson wrote:

On 08May2019 21:04, Dave Hill  wrote:
I have a csv file which details the results of equipment tests, I 
carry out PAT testing as a volunteer at a heriatge railway in N. 
Wales. I want to extract how many items were tested on each test day. 
So far I have generated a List of test dates, but I am now stalled at 
how to efficiently count numbers tested on each date.


Can I have a list of tuples, where one item is the date and the 
second the count?


Not as such, because you can't modify a tuple (so you can't update the 
count part). But you could use a 2 element list.



or is there a better construct?


Oh definitely. The easiest thing would be a defaultdict(int). Example:

 from collections import defaultdict
 ...
 by_date = defaultdict(int)
 for row in csvdata:
   timestamp = row[1]  # based on your example data
   # get the date from the timestamp
   date = ...
   by_date[date] += 1

A defaultdict is a dict which magicly makes missing elements when they 
get access, using a factory function you supply. Here we're using 
"int" as that factory, as int() returns zero.


I presume you've got the timestamp => date conversion sorted?

Cheers,
Cameron Simpson 

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Collating date data from a csv file

2019-05-09 Thread Peter Otten
Cameron Simpson wrote:

> On 08May2019 21:04, Dave Hill  wrote:
>>I have a csv file which details the results of equipment tests, I
>>carry out PAT testing as a volunteer at a heriatge railway in N.
>>Wales. I want to extract how many items were tested on each test day.
>>So far I have generated a List of test dates, but I am now stalled at
>>how to efficiently count numbers tested on each date.
>>
>>Can I have a list of tuples, where one item is the date and the second
>>the count?
> 
> Not as such, because you can't modify a tuple (so you can't update the
> count part). But you could use a 2 element list.
> 
>>or is there a better construct?
> 
> Oh definitely. The easiest thing would be a defaultdict(int). Example:
> 
>   from collections import defaultdict
>   ...
>   by_date = defaultdict(int)
>   for row in csvdata:
> timestamp = row[1]  # based on your example data
> # get the date from the timestamp
> date = ...
> by_date[date] += 1
> 
> A defaultdict is a dict which magicly makes missing elements when they
> get access, using a factory function you supply. Here we're using "int"
> as that factory, as int() returns zero.

While this is easily adaptable if you want to keep more data...

by_date = defaultdict(list)  # rows grouped by date
for row in csvdata:
   date = ...
   by_date[date].append(row)

... for the simple case there is also collections.Counter:

def get_date(row):
return datetime.datetime.fromtimestamp(int(row[1])).date()

by_date = collections.Counter(map(get_date, csvdata))

# (date, freq) pairs ordered by frequency:
print(by_date.most_common())

> 
> I presume you've got the timestamp => date conversion sorted?
> 
> Cheers,
> Cameron Simpson 
> ___
> Tutor maillist  -  Tutor@python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor


___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Collating date data from a csv file

2019-05-08 Thread boB Stepp
On Wed, May 8, 2019 at 10:29 PM boB Stepp  wrote:
>
> On Wed, May 8, 2019 at 10:09 PM Cameron Simpson  wrote:
> >
> > A defaultdict is a dict which magicly makes missing elements when they
> > get access, using a factory function you supply. Here we're using "int"
> > as that factory, as int() returns zero.
>
> Is int() guaranteed to always return zero as Python versions progress?
>  More importantly, perhaps, where would I go to look to find the
> answer to this question myself?

It must be time to go to bed!  Somehow my eyes missed the int() return
behavior at https://docs.python.org/3/library/functions.html#int my
first time through.  Sorry for the noise!

> boB



-- 
boB
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Collating date data from a csv file

2019-05-08 Thread boB Stepp
On Wed, May 8, 2019 at 10:09 PM Cameron Simpson  wrote:
>
> A defaultdict is a dict which magicly makes missing elements when they
> get access, using a factory function you supply. Here we're using "int"
> as that factory, as int() returns zero.

Is int() guaranteed to always return zero as Python versions progress?
 More importantly, perhaps, where would I go to look to find the
answer to this question myself?


-- 
boB
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Collating date data from a csv file

2019-05-08 Thread Cameron Simpson

On 08May2019 21:04, Dave Hill  wrote:
I have a csv file which details the results of equipment tests, I 
carry out PAT testing as a volunteer at a heriatge railway in N. 
Wales. I want to extract how many items were tested on each test day. 
So far I have generated a List of test dates, but I am now stalled at 
how to efficiently count numbers tested on each date.


Can I have a list of tuples, where one item is the date and the second 
the count?


Not as such, because you can't modify a tuple (so you can't update the 
count part). But you could use a 2 element list.



or is there a better construct?


Oh definitely. The easiest thing would be a defaultdict(int). Example:

 from collections import defaultdict
 ...
 by_date = defaultdict(int)
 for row in csvdata:
   timestamp = row[1]  # based on your example data
   # get the date from the timestamp
   date = ...
   by_date[date] += 1

A defaultdict is a dict which magicly makes missing elements when they 
get access, using a factory function you supply. Here we're using "int" 
as that factory, as int() returns zero.


I presume you've got the timestamp => date conversion sorted?

Cheers,
Cameron Simpson 
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor