Re: [R] easing out of Excel

2005-01-20 Thread Greg Snow
 Paul Sorenson [EMAIL PROTECTED] 01/19/05 03:18PM 
 I know enough about R to be dangerous and our marketing people have
 asked me to automate some reporting.  Data comes from an SQL
source
 and graphs and various summaries are currently created manually in
 Excel.  The raw information is invoicing records and the reporting
is
 basically summaries by customer, region, product line etc.
 
 With function such as aggregate(), hist() and pareto() (which
someone
 on this list kindly pointed me at) I can produce something roughly
 equivalent to the current reports.
 
 My question is, are there any neat R lock out features people
here
 like to use on this kind of info, particularly when the output is
very
 visual (report is intended for marketing people).
 
 Another way of looking at this is, What kind of hidden
information
 can I extract with R that the Excel solution hasn't touched?

Since you are looking for summaries within groups, you should look at
the
lattice package and some of the plots that you can produce with it
(maybe
for each product line you can produce a lattice/trellis graph with each
panel
representing a region and different colors symbols within panels to
represent
different customers).

If we had more of an idea of what you are looking for, we could give
better
suggestions.


 For example, even the pareto plot mentioned earlier is something
the
 Excel guys haven't thought of or can't easily produce.
 
 regards
 
 BTW the tool chain I am using goes something like:
  Production (run daily):
  DB - SQL/python - CSV - R/python - images -
network
  Presentation:
  network - CGI/python - browser

It looks like you want the reports fully automated and the final result
as HTML
(to be viewed with a browser), I suggest you look at the R2HTML package
and
the sweave function (this lets you write a report in HTML with r-code
in place of 
graphs and output, then a quick run through sweave and you have a final
report
in HTML ready to be viewed).

There are also several tools available for running R through CGI, go
to: 
http://www-r.project.org/ and click on R web-servers under the
Related Projects
heading in the left column to get details.

Hope this helps,


Greg Snow, Ph.D.
Statistical Data Center
[EMAIL PROTECTED]
(801) 408-8111

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] easing out of Excel

2005-01-20 Thread Shawn Way
Definitely check out the lattice package.

One other option is to use sweave/latex mixed with RODBC.  This can be
used to produce PDF's for easy distribution as well.  I would also
consider operating this in a batch mode, the R/sweave/latex works very
well this way.


Shawn Way, PE
Engineering Manager

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Greg Snow
Sent: Thursday, January 20, 2005 10:52 AM
To: r-help@stat.math.ethz.ch; [EMAIL PROTECTED]
Subject: Re: [R] easing out of Excel

 Paul Sorenson [EMAIL PROTECTED] 01/19/05 03:18PM 
 I know enough about R to be dangerous and our marketing people have
 asked me to automate some reporting.  Data comes from an SQL
source
 and graphs and various summaries are currently created manually in
 Excel.  The raw information is invoicing records and the reporting
is
 basically summaries by customer, region, product line etc.
 
 With function such as aggregate(), hist() and pareto() (which
someone
 on this list kindly pointed me at) I can produce something roughly
 equivalent to the current reports.
 
 My question is, are there any neat R lock out features people
here
 like to use on this kind of info, particularly when the output is
very
 visual (report is intended for marketing people).
 
 Another way of looking at this is, What kind of hidden
information
 can I extract with R that the Excel solution hasn't touched?

Since you are looking for summaries within groups, you should look at
the
lattice package and some of the plots that you can produce with it
(maybe
for each product line you can produce a lattice/trellis graph with each
panel
representing a region and different colors symbols within panels to
represent
different customers).

If we had more of an idea of what you are looking for, we could give
better
suggestions.


 For example, even the pareto plot mentioned earlier is something
the
 Excel guys haven't thought of or can't easily produce.
 
 regards
 
 BTW the tool chain I am using goes something like:
  Production (run daily):
  DB - SQL/python - CSV - R/python - images -
network
  Presentation:
  network - CGI/python - browser

It looks like you want the reports fully automated and the final result
as HTML
(to be viewed with a browser), I suggest you look at the R2HTML package
and
the sweave function (this lets you write a report in HTML with r-code
in place of 
graphs and output, then a quick run through sweave and you have a final
report
in HTML ready to be viewed).

There are also several tools available for running R through CGI, go
to: 
http://www-r.project.org/ and click on R web-servers under the
Related Projects
heading in the left column to get details.

Hope this helps,


Greg Snow, Ph.D.
Statistical Data Center
[EMAIL PROTECTED]
(801) 408-8111

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] easing out of Excel

2005-01-20 Thread Paul Sorenson
Thanks for the responses to this question, I fully realise it is a rather open 
question and the open pointers are the kind of thing I am looking for.

I will look into the lattice package and layout.

Regarding the HTML output, the current tool chain assets that I have have 
been refactored over time and are almost totally driven by config files so they 
suit my purposes very well.  I will look into other possibilities at a later 
date.

For those looking for a more rigorous specification of the problem, you are 
well justified in this.  I was deliberately fuzzy since managers just want 
stuff and I thought casting a wide net would pay off.  The problem is to 
summarise information which is nothing more than sales data.  The kinds of 
columns I am dealing with look like:

date, customer, invoice_no, product, amount, sales_region, etc etc.

Managers want to know things like:
- which products are doing well
- which regions are doing well
- who are good customers
- etc

To me these are simple aggregates and sorts, with visual presentations to match.

I figure a bit of effort, R can extract considerably more useful information 
from the data.

To be honest I am just evolving it as I go, using an existing spreadsheet as a 
basis.  I try something and if it is useful then great, if not, put it down to 
learning.

cheers

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] easing out of Excel

2005-01-20 Thread Mulholland, Tom
I hesitate to add this comment since it either completely confuses people or 
they take to it very quickly.

The data that you are using is mostly categorical. I expect that tables will 
have been used in the past and that to acertain extent the graphics are 
suppossed to help with getting a quick understanding of the data.

There is a package called vcd (Visualizing Categorical Data) which is useful 
for analysing this type of data. I like the use of the mosaicplot and in 
particular the shade parameter (which is based on standardized residuals). If 
set up properly it can be used to very quickly identify sales regions that are 
doing significantly better than they were last year, customers who have 
significantly reduced purchases. Basically if you can produce a table that 
would give this information then a shaded mosaicplot can efficiently highlight 
the  significant parts of the table.

They take a little bit of getting used to at first, but if you need to analyse 
this type of data they take a lot of the guess work out of making commentary on 
the data. How useful they are depends upon the users, who as I have said seem 
to be polarised in their reactions to the output.

Tom

 -Original Message-
 From: Paul Sorenson [mailto:[EMAIL PROTECTED]
 Sent: Friday, 21 January 2005 11:33 AM
 To: r-help@stat.math.ethz.ch
 Subject: RE: [R] easing out of Excel
 
 
 Thanks for the responses to this question, I fully realise it 
 is a rather open question and the open pointers are the 
 kind of thing I am looking for.
 
 I will look into the lattice package and layout.
 
 Regarding the HTML output, the current tool chain assets 
 that I have have been refactored over time and are almost 
 totally driven by config files so they suit my purposes very 
 well.  I will look into other possibilities at a later date.
 
 For those looking for a more rigorous specification of the 
 problem, you are well justified in this.  I was deliberately 
 fuzzy since managers just want stuff and I thought casting 
 a wide net would pay off.  The problem is to summarise 
 information which is nothing more than sales data.  The kinds 
 of columns I am dealing with look like:
 
 date, customer, invoice_no, product, amount, sales_region, etc etc.
 
 Managers want to know things like:
   - which products are doing well
   - which regions are doing well
   - who are good customers
   - etc
 
 To me these are simple aggregates and sorts, with visual 
 presentations to match.
 
 I figure a bit of effort, R can extract considerably more 
 useful information from the data.
 
 To be honest I am just evolving it as I go, using an existing 
 spreadsheet as a basis.  I try something and if it is useful 
 then great, if not, put it down to learning.
 
 cheers
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] easing out of Excel

2005-01-19 Thread Paul Sorenson
I know enough about R to be dangerous and our marketing people have asked me to 
automate some reporting.  Data comes from an SQL source and graphs and 
various summaries are currently created manually in Excel.  The raw information 
is invoicing records and the reporting is basically summaries by customer, 
region, product line etc.

With function such as aggregate(), hist() and pareto() (which someone on this 
list kindly pointed me at) I can produce something roughly equivalent to the 
current reports.

My question is, are there any neat R lock out features people here like to 
use on this kind of info, particularly when the output is very visual (report 
is intended for marketing people).

Another way of looking at this is, What kind of hidden information can I 
extract with R that the Excel solution hasn't touched?

For example, even the pareto plot mentioned earlier is something the Excel guys 
haven't thought of or can't easily produce.

regards

BTW the tool chain I am using goes something like:
Production (run daily):
DB - SQL/python - CSV - R/python - images - network
Presentation:
network - CGI/python - browser

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html