Re: [Rd] Rd2dvi (PR#9812)

2007-07-26 Thread Bill Dunlap
On Thu, 26 Jul 2007 [EMAIL PROTECTED] wrote:

> Is this a bug--
>
> ---
> <234>% R CMD Rd2dvi base.Rd
> Converting Rd files to LaTeX ...
> base.Rd
> Can't use an undefined value as filehandle reference at
> /opt/R-2.5.1/lib/R/share/perl/R/Rdconv.pm line 78.

This may be due to a change I suggested a while back which
required perl 5.6 (or so) to work.  The change was to ensure
that the file handle rdfile was closed when Rdconv was done
with it.  If this is the problem, upgrading perl to 5.8 will
make it go away.  Rdconv.pm should have a 'use v5.6' (or 5.8?)
line at the top if it wants to continue to use this syntax.

< open(rdfile, "<$Rdname") or die "Rdconv(): Couldn't open '$Rdfile': $!\n";
---
> open(my $rdfile, "<$Rdname") or die "Rdconv(): Couldn't open '$Rdfile': $!\n";
> # Before we added the 'my $' in front of rdfile,
> # rdfile was not getting closed.  Now it will close
> # when $rdfile goes out of scope.  (We could have added
> # a close rdfile at the end of the while(), but the
> # scoping method is more reliable.)
123c127
< while(<rdfile>){
---
> while(<$rdfile>){


Bill Dunlap
Insightful Corporation
bill at insightful dot com
360-428-8146

 "All statements in this message represent the opinions of the author and do
 not necessarily reflect Insightful Corporation policy or position."

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Rd2dvi (PR#9812)

2007-07-26 Thread larryh
Is this a bug--

---
<234>% R CMD Rd2dvi base.Rd
Converting Rd files to LaTeX ...
base.Rd
Can't use an undefined value as filehandle reference at 
/opt/R-2.5.1/lib/R/share/perl/R/Rdconv.pm line 78.
ENCS is 
Creating dvi output from LaTeX ...
Saving output to 'base.dvi' ...
cp: cannot access .Rd2dvi26632/Rd2.dvi
Done
xdvi-xaw.bin: Fatal error: base.dvi: No such file.
<235>% ls base.Rd
base.Rd
<236>% uname -a
SunOS strauss.udel.edu 5.9 Generic_112233-12 sun4u sparc SUNW,Sun-Fire
<237>% R --version
R version 2.5.1 (2007-06-27)
---

R was installed yesterday (7/25/2007). (I'm not the installer.)

Thanks,

Larry Hotchkiss



Larry Hotchkiss
University of Delaware
IT User Services -- Smith Hall
Newark, DE 19716 
302-831-1989  [EMAIL PROTECTED]



Re: [Rd] sequence(c(2, 0, 3)) produces surprising results, would like output length to be sum(input) (PR#9813)

2007-07-26 Thread bill
On Thu, 26 Jul 2007 [EMAIL PROTECTED] wrote:

> Full_Name: Bill Dunlap
> Version: 2.5.0
> OS: Linux
> Submission from: (NULL) (70.98.76.47)
>
> sequence(nvec) is documented to return
> the concatenation of seq(nvec[i]), for
> i in seq(along=nvec).  This produces inconvenient
> (for me) results for 0 inputs.
> > sequence(c(2,0,3)) # would like 1 2 1 2 3, ignore 0
> [1] 1 2 1 0 1 2 3
> Would changing sequence(nvec) to use seq_len(nvec[i])
> instead of the current 1:nvec[i] break much existing code?
>
> On the other hand, almost no one seems to use sequence()
> and it might make more sense to allow seq_len() and seq()
> to accept a vector for length.out and they would return a
> vector of length sum(length.out),
> c(seq_len(length.out[1]), seq_len(length.out[2]), ...)

seq_len() could be changed to do that with the following
code change.  It does slow down seq_len in the scalar case:

                              old time   new time
for(i in 1:1e6)seq_len(2)       1.251      1.516
for(i in 1:1e6)seq_len(20)      1.690      1.990
for(i in 1:1e6)seq_len(200)     5.480      5.860

It becomes much faster than sequence in the vectorized case.
> unix.time(for(i in 1:1e4)sequence(20:1))
   user  system elapsed
  1.550   0.000   1.557
> unix.time(for(i in 1:1e4)seq_len(20:1))
   user  system elapsed
  0.070   0.000   0.066
> identical(sequence(20:1), seq_len(20:1))
[1] TRUE
My problem cases are where the length.out vector is long
and contains small integers (e.g., the output of table
on a vector of mostly unique values).

Index: src/main/seq.c
===
--- src/main/seq.c  (revision 42329)
+++ src/main/seq.c  (working copy)
@@ -594,16 +594,31 @@

 SEXP attribute_hidden do_seq_len(SEXP call, SEXP op, SEXP args, SEXP rho)
 {
-SEXP ans;
-int i, len, *p;
+SEXP ans, slengths;
+int i, *p, anslen, *lens, nlens, ilen, nprotected=0 ;

 checkArity(op, args);
-len = asInteger(CAR(args));
-if(len == NA_INTEGER || len < 0)
-   errorcall(call, _("argument must be non-negative"));
-ans = allocVector(INTSXP, len);
+slengths = CAR(args);
+if (TYPEOF(slengths) != INTSXP) {
+   PROTECT(slengths = coerceVector(CAR(args), INTSXP));
+nprotected++;
+}
+lens = INTEGER(slengths);
+nlens = LENGTH(slengths);
+anslen = 0 ;
+for(ilen = 0; ilen < nlens; ilen++) {
+	if (lens[ilen] == NA_INTEGER || lens[ilen] < 0)
+	    errorcall(call, _("argument must be non-negative"));
+	anslen += lens[ilen];
+}
+ans = allocVector(INTSXP, anslen);
 p = INTEGER(ans);
-for(i = 0; i < len; i++) *p++ = i+1;
+for(ilen = 0; ilen < nlens; ilen++)
+	for(i = 0; i < lens[ilen]; i++)
+	    *p++ = i+1;
+if (nprotected > 0)
+UNPROTECT(nprotected);
 return ans;
 }
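
In pure R, the semantics the patch aims for can be modelled directly (a sketch under a hypothetical name, not the patch itself):

```r
## Pure-R model of a vectorized seq_len(): concatenate seq_len(lens[i])
## for each element, so zero lengths contribute nothing (unlike 1:0,
## which yields c(1, 0)).
vec_seq_len <- function(lens) {
  lens <- as.integer(lens)
  if (any(is.na(lens) | lens < 0))
    stop("argument must be non-negative")
  unlist(lapply(lens, seq_len), use.names = FALSE)
}

vec_seq_len(c(2, 0, 3))  # 1 2 1 2 3 -- the 0 is ignored
```

Being interpreted, this carries the lapply overhead the C version avoids, but it is convenient for cross-checking the C result.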




[Rd] sequence(c(2, 0, 3)) produces surprising results, would like output length to be sum(input) (PR#9811)

2007-07-26 Thread bill
Full_Name: Bill Dunlap
Version: 2.5.0
OS: Linux
Submission from: (NULL) (70.98.76.47)


sequence(nvec) is documented to return
the concatenation of seq(nvec[i]), for
i in seq(along=nvec).  This produces inconvenient
(for me) results for 0 inputs.
> sequence(c(2,0,3)) # would like 1 2 1 2 3, ignore 0
[1] 1 2 1 0 1 2 3
Would changing sequence(nvec) to use seq_len(nvec[i])
instead of the current 1:nvec[i] break much existing code?

On the other hand, almost no one seems to use sequence()
and it might make more sense to allow seq_len() and seq()
to accept a vector for length.out and they would return a
vector of length sum(length.out),
c(seq_len(length.out[1]), seq_len(length.out[2]), ...)
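
The smaller change mentioned above, having sequence() call seq_len(nvec[i]) instead of 1:nvec[i], can be sketched in one line (hypothetical name, so as not to mask the base function):

```r
## As reported, sequence() concatenates 1:nvec[i], so a 0 contributes
## 1:0 == c(1, 0).  seq_len(0) is empty, so the 0 drops out instead:
sequence2 <- function(nvec) unlist(lapply(nvec, seq_len))

sequence2(c(2, 0, 3))  # 1 2 1 2 3, not 1 2 1 0 1 2 3
```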



Re: [Rd] (PR#9810) Problem with careless user of RODBC (was SQL

2007-07-26 Thread ripley
Your error message was

>> d <- sqlFetch(channel, District)
> Error in odbcTableExists(channel, sqtable) :
>object "District" not found

and as you had not defined an object 'District' in that session, it seems 
perfectly plain.  If you want to refer to table "District" you have to 
give a character string (with quotes), not the name of an R object.

If all else fails, READ the documentation!  ?sqlFetch says

  sqtable: a database table name accessible from the connected dsn. This
   should be either a character string or a character vector of
   length 1.
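
The failure is ordinary R evaluation, nothing ODBC-specific: any function fails the same way when handed an unbound name instead of a string. A stand-in sketch (hypothetical function, not RODBC code):

```r
## Stand-in for a function that, like sqlFetch(channel, sqtable),
## expects a character string naming a table.
fetch_table <- function(sqtable) sqtable

fetch_table("District")   # fine: a character string
## fetch_table(District)  # Error: object 'District' not found,
##                        # because R tries to evaluate the name first
```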


I am glad you love R, but it _would_ be nice to get some credit for the 
package that you are using without apparently being aware that it is 
contributed work, instead of being inconvenienced clearing up after a 
non-bug report.


On Thu, 26 Jul 2007, [EMAIL PROTECTED] wrote:

> Full_Name: Jeff Lindon
> Version: 2.5.0
> OS: mingw32
> Submission from: (NULL) (63.147.8.67)
>
>
> R 2.5.0 seems to be unable to read valid tables from SQL Server 2005 with
> Service Pack 2 installed:
>
>> version
>   _
> platform   i386-pc-mingw32
> arch   i386
> os mingw32
> system i386, mingw32
> status
> major  2
> minor  5.0
> year   2007
> month  04
> day23
> svn rev41293
> language   R
> version.string R version 2.5.0 (2007-04-23)
>> library(RODBC)
>> channel <- odbcConnect("TLIAS01", uid="jeff.lindon")
>> channel
> RODB Connection 1
> Details:
>  case=nochange
>  DSN=TLIAS01
>  UID=jeff.lindon
>  Trusted_Connection=Yes
>  WSID=TLIJLINDON
>  DATABASE=tliresearch
>> d <- sqlFetch(channel, District)
> Error in odbcTableExists(channel, sqtable) :
>object "District" not found
>
> I have checked this problem with our CIO and he confirmed my Data Source
> configuration is correct (the connection test confirmed that R is able to
> connect to the database), and that the table really does exist and I have
> correct permissions (I work with it daily). Moreover, I was working between R
> and SQL Server 2005 with no problems before yesterday using the same exact set
> of instructions. The only change our CIO and I could think of is the recent
> installation of Service Pack 2.
>
> Unfortunately, reverting to Service Pack 1 is not currently an option, so I
> cannot be sure this is the problem. I am able to work around the issue by
> cutting and pasting the tables I need from SQL to Excel, then saving them as
> csv's. Saving directly from SQL to csv (highlighting the desired output and
> right-clicking) also causes problems for read.csv. I never tried that before
> Service Pack 2, though.
>
> I hope this information helps. I absolutely love R and thank you all so much 
> for
> your work on it!
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [Rd] SQL server service pack 2 prob? (PR#9810)

2007-07-26 Thread Simon Urbanek

On Jul 26, 2007, at 9:39 AM, [EMAIL PROTECTED] wrote:

> Full_Name: Jeff Lindon
> Version: 2.5.0
> OS: mingw32
> Submission from: (NULL) (63.147.8.67)
>
>
> R 2.5.0 seems to be unable to read valid tables from SQL Server  
> 2005 with
> Service Pack 2 installed:
>
>> version
>_
> platform   i386-pc-mingw32
> arch   i386
> os mingw32
> system i386, mingw32
> status
> major  2
> minor  5.0
> year   2007
> month  04
> day23
> svn rev41293
> language   R
> version.string R version 2.5.0 (2007-04-23)
>> library(RODBC)
>> channel <- odbcConnect("TLIAS01", uid="jeff.lindon")
>> channel
> RODB Connection 1
> Details:
>   case=nochange
>   DSN=TLIAS01
>   UID=jeff.lindon
>   Trusted_Connection=Yes
>   WSID=TLIJLINDON
>   DATABASE=tliresearch
>> d <- sqlFetch(channel, District)
> Error in odbcTableExists(channel, sqtable) :
> object "District" not found
>

Didn't you mean

d <- sqlFetch(channel, "District")

instead?

Cheers,
Simon

PS: This is not an R bug, so please don't abuse the R bug tracking  
system for this. If anything, the question should go to R-help or the  
RODBC maintainer (please read the posting guide).


> I have checked this problem with our CIO and he confirmed my Data  
> Source
> configuration is correct (the connection test confirmed that R is  
> able to
> connect to the database), and that the table really does exist and  
> I have
> correct permissions (I work with it daily). Moreover, I was working  
> between R
> and SQL Server 2005 with no problems before yesterday using the  
> same exact set
> of instructions. The only change our CIO and I could think of is  
> the recent
> installation of Service Pack 2.
>
> Unfortunately, reverting to Service Pack 1 is not currently an  
> option, so I
> cannot be sure this is the problem. I am able to work around the  
> issue by
> cutting and pasting the tables I need from SQL to Excel, then  
> saving them as
> csv's. Saving directly from SQL to csv (highlighting the desired  
> output and
> right-clicking) also causes problems for read.csv. I never tried  
> that before
> Service Pack 2, though.
>
> I hope this information helps. I absolutely love R and thank you  
> all so much for
> your work on it!
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>



[Rd] SQL server service pack 2 prob? (PR#9810)

2007-07-26 Thread jeff . lindon
Full_Name: Jeff Lindon
Version: 2.5.0
OS: mingw32
Submission from: (NULL) (63.147.8.67)


R 2.5.0 seems to be unable to read valid tables from SQL Server 2005 with
Service Pack 2 installed:

> version
   _   
platform   i386-pc-mingw32 
arch   i386
os mingw32 
system i386, mingw32   
status 
major  2   
minor  5.0 
year   2007
month  04  
day23  
svn rev41293   
language   R   
version.string R version 2.5.0 (2007-04-23)
> library(RODBC)
> channel <- odbcConnect("TLIAS01", uid="jeff.lindon")
> channel
RODB Connection 1
Details:
  case=nochange
  DSN=TLIAS01
  UID=jeff.lindon
  Trusted_Connection=Yes
  WSID=TLIJLINDON
  DATABASE=tliresearch
> d <- sqlFetch(channel, District)
Error in odbcTableExists(channel, sqtable) : 
object "District" not found

I have checked this problem with our CIO and he confirmed my Data Source
configuration is correct (the connection test confirmed that R is able to
connect to the database), and that the table really does exist and I have
correct permissions (I work with it daily). Moreover, I was working between R
and SQL Server 2005 with no problems before yesterday using the same exact set
of instructions. The only change our CIO and I could think of is the recent
installation of Service Pack 2. 

Unfortunately, reverting to Service Pack 1 is not currently an option, so I
cannot be sure this is the problem. I am able to work around the issue by
cutting and pasting the tables I need from SQL to Excel, then saving them as
csv's. Saving directly from SQL to csv (highlighting the desired output and
right-clicking) also causes problems for read.csv. I never tried that before
Service Pack 2, though.

I hope this information helps. I absolutely love R and thank you all so much for
your work on it!



Re: [Rd] loess prediction algorithm

2007-07-26 Thread Greg Snow
The loess.demo function in the TeachingDemos package may help you to
understand better what is happening (both running the demo and looking
at the code).  One common reason why predictions from the loess function
and hand-computed predictions don't match is that the loess function
does an additional smoothing step by default.  The demo shows both
curves (with and without the additional smoothing), so you can see how
close they are and how the smoothness differs.

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
> Sent: Wednesday, July 25, 2007 4:57 PM
> To: r-devel@r-project.org
> Subject: [Rd] loess prediction algorithm
> 
> 
> Hello,
> 
> I need help with the details of loess prediction algorithm.  
> I would like to get it implemented as a part of a measurement 
> system programmed in LabView.  My job is to provide a detailed 
> description of the algorithm.  This is a simple 
> one-dimensional problem - smoothing an (x, y) data set.
> 
> I found quite a detailed description of the fitting procedure 
> in the "white book".  It is also described in great detail at 
> the NIST site in the Engineering Statistics Handbook.  It 
> provides an example of Loess computations.  I managed to 
> reproduce their example exactly in R.  At each data point I 
> compute a weighted local linear fit using the number of 
> points based on span.  Then I predict the values from these 
> local fits.  This matches R "loess" predictions exactly.
> 
> The problem starts when I try to predict at x values not in 
> the data set.
> The "white book" does not talk about predictions at all.  In 
> the NIST handbook in the "Final note on Loess Computations" 
> they mention this type of predictions but just say that the 
> same steps are used for predictions as for fitting.
> 
> When I try to use "the same steps" I get predictions that are 
> quite different from the predictions obtained by fitting an R 
> loess model to a data set and running predict(<the loess fit>, 
> newdata=<the new x values>).  They match quite well at the 
> lowest and highest ends of the x grid but in the middle are 
> different and much less smooth.  I can provide details but 
> basically what I do to create the predictions at x0 is this:
> 1.  I append c(x0, NA) to the data frame of (x, y) data.
> 2.  I calculate abs(xi-x0), i.e., the absolute deviations between the 
> x values in the data set and a given x0 value.
> 3.  I sort the data set according to these deviations.  This 
> way the first row has the (x0, NA) value.
> 4.  I drop the first row.
> 5.  I divide all the deviations by the m-th one, where m is 
> the number of points used in local fitting -  m = 
> floor(n*span) where n is the number of points.
> 6.  I calculate the "tricube" weights and assign 0's to the 
> negative ones.
> This eliminates all the points except the m points of interest.
> 7.  I fit a linear weighted regression using lm.
> 8.  I predict y(x0) from this linear model.
> This is basically the same procedure I use to predict at the 
> x values from the data set, except for point 4.
> 
> I got the R sources for loess but it looks to me like most of 
> the work is done in a bunch of Fortran modules.  They are 
> very difficult to read and understand, especially since they 
> handle multiple x values.  A couple of things that worry me 
> are parameters in loess.control such as surface and cell.  
> They seem to have something to do with predictions but I do 
> not account for them in my simple procedure above.
> 
> Could anyone shed a light on this problem?  Any comment will 
> be appreciated.
> 
> I apologize in advance if this should have been posted in 
> r-help.  I figured that I have a better chance asking people 
> who read the r-devel group, since they are likely to know 
> more details about inner workings of R.
> 
> Thanks in advance,
> 
> Andy
> 
> __
> Andy Jaworski
> 518-1-01
> Process Laboratory
> 3M Corporate Research Laboratory
> -
> E-mail: [EMAIL PROTECTED]
> Tel:  (651) 733-6092
> Fax:  (651) 736-3122
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
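
For reference, steps 5-8 of the procedure quoted above can be sketched in R for a single prediction point (a simplified degree-1 local fit with hypothetical names; it will not reproduce predict.loess exactly, since loess's interpolation surface, its cell parameter, and its degree = 2 default are not modelled):

```r
## Simplified local linear fit at one point x0, per the quoted steps:
## tricube weights over the m nearest points, then a weighted lm().
local_fit <- function(x, y, x0, span = 0.75) {
  m <- floor(length(x) * span)            # points in the local window
  d <- abs(x - x0)                        # step 2: |xi - x0|
  w <- (1 - pmin(d / sort(d)[m], 1)^3)^3  # steps 5-6: tricube, 0 outside window
  fit <- lm(y ~ x, weights = w)           # step 7: weighted linear fit
  unname(predict(fit, newdata = data.frame(x = x0)))  # step 8: predict at x0
}
```

On exactly linear data any weighted linear fit recovers the line, which gives a quick sanity check; on curved data this will differ from loess() for the reasons discussed in this thread.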



Re: [Rd] [R] aggregate.ts

2007-07-26 Thread Jeffrey J. Hallman
aggregate.tis() in the fame package does what I think is the right thing: 

> x2 <- tis(1:24, start = c(2000, 11), freq = 12)
> y2 <- aggregate(x2, nfreq = 4)
> x2
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2000                                           1   2
2001   3   4   5   6   7   8   9  10  11  12  13  14
2002  15  16  17  18  19  20  21  22  23  24
class: tis
> y2
 Qtr1 Qtr2 Qtr3 Qtr4
2001   12   21   30   39
2002   48   57   66 
class: tis

If you really want y2 to have an observation for 2000Q4, you can use

> convert(x2, tif = "quarterly", observed = "summed", ignore = T)
      Qtr1  Qtr2  Qtr3  Qtr4
2000                    4.03
2001 12.00 21.00 30.00 39.00
2002 48.00 57.00 66.00 71.225806
class: tis
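
For comparison, the base-R behaviour that started the thread needs no extra packages: aggregate.ts() simply sums blocks of three observations from the first one, ignoring calendar quarters.

```r
## A plain ts starting in Nov 2000; aggregate() sums blocks of three
## observations (Nov+Dec+Jan, Feb+Mar+Apr, ...), not calendar quarters.
x2 <- ts(1:24, start = c(2000, 11), frequency = 12)
as.vector(aggregate(x2, nfrequency = 4))  # 6 15 24 33 42 51 60 69
```

This is the grouping Paul Gilbert objects to below.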



Paul Gilbert <[EMAIL PROTECTED]> writes:

> I've been caught by this before, and complained before.  It does not do 
> what most people that work with economic time series would expect.  (One 
> might argue that not all time series are economic, but other time series 
> don't usually fit with ts very well.)  At the very least aggregate 
> should issue a warning.  Quarterly observations are for quarters of the 
> year, so just arbitrarily grouping in 3 beginning with the first 
> observation is *extremely* misleading, even if it is documented.
> 
> [ BTW, there is a bug in the print method here (R-2.5.1 on Linux) :
>  >   y2 <- aggregate(x2, nfreq = 4) 
>  >
>  > y2
> Error in rep.int("", start.pad) : invalid number of copies in rep.int()
>  > traceback()
> 5: rep.int("", start.pad)
> 4: as.vector(data)
> 3: matrix(c(rep.int("", start.pad), format(x, ...), rep.int("",
>end.pad)), nc = fr.x, byrow = TRUE, dimnames = list(dn1,
>dn2))
> 2: print.ts(c(6L, 15L, 24L, 33L, 42L, 51L, 60L, 69L))
> 1: print(c(6L, 15L, 24L, 33L, 42L, 51L, 60L, 69L))
> ]
> 
> 
> 
> >Currently, the "zoo" implementation allows this: Coercing back and forth
> >gives:
> >  library("zoo")
> >  z1 <- as.ts(aggregate(as.zoo(x1), as.yearqtr, sum))
> >  z2 <- as.ts(aggregate(as.zoo(x2), as.yearqtr, sum))
> >  
> >
> This is better, but still potentially misleading. I would prefer a 
> default  NA when only some of the observations are available for a 
> quarter  (and the syntax is a bit cumbersome for something one needs to 
> do fairly often).
> 
> Paul
> 
> >where z1 is identical to y1, and z2 is what you probably want.
> >
> >hth,
> >Z
> >
> >__
> >[EMAIL PROTECTED] mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
> >  
> >
> 
> 
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Jeff



[Rd] loess prediction algorithm

2007-07-26 Thread Benjamin Tyner
Andy,

If you could provide an example of the R code with which you call
loess(), I can post R code which will duplicate what predict.loess
does without having to call the C/Fortran. There are a lot of
implementation details that are easy to overlook, but without knowing
the arguments to your call it is difficult to guess the source of your
problem.

Ben

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] loess prediction algorithm

2007-07-26 Thread Prof Brian Ripley
The R interface is just a wrapper for those Netlib C/Fortran functions.
I don't think anyone is going to be able (or willing) to read and explain 
those for you.

You do need to understand the loess.control parameters, and I believe they 
are explained in the White Book.  But perhaps you should use the simplest 
options in R as a baseline.

I don't believe your sketchy description of tricube weights is correct: 
the White Book has the details.

The default degree is 2, not linear fits.

On Wed, 25 Jul 2007, [EMAIL PROTECTED] wrote:

>
> Hello,
>
> I need help with the details of loess prediction algorithm.  I would like
> to get it implemented as a part of a measurement system programmed in
> LabView.  My job is to provide a detailed description of the algorithm.  This
> is a simple one-dimensional problem - smoothing an (x, y) data set.
>
> I found quite a detailed description of the fitting procedure in the "white
> book".  It is also described in great detail at the NIST site in the
> Engineering Statistics Handbook.  It provides an example of Loess
> computations.  I managed to reproduce their example exactly in R.  At each
> data point I compute a weighted local linear fit using the number of points
> based on span.  Then I predict the values from these local fits.  This
> matches R "loess" predictions exactly.
>
> The problem starts when I try to predict at x values not in the data set.
> The "white book" does not talk about predictions at all.  In the NIST
> handbook in the "Final note on Loess Computations" they mention this type
> of predictions but just say that the same steps are used for predictions as
> for fitting.
>
> When I try to use "the same steps" I get predictions that are quite
> different from the predictions obtained by fitting an R loess model to a data
> set and running predict(<the loess fit>, newdata=<the new x values>).  They
> match quite well at the lowest and highest ends of the x grid but in the
> middle are different and much less smooth.  I can provide details but
> basically what I do to create the predictions at x0 is this:
> 1.  I append c(x0, NA) to the data frame of (x, y) data.
> 2.  I calculate abs(xi-x0), i.e., the absolute deviations between the x values in
> the data set and a given x0 value.
> 3.  I sort the data set according to these deviations.  This way the first
> row has the (x0, NA) value.
> 4.  I drop the first row.
> 5.  I divide all the deviations by the m-th one, where m is the number of
> points used in local fitting -  m = floor(n*span) where n is the number of
> points.
> 6.  I calculate the "tricube" weights and assign 0's to the negative ones.
> This eliminates all the points except the m points of interest.
> 7.  I fit a linear weighted regression using lm.
> 8.  I predict y(x0) from this linear model.
> This is basically the same procedure I use to predict at the x values from
> the data set, except for point 4.
>
> I got the R sources for loess but it looks to me like most of the work is
> done in a bunch of Fortran modules.  They are very difficult to read and
> understand, especially since they handle multiple x values.  A couple of
> things that worry me are parameters in loess.control such as surface and
> cell.  They seem to have something to do with predictions but I do not
> account for them in my simple procedure above.
>
> Could anyone shed a light on this problem?  Any comment will be
> appreciated.
>
> I apologize in advance if this should have been posted in r-help.  I
> figured that I have a better chance asking people who read the r-devel
> group, since they are likely to know more details about inner workings of
> R.
>
> Thanks in advance,
>
> Andy
>
> __
> Andy Jaworski
> 518-1-01
> Process Laboratory
> 3M Corporate Research Laboratory
> -
> E-mail: [EMAIL PROTECTED]
> Tel:  (651) 733-6092
> Fax:  (651) 736-3122
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
