Re: [R] Singular design matrix in rq

2013-04-19 Thread Roger Koenker
Jonathan,

This is not what we call a reproducible example... what is raw_data?  Does it 
have something to do with mydata?
what is i? 

Roger

url:    www.econ.uiuc.edu/~roger        Roger Koenker
email   rkoen...@uiuc.edu               Department of Economics
vox:    217-333-4558                    University of Illinois
fax:    217-244-6678                    Urbana, IL 61801

On Apr 16, 2013, at 2:58 PM, Greenberg, Jonathan wrote:

 Quantreggers: 
 
 I'm trying to run rq() on a dataset I posted at:
 https://docs.google.com/file/d/0B8Kij67bij_ASUpfcmJ4LTFEUUk/edit?usp=sharing
 (it's a 1500kb csv file named singular.csv) and am getting the following 
 error:
 
 mydata <- read.csv("singular.csv")
 fit_spl <- rq(raw_data[,1] ~ bs(raw_data[,i],df=15),tau=1)
  Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix
 
 Any ideas what might be causing this or, more importantly, suggestions for 
 how to solve this?  I'm just trying to fit a smoothed hull to the top of the 
 data cloud (hence the large df).
 
 Thanks!
 
 --jonathan
 
 
 -- 
 Jonathan A. Greenberg, PhD
 Assistant Professor
 Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
 Department of Geography and Geographic Information Science
 University of Illinois at Urbana-Champaign
 607 South Mathews Avenue, MC 150
 Urbana, IL 61801
 Phone: 217-300-1924
 http://www.geog.illinois.edu/~jgrn/
 AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Singular design matrix in rq

2013-04-19 Thread Jonathan Greenberg
Roger:

Doh!  Just realized I had that error in the code -- raw_data is the same as
mydata, so it should be:

mydata <- read.csv("singular.csv")
plot(mydata$predictor,mydata$response)
# A big cloud of points, nothing too weird
summary(mydata)
# No NAs:

#        X            response         predictor
#  Min.   :    1   Min.   :    0.0   Min.   : 0.000
#  1st Qu.:12726   1st Qu.:  851.2   1st Qu.: 0.000
#  Median :25452   Median : 2737.0   Median : 0.000
#  Mean   :25452   Mean   : 3478.0   Mean   : 5.532
#  3rd Qu.:38178   3rd Qu.: 5111.6   3rd Qu.: 5.652
#  Max.   :50903   Max.   :26677.8   Max.   :69.342

fit_spl <- rq(response ~ bs(predictor,df=15),tau=1,data=mydata)
# Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix

--j





Re: [R] Singular design matrix in rq

2013-04-19 Thread William Dunlap
I believe that those repeated values (more than half your x values are 0.0)
are causing bs() problems, because its default knots are at quantiles of the
data at equally spaced probabilities.  The following may be the same problem:

> set.seed(1)
> x <- c(rep(0, 20), 1:15)
> y <- sort(rnorm(length(x)))
> rq(y~bs(x, df=15), tau=.5)
Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix
> # lm deals with a singular design matrix by dropping columns from the model
> lm(y~bs(x, df=15))

Call:
lm(formula = y ~ bs(x, df = 15))

Coefficients:
     (Intercept)   bs(x, df = 15)1   bs(x, df = 15)2   bs(x, df = 15)3  
         1.59024                NA                NA                NA  
 bs(x, df = 15)4   bs(x, df = 15)5   bs(x, df = 15)6   bs(x, df = 15)7  
              NA                NA                NA          -2.09983  
 bs(x, df = 15)8   bs(x, df = 15)9  bs(x, df = 15)10  bs(x, df = 15)11  
        -1.06874          -1.20798          -0.99340          -0.87365  
bs(x, df = 15)12  bs(x, df = 15)13  bs(x, df = 15)14  bs(x, df = 15)15  
        -0.71927          -0.50564           0.06184                NA  

> svd(cbind(1, bs(x, df=15)))$d # design matrix is not full rank
 [1] 7.029298e+00 2.773759e+00 1.286165e+00 1.160239e+00 9.992134e-01 8.102012e-01
 [7] 6.334326e-01 4.098332e-01 3.185013e-01 4.476983e-16 1.643202e-16 8.614772e-17
[13] 7.597613e-17 5.575475e-17 1.760443e-17 1.727013e-18

Try using equally spaced knots or removing repeated quantiles when you call
bs().
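
A minimal sketch of both suggestions, using the toy x from the transcript above. The rank_of() helper and the choice of five equally spaced interior knots are illustrative assumptions, not something from this thread. Only splines::bs() and base R are used; the rq() call is left as a comment because it needs the quantreg package.

```r
library(splines)

# Toy predictor from the example above: more than half the values are 0.
x <- c(rep(0, 20), 1:15)

# Hypothetical helper: numerical rank of the design matrix (intercept + basis).
rank_of <- function(basis) qr(cbind(1, basis))$rank

# Option 1: start from the quantile probabilities that 12 interior knots
# imply, then drop repeated values and any knot that coincides with a
# boundary (the minimum or maximum of x).
probs <- seq(0, 1, length.out = 14)[-c(1, 14)]
q_knots <- unique(quantile(x, probs, names = FALSE))
q_knots <- q_knots[q_knots > min(x) & q_knots < max(x)]
basis_q <- bs(x, knots = q_knots)

# Option 2: equally spaced interior knots over the range of x.
# Five knots is an arbitrary choice for illustration.
e_knots <- seq(min(x), max(x), length.out = 7)[-c(1, 7)]
basis_e <- bs(x, knots = e_knots)

# Compare the number of design columns with the numerical rank.
c(columns = ncol(basis_q) + 1, rank = rank_of(basis_q))
c(columns = ncol(basis_e) + 1, rank = rank_of(basis_e))

# With quantreg installed, the fit would then look something like:
# rq(y ~ basis_q, tau = 0.5)
```

Dropping repeated knots leaves fewer basis columns than df = 15 would give, so the fitted curve is less flexible; that is the price of a full-rank design for data like these.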

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com



Re: [R] Singular design matrix in rq

2013-04-18 Thread William Dunlap
Do you know that there are NaN's in the output of bs(raw_data[,i],df=15)?
any(is.nan(bs(raw_data[,i],df=15))) would tell you.  Do you know that there
are fewer than c. 18 distinct values in raw_data[,i]?
length(unique(raw_data[,i])) would tell you.  If there are not very many
distinct values then use fewer degrees of freedom in bs().
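
The two checks above can be bundled into one small function. This is only a sketch: check_bs_design() is a hypothetical helper, not part of splines or quantreg, and the toy predictor with four distinct values stands in for raw_data[,i]. Whether bs() returns NaN's or only warns about the repeated knots may depend on the R version, so the helper reports both the NaN check and the numerical rank.

```r
library(splines)

# Hypothetical helper: summarise why a bs() basis for predictor x might
# lead to a singular design matrix.
check_bs_design <- function(x, df = 15) {
  basis <- suppressWarnings(bs(x, df = df))
  design <- cbind(1, basis)            # intercept plus spline columns
  has_nan <- any(is.nan(basis))
  list(
    n_unique  = length(unique(x)),     # distinct predictor values
    n_columns = ncol(design),          # columns the fit must estimate
    has_nan   = has_nan,
    rank      = if (has_nan) NA_integer_ else qr(design)$rank
  )
}

# Toy predictor with only four distinct values.
res <- check_bs_design(rep(1:4, each = 25))
str(res)
```

If n_unique is smaller than n_columns, or the rank is below n_columns, a smaller df is worth trying.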

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

From: jgrn...@gmail.com [mailto:jgrn...@gmail.com] On Behalf Of Jonathan Greenberg
Sent: Thursday, April 18, 2013 6:50 AM
To: William Dunlap
Subject: Re: [R] Singular design matrix in rq

William:

Thanks!  Given that I'm just trying to drape a sheet on top of the data, can
you recommend a better smoother to use?

--j



Re: [R] Singular design matrix in rq

2013-04-16 Thread William Dunlap
Have you looked at the result of
  bs(raw_data[,i], df=15)
?  If there are not many unique values in the input there
will be a lot of NaN's in the output (because there are
repeated knots) and those NaN's will cause rq() to give
that message.

E.g.,
> d <- data.frame(y=sin(1:100), x4=rep(1:4,each=25), x50=rep(1:50,each=2))
> rq(data=d, y ~ bs(x4, df=15), tau=.8) # using x50 works
Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix
> with(d, bs(x4, df=15))
       1 2 3 4 5 6 7 8 9 10 11  12  13  14  15
  [1,] 0 0 1 0 0 0 0 0 0  0  0   0   0   0   0
  [2,] 0 0 1 0 0 0 0 0 0  0  0   0   0   0   0
  [3,] 0 0 1 0 0 0 0 0 0  0  0   0   0   0   0
  ...
 [98,] 0 0 0 0 0 0 0 0 0  0  0 NaN NaN NaN NaN
 [99,] 0 0 0 0 0 0 0 0 0  0  0 NaN NaN NaN NaN
[100,] 0 0 0 0 0 0 0 0 0  0  0 NaN NaN NaN NaN
attr(,"degree")
[1] 3
attr(,"knots")
7.692308% 15.38462% 23.07692% 30.76923% 38.46154% 
        1         1         1         2         2 
46.15385% 53.84615% 61.53846% 69.23077% 76.92308% 
        2         3         3         3         4 
84.61538% 92.30769% 
        4         4 
attr(,"Boundary.knots")
[1] 1 4
attr(,"intercept")
[1] FALSE
attr(,"class")
[1] "bs"     "basis"  "matrix"
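
The repeated knots in the output above can be reproduced by hand. The sketch below assumes that bs() without an intercept places df - degree interior knots at equally spaced quantile probabilities of x, which is consistent with the 12 knot labels printed above (multiples of 1/13). The variable names are illustrative.

```r
# Predictor with only four distinct values, as in d$x4 above.
x4 <- rep(1:4, each = 25)

df <- 15
degree <- 3
n_interior <- df - degree                        # 12 interior knots
probs <- seq_len(n_interior) / (n_interior + 1)  # 1/13, 2/13, ..., 12/13

# Knot positions: quantiles of x4 at those probabilities.
knots <- quantile(x4, probs)
knots

# Only four distinct values are available, so most knots coincide,
# and some fall on the boundary knots (1 and 4).
table(knots)
length(unique(knots))
```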

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

