[R] For Hadley Wickham: Need for a small fix in haven::read_spss

2015-07-20 Thread Dimitri Liakhovitski
Hadley,

you've added the function labelled to haven, which is great. However, when
a variable in SPSS has no long label, your code returns NULL for it
rather than an NA. NULL is correct, but NA would probably be better.

For example, I've read in an SPSS file:

library(haven)
spss1 <- read_spss("SPSS_Example.sav")

varnames <- names(spss1)
mylabels <- unlist(lapply(spss1, attr, "label"))

length(varnames)
[1] 64

length(mylabels)
[1] 62


The lengths differ because this particular dataset has 2 variables without
either variable labels or data labels.
When I run lapply(spss1, attr, "label") I see NULL under those 2 variables,
which is true and valid.
However, would it be possible to have an NA instead of NULL? This way
the lengths of varnames and mylabels would be the same and one could put
them side by side (e.g., in one data frame).


Thanks a lot!

-- 
Dimitri Liakhovitski

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] For Hadley Wickham: Need for a small fix in haven::read_spss

2015-07-20 Thread Hadley Wickham
(FWIW this would've been better sent to me directly or filed on
github, rather than sent to R-help)

I think this is more of a problem with the way that you're accessing
the info, than the design of the underlying structure. I'd do
something like this:

attr_default <- function(x, which, default) {
  val <- attr(x, which)
  if (is.null(val)) default else val
}

sapply(spss1, attr_default, "label", NA_character_)

(code untested, but you get the idea)
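
A minimal sketch of the same idea that runs without an SPSS file. The small
data frame below is a made-up stand-in for what read_spss() returns, and the
column names and label texts are hypothetical. It shows unlist() dropping the
NULL entry, then the default-filling accessor keeping one label per column.
vapply() and exact = TRUE are choices made for this sketch and are not part
of the suggestion above.

```r
# Hypothetical stand-in for the result of read_spss(): three columns,
# of which only two carry a "label" attribute.
df <- data.frame(age = c(23, 35), sex = c(1, 2), id = c(101, 102))
attr(df$age, "label") <- "Age in years"
attr(df$sex, "label") <- "Sex of respondent"

# attr() returns NULL for the unlabelled column, and unlist() drops
# NULL elements, so the result is shorter than the number of columns.
dropped <- unlist(lapply(df, attr, "label"))

# Return a default instead of NULL when the attribute is missing.
# exact = TRUE (an addition in this sketch) disables partial matching
# of attribute names.
attr_default <- function(x, which, default) {
  val <- attr(x, which, exact = TRUE)
  if (is.null(val)) default else val
}

# vapply() enforces exactly one character value per column.
labels <- vapply(df, attr_default, character(1),
                 which = "label", default = NA_character_)

# Names and labels now have the same length and fit in one data frame.
lookup <- data.frame(variable = names(df), label = labels,
                     row.names = NULL, stringsAsFactors = FALSE)
print(lookup)
```

With one entry per column, the names and labels line up and can sit side by
side, which is what the original question asked for.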

Hadley


-- 
http://had.co.nz/



Re: [R] For Hadley Wickham: Need for a small fix in haven::read_spss

2015-07-20 Thread Dimitri Liakhovitski
Thank you, Hadley. Yes, you are right - next time I'll email you directly.

-- 
Dimitri Liakhovitski
