Re: [R] Factor levels in training set

2016-06-15 Thread PIKAL Petr
Hi Elahe

I get a slightly different error when applying scale to non-numeric data, so I
am not sure whether you are using the scale function from the base package.

> scale(raman[1:20,])
Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric

Anyway, how do you expect the scaling to be done when you have a non-numeric
variable? What should be the output of

scale(iris$Species)

The only workaround is either to scale only the numeric variables in your data
and add the non-numeric ones back in a following step, or to convert all factor
variables to numeric before scaling (which I would not recommend).
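
A minimal sketch of the first workaround (scale only the numeric columns and leave the factors alone). It uses the built-in iris data as a stand-in for the poster's df, which is an assumption, and the names idx, train, num and train_scaled are made up for illustration.

```r
# iris stands in for the poster's data frame (an assumption)
df <- iris

set.seed(1)                                    # reproducible split
idx <- sort(sample(nrow(df), round(nrow(df) * 0.7)))
train <- df[idx, ]                             # 70% of the rows for training

# Flag the numeric columns, then scale only those
num <- vapply(train, is.numeric, logical(1))
train_scaled <- train
train_scaled[num] <- as.data.frame(scale(train[num]))

str(train_scaled)                              # Species is still a factor
```

If a pure numeric matrix is needed afterwards, as.matrix(train_scaled[num]) gives one, with the factor kept separately.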

If your data are supposed to be numeric, you can check whether they really are with

str(df)

Cheers
Petr

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of ch.elahe
> via R-help
> Sent: Tuesday, June 14, 2016 5:29 PM
> To: R-help Mailing List <r-help@r-project.org>
> Subject: [R] Factor levels in training set
>
>
>  Hi all,
> I want to use Supervised Self organizing Maps from Kohonen package for my
> data. I need to divide my df into training set and test set, but a part of
> my df contains column with factor levels and I don't know how to bring them
> into my training set. Currently I use the following command for my training
> set:
>
> dt=sort(sample(nrow(df),nrow(df)*.7))
> training=m[dt,]
> till here I get no error but in the next step which I need to bring my
> training set in a matrix I face this error:
>
> scale(df[training,])
> error: 'x' should be numeric
> Does anyone know how should I include column with factor levels in my df so
> that I don't get this error?
> Thanks for any help,
> Elahe
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.



[R] Factor levels in training set

2016-06-14 Thread ch.elahe via R-help
Hi all,
I want to use Supervised Self organizing Maps from the Kohonen package for my
data. I need to divide my df into a training set and a test set, but part of
my df consists of columns with factor levels, and I don't know how to bring
them into my training set. Currently I use the following commands for my
training set:

dt=sort(sample(nrow(df),nrow(df)*.7))
training=m[dt,]

Up to here I get no error, but in the next step, where I need to turn my
training set into a matrix, I get this error:

scale(df[training,])
error: 'x' should be numeric

Does anyone know how I should include the columns with factor levels in my df
so that I don't get this error?
Thanks for any help,
Elahe




Re: [R] factor levels numeric values

2015-02-11 Thread JS Huang
Hi,

  Suppose your data frame is called data and the factor column is named
tobeConverted.  I have tried this and it worked.  Hope this helps.

 as.numeric(as.character(data$tobeConverted))





[R] factor levels numeric values

2014-11-12 Thread David Studer
Hi everybody,

I have another question (to which I could not find an answer in my R books).
I am sure it's not a big issue, but I simply lack a good idea of how to
solve this:

One of my variables gets imported as a factor instead of a numeric variable.
Now I have a...
 Factor w/ 63 levels "0","0.02","0.03",..: 1 NA NA 1 NA NA 1 1 53 10 ...

How can I transform these factor levels into actual values?

Thank you very much for any help!
David



Re: [R] factor levels numeric values

2014-11-12 Thread Gerrit Eichner

Hello, David,

take a look at the beginning of the Warning section of ?factor.

 Hth  --  Gerrit




Re: [R] factor levels numeric values

2014-11-12 Thread David L Carlson
Also look at the Frequently Asked Questions document that comes with your R 
installation:

7.10 How do I convert factors to numeric?

It may happen that when reading numeric data into R (usually, when reading in a 
file), they come in as factors. If f is such a factor object, you can use

as.numeric(as.character(f))

to get the numbers back. More efficient, but harder to remember, is

as.numeric(levels(f))[as.integer(f)]

In any case, do not call as.numeric() or their likes directly for the task at 
hand (as as.numeric() or unclass() give the internal codes).
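
A small illustration of the FAQ entry, as a sketch only. The factor f below is made up to resemble a numeric column that was read in as a factor; the names codes, vals_a and vals_b are hypothetical.

```r
# A made-up factor resembling numeric data that was imported as a factor
f <- factor(c("0", "0.02", "10", "0.03", NA, "10"))

codes  <- as.numeric(f)                          # internal integer codes, not the data
vals_a <- as.numeric(as.character(f))            # FAQ 7.10, first form
vals_b <- as.numeric(levels(f))[as.integer(f)]   # FAQ 7.10, second form

print(codes)
print(vals_a)
stopifnot(identical(vals_a, vals_b))             # both forms agree
```

Printing codes next to vals_a shows why calling as.numeric() directly on the factor is the wrong tool here.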

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352



Re: [R] factor levels numeric values

2014-11-12 Thread Ivan Calandra
I have not completely followed the discussion, so excuse me if it was 
already pointed out.
If numeric data are read as factors, this means that there are not only 
numeric data in the column. It could be an empty space somewhere, or 
some character that should be NA, or...
I think it is worth spending some time searching for the typo so that 
the file will be read correctly in R.
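
One possible way to hunt for such entries, sketched under the assumption that the column is already in R as a factor. The data and the names raw, lev and bad_levels are invented for the example.

```r
# Made-up raw column: mostly numbers, plus a blank and a stray text entry
raw <- c("0", "0.02", " ", "0.03", "n/a", "10")
f <- factor(raw)

# Levels that cannot be parsed as numbers are the likely typos
lev <- levels(f)
bad_levels <- lev[is.na(suppressWarnings(as.numeric(lev)))]
print(bad_levels)

# Row positions of those entries, to help find them in the file
print(which(f %in% bad_levels))
```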


HTH,
Ivan

--
Ivan Calandra, ATER
University of Reims Champagne-Ardenne
GEGENA² - EA 3795
CREA - 2 esplanade Roland Garros
51100 Reims, France
+33(0)3 26 77 36 89
ivan.calan...@univ-reims.fr
https://www.researchgate.net/profile/Ivan_Calandra



Re: [R] factor levels numeric values

2014-11-12 Thread Jeff Newmiller
Another approach is to re-import your data using options that do not put the
data into a factor in the first place.  For example, you can use the colClasses
parameter in the read.table family of functions to specify numeric for that
column. If you need to give special handling to that column anyway (using
string functions), then you can use the stringsAsFactors=FALSE or as.is=TRUE
parameter settings and avoid the as.character() band-aid in your code.
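
A rough sketch of these import options. The file contents in txt are invented, and treating "n/a" as the missing-value code is an assumption about what the stray entry looks like.

```r
# Made-up file contents; "n/a" is the stray non-numeric entry
txt <- "id,value
1,0.02
2,n/a
3,10"

# With strings as factors, the whole column becomes a factor
as_factor <- read.csv(text = txt, stringsAsFactors = TRUE)

# Declare the missing-value code and the column types up front
as_number <- read.csv(text = txt, na.strings = "n/a",
                      colClasses = c("integer", "numeric"))

# Or keep the column as character for later clean-up with string functions
as_text <- read.csv(text = txt, stringsAsFactors = FALSE)

str(as_factor)
str(as_number)
str(as_text)
```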
---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.



Re: [R] Factor levels.

2007-10-03 Thread Rolf Turner

On 3/10/2007, at 5:48 PM, Peter Dalgaard wrote:

> Rolf Turner wrote:
>> I have factors with levels "Unit", "Achieved", and "Scholarship"; I
>> wish to replace these with "U", "A", and "S".
>>
>> So I do
>>
>>   fff <- factor(fff, labels=c("U","A","S"))
>>
>> This works as long as all of the levels are actually present in
>> the factor.  But if ``Scholarship'' is absent (as it often is)
>> then I get an error.
>>
>> I can do a workaround such as
>>
>>   fff <- factor(c("U","A","S")[fff], levels=c("U","A","S"))
>>
>> but this seems kludgy to me.
>
> Does it even work? (What if it is the first or the 2nd level that
> is absent?)

Yes it works.  What's the problem?

To beat it to death:  if the second level of fff is absent then fff
will consist entirely of 1's and 3's, and so c("U","A","S")[fff] will
consist entirely of "U"s and "S"s.  I can then set the levels to be
c("U","A","S") and get what I want.

Note that if I just did

fff <- factor(c("U","A","S")[fff])

in these circumstances, then I would get a factor whose levels were
c("U","S") which is NOT what I want.
(I.e. I want the levels always to be c("U","A","S") irrespective of
what levels are actually present in the factor.)

> The canonical way is
>
> factor(fff, levels=c("Unit", "Achieved", "Scholarship"),
>        labels=c("U","A","S"))

Right.  That is indeed sexier.  Thanks.
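
For anyone reading along in the archive, a short sketch of the difference between the two calls. The factor fff is made up so that "Scholarship" is absent; res and short are hypothetical names.

```r
# Made-up factor in which "Scholarship" happens to be absent
fff <- factor(c("Unit", "Achieved", "Unit"))

# labels alone fails: three labels are supplied for two levels
res <- tryCatch(factor(fff, labels = c("U", "A", "S")),
                error = function(e) conditionMessage(e))
print(res)

# Giving levels and labels together works whatever levels are present
short <- factor(fff,
                levels = c("Unit", "Achieved", "Scholarship"),
                labels = c("U", "A", "S"))
print(levels(short))
print(table(short))
```

The empty "S" level is kept, which is the point of spelling out levels explicitly.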

cheers,

Rolf



Re: [R] Factor levels.

2007-10-03 Thread Rolf Turner

On 4/10/2007, at 7:50 AM, Peter Dalgaard wrote:

> Rolf Turner wrote:
>>> Does it even work? (What if it is the first or the 2nd level that
>>> is absent?)
>>
>> Yes it works.  What's the problem?
>>
>> To beat it to death:  if the second level of fff is absent
>> then fff will consist entirely of 1's and 3's,
>> and so c("U","A","S")[fff] will consist entirely of "U"s and
>> "S"s.  I can then set the levels to be
>> c("U","A","S") and get what I want.
> You didn't say that fff was numeric.

All factors are numeric.

> If fff is a factor, then we have the problem:
>
>   > attach(read.table(stdin(),header=T))
>   0: fff
>   1: Unit
>   2: Scholarship
>   3: Scholarship
>   4: Unit
>   5:
>   > c("U","A","S")[fff]
>   [1] "A" "U" "U" "A"

My original fff is a factor with levels
c("Unit","Achievement","Scholarship").  If you make that
adjustment, you get the ``right answer''.

> Actually we have another problem too, namely sort order

No, there is no sort order problem.  See above.

[Given that the original fff has its levels in the order specified
then "Unit" maps to "U", "Achievement" to "A", and "Scholarship" to "S".]

cheers,

Rolf

P.S.  ***Are*** there any risks/dangers in following Christos
Hatzis' suggestion of simply doing

levels(fff) <- c("U","A","S")   ???



Re: [R] Factor levels.

2007-10-03 Thread Peter Dalgaard
Rolf Turner wrote:
> P.S.  ***Are*** there any risks/dangers in following Christos
> Hatzis' suggestion of simply doing
>
> levels(fff) <- c("U","A","S")   ???
Not if the levels are right to begin with.

Problems only arise if fff somehow becomes a two-level factor, e.g. if
you do
   fff <- fff[2:3, drop=TRUE].
Then you can get this effect:

> fff
[1] Unit        Scholarship Scholarship Unit
Levels: Scholarship Unit
> fff[2:3, drop=TRUE]
[1] Scholarship Scholarship
Levels: Scholarship
> fff[-(2:3), drop=TRUE]
[1] Unit Unit
Levels: Unit
> ggg <- fff[2:3, drop=TRUE]
> levels(ggg) <- c("a","b")
> ggg
[1] a a
Levels: a b
> ggg <- fff[-(2:3), drop=TRUE]
> levels(ggg) <- c("a","b")
> ggg
[1] a a
Levels: a b
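
One way to sidestep the effect shown in the transcript, offered only as a sketch: look the new names up by the old level names instead of relying on position. The lookup vector map and the object hhh are made up for illustration.

```r
fff <- factor(c("Unit", "Scholarship", "Scholarship", "Unit"))

# Hypothetical lookup from old level names to new ones
map <- c(Unit = "U", Achieved = "A", Scholarship = "S")

ggg <- fff[2:3, drop = TRUE]       # only "Scholarship" is left
hhh <- fff[-(2:3), drop = TRUE]    # only "Unit" is left

# Rename by name, so a dropped level cannot shift the others
levels(ggg) <- unname(map[levels(ggg)])
levels(hhh) <- unname(map[levels(hhh)])

print(ggg)
print(hhh)
```

This only renames the levels that are present; to carry the full set of levels regardless, factor() with both levels= and labels= is still needed.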


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



[R] Factor levels.

2007-10-02 Thread Rolf Turner

I have factors with levels "Unit", "Achieved", and "Scholarship"; I
wish to replace these with "U", "A", and "S".

So I do

fff <- factor(fff, labels=c("U","A","S"))

This works as long as all of the levels are actually present in the
factor.  But if ``Scholarship'' is absent (as it often is) then I get
an error.

I can do a workaround such as

fff <- factor(c("U","A","S")[fff], levels=c("U","A","S"))

but this seems kludgy to me.

Is there a sexier way?

cheers,

Rolf Turner




Re: [R] Factor levels

2007-09-19 Thread Gabor Grothendieck
If you don't know ahead of time how many columns you have and
only that they are a mix of numeric and character (to be converted to
factor) then you can do this:

DF <- read.table(textConnection(Input), header = TRUE, as.is = TRUE)
f <- function(x) if (is.character(x)) factor(x, levels = unique(x)) else x
DF[] <- lapply(DF, f)
DF
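
A self-contained run of the same idea, as a sketch. The four-row Input string is a shortened, made-up version of the data in the thread, and read.table(text = ...) is used here in place of textConnection.

```r
# Shortened, made-up version of the thread's data
Input <- "a b c d
1 175 n f
2 102 n j
3 187 o n
4 106 u g"

# Read character columns as character, not factor
DF <- read.table(text = Input, header = TRUE, as.is = TRUE)

# Turn every character column into a factor with levels in order of appearance
f <- function(x) if (is.character(x)) factor(x, levels = unique(x)) else x
DF[] <- lapply(DF, f)

str(DF)
print(levels(DF$d))   # order of appearance, not alphabetical
```

Numeric columns pass through f unchanged, so the number of columns does not need to be known in advance.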





On 9/19/07, Sébastien [EMAIL PROTECTED] wrote:
> Hi Gabor,
>
> I am coming back to you about the method you described to me a month ago to
> define the level order during a read.table call. I initially thought that I
> would need to apply the 'unique' function on a single column of my dataset,
> so I only used it after the read.table step (to make my life easier)...
> Well, I was wrong: I need to reorder all my columns (just to remind you, I
> don't know the numbers of columns my code has to handle). So, here come
> troubles.
>
> I first tried to apply your code as is, although I thought there might be
> some problems. The class can actually not be recycled, when a list notation
> is used (the help says that "colClasses: character. A vector of classes to
> be assumed for the columns. Recycled as necessary..."). See the following
> example:
>
> ##
>
> library(methods)
> setClass("my.factor")
> setAs("character", "my.factor",
>       function(from) factor(from, levels = unique(from)))
>
> Input <- "a b c d
> 1 1 175 n f
> 2 2 102 n j
> 3 3 187 o n
> 4 4 106 u g
> 5 5 102 o v
> 6 6 133 l x
> 7 7 149 w q
> 8 8 122 x p
> 9 9 151 u r
> 10 10 134 e g
> 11 11 170 j q
> 12 12 103 v n
> 13 13 153 n w
> 14 14 106 x x
> 15 15 185 v x
> 16 16 102 s p
> 17 17 181 i h
> 18 18 192 o k
> 19 19 161 d f
> 20 20 158 n q
> "
>
> DF <- read.table(textConnection(Input), header = TRUE, colClasses =
> list(c=("my.factor")))
> levels(DF$c) # properly ordered
> levels(DF$d) # not reordered
>
> ##
>
> I also tried that:
>
> ##
>
> DF <- read.table(textConnection(Input), header = TRUE, colClasses =
> c("my.factor"))
> levels(DF$c)
> levels(DF$d)
>
> ##
>
> In this case, the class is definitely recycled as all the columns of DF are
> transformed into factors... Not really useful :)
> I tried to modify the content of the list or my second notation, by
> including "integer" or a second "my.factor"... but I did not have much
> success.
> Any idea how to use the class my.factor multiple times ?
>
> Thanks in advance


> Gabor Grothendieck wrote:
>> It's the same principle. Just change the function to be suitable. This one
>> arranges the levels according to the input:
>>
>> library(methods)
>> setClass("my.factor")
>> setAs("character", "my.factor",
>>       function(from) factor(from, levels = unique(from)))
>>
>> Input <- "a b c
>> 1 1 176 w
>> 2 2 141 k
>> 3 3 172 r
>> 4 4 182 s
>> 5 5 123 k
>> 6 6 153 p
>> 7 7 176 l
>> 8 8 170 u
>> 9 9 140 z
>> 10 10 194 s
>> 11 11 164 j
>> 12 12 100 j
>> 13 13 127 x
>> 14 14 137 r
>> 15 15 198 d
>> 16 16 173 j
>> 17 17 113 x
>> 18 18 144 w
>> 19 19 198 q
>> 20 20 122 f
>> "
>>
>> DF <- read.table(textConnection(Input), header = TRUE,
>>                  colClasses = list(c = "my.factor"))
>> str(DF)
>>
>> On 8/28/07, Sébastien [EMAIL PROTECTED] wrote:
>>> Ok, I cannot send you one of my datasets since they are confidential. But
>>> I can produce a dummy mini dataset to illustrate my question. Let's say I
>>> have a csv file with 3 columns and 20 rows whose content is reproduced by
>>> the following lines.
>>>
>>> mydata <- data.frame(a=1:20,
>>>                      b=sample(100:200,20,replace=T),
>>>                      c=sample(letters[1:26], 20, replace = T))
>>> mydata
>>>     a   b c
>>> 1   1 176 w
>>> 2   2 141 k
>>> 3   3 172 r
>>> 4   4 182 s
>>> 5   5 123 k
>>> 6   6 153 p
>>> 7   7 176 l
>>> 8   8 170 u
>>> 9   9 140 z
>>> 10 10 194 s
>>> 11 11 164 j
>>> 12 12 100 j
>>> 13 13 127 x
>>> 14 14 137 r
>>> 15 15 198 d
>>> 16 16 173 j
>>> 17 17 113 x
>>> 18 18 144 w
>>> 19 19 198 q
>>> 20 20 122 f
>>>
>>> If I had to read the csv file, I would use something like:
>>> mydata <- data.frame(read.table(file="c:/test.csv",header=T))
>>>
>>> Now, if you look at mydata$c, the levels are alphabetically ordered.
>>>
>>> mydata$c
>>>  [1] w k r s k p l u z s j j x r d j x w q f
>>> Levels: d f j k l p q r s u w x z
>>>
>>> What I am trying to do is to reorder the levels so as to have them in the
>>> order they appear in the table, i.e.
>>> Levels: w k r s p l u z j x d q f
>>>
>>> Again, keep in mind that my script should be used on datasets whose
>>> content is unknown to me. In my example, I have used letters for mydata$c,
>>> but my code may have to handle factors of numeric or character values (I
>>> need to transform specific columns of my dataset into factors for plotting
>>> purposes). My goal is to let the code scan the content of each factor of
>>> my data.frame during or after the read.table step and reorder their levels
>>> automatically without having to ask the user to hard-code the level order.
>>>
>>> In a way, my problem is more related to the way the factor levels are
>>> ordered than to the read.table function, although I guess there is a
>>> link...
>>>
>>> Gabor Grothendieck wrote:
>>>> It's not clear from your description what you want. Could you be a bit
>>>> more specific, including an example.
>>>>
>>>> On 8/28/07, Sébastien [EMAIL PROTECTED] wrote:
>>>>> Thanks Gabor, I have two questions:
>>>>>
>>>>> 1- Is there any difference