Another solution:

CaseID <- c("1015285", "1005317", "1012281", "1015285", "1015285",
            "1007183", "1008833", "1015315", "1015322", "1015285")
Primary.Viol.Type <- c("AS.Age", "HS.Hours", "HS.Hours", "HS.Hours",
                       "RK.Records_CL", "OT.Overtime", "OT.Overtime",
                       "OT.Overtime", "V.Poster_Other", "V.Poster_Other")
library(reshape2)
dcast(data.frame(CaseID, Primary.Viol.Type), CaseID~Primary.Viol.Type, length)

# result:
Using Primary.Viol.Type as value column: use value.var to override.
   CaseID AS.Age HS.Hours OT.Overtime RK.Records_CL V.Poster_Other
1 1005317      0        1           0             0              0
2 1007183      0        0           1             0              0
3 1008833      0        0           1             0              0
4 1012281      0        1           0             0              0
5 1015285      1        1           0             1              1
6 1015315      0        0           1             0              0
7 1015322      0        0           0             0              1

best,
s.

On 19 December 2014 at 06:35, Chel Hee Lee <chl...@mail.usask.ca> wrote:
> Please take a look at my code again.  The error message says that object
> 'Primary.Viol.Type' not found.  Have you ever created the object
> 'Primary.Viol.Type'?  It will be working if you replace 'Primary.Viol.Type'
> by 'PViol.Type.Per.Case.Original$Primary.Viol.Type' where 'factor()' is
> used.  I hope this helps.
>
> Chel Hee Lee
>
> On 12/18/2014 08:57 PM, Crombie, Burnette N wrote:
>>
>> Chel, your solution is fantastic on the dataset I submitted in my question
>> but it is not working when I import my real dataset into R.  Do I need to
>> vectorize the columns in my real dataset after importing?
>> I tried a few things (###) but not making progress:
>>
>> MERGE_PViol.Detail.Per.Case <-
>>   read.csv("~/FOIA_FLSA/MERGE_PViol.Detail.Per.Case_for_rtf10.csv",
>>            stringsAsFactors=TRUE)
>>
>> ### select only certain columns
>> PViol.Type.Per.Case.Original <- MERGE_PViol.Detail.Per.Case[,c("CaseID",
>>   "Primary.Viol.Type")]
>>
>> ### write.csv(PViol.Type.Per.Case,file="PViol.Type.Per.Case.Select.csv")
>> ### PViol.Type.Per.Case.Original <- read.csv("~/FOIA_FLSA/PViol.Type.Per.Case.Select.csv")
>> ### PViol.Type.Per.Case.Original$X <- NULL
>> ###PViol.Type.Per.Case.Original[] <- lapply(PViol.Type.Per.Case.Original, as.character)
>>
>> PViol.Type <- c("CaseID",
>>                 "BW.BackWages",
>>                 "LD.Liquid_Damages",
>>                 "MW.Minimum_Wage",
>>                 "OT.Overtime",
>>                 "RK.Records_FLSA",
>>                 "V.Poster_Other",
>>                 "AS.Age",
>>                 "BW.WHMIS_BackWages",
>>                 "HS.Hours",
>>                 "OA.HazOccupationAg",
>>                 "ON.HazOccupationNonAg",
>>                 "R3.Reg3AgeOccupation",
>>                 "RK.Records_CL",
>>                 "V.Other")
>>
>> PViol.Type.Per.Case.Original$Primary.Viol.Type <-
>>   factor(Primary.Viol.Type, levels=PViol.Type, labels=PViol.Type)
>>
>> ### Error in factor(Primary.Viol.Type, levels = PViol.Type, labels = PViol.Type) : object 'Primary.Viol.Type' not found
>>
>> tmp <-
>>   split(PViol.Type.Per.Case.Original,PViol.Type.Per.Case.Original$CaseID)
>> ans <- ifelse(do.call(rbind, lapply(tmp,
>>   function(x)table(x$Primary.Viol.Type))), 1, NA)
>>
>>
>>
>> -----Original Message-----
>> From: Crombie, Burnette N
>> Sent: Thursday, December 18, 2014 3:01 PM
>> To: 'Chel Hee Lee'
>> Subject: RE: [R] Make 2nd col of 2-col df into header row of same df then
>> adjust col1 data display
>>
>> Thanks for taking the time to review this, Chel.  I've got to step away
>> from my desk, but will reply more substantially as soon as possible.
>> -- BNC
>>
>> -----Original Message-----
>> From: Chel Hee Lee [mailto:chl...@mail.usask.ca]
>> Sent: Thursday, December 18, 2014 2:43 PM
>> To: Jeff Newmiller; Crombie, Burnette N
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Make 2nd col of 2-col df into header row of same df then
>> adjust col1 data display
>>
>> I like the approach presented by Jeff Newmiller as shown in the previous
>> post (I really like his way).  As he suggested, it would be good to start
>> with 'factor' since you have all values of 'Primary.Viol.Type'.
>> You may try to use 'split()' function for creating table that you wish to
>> build.  Please see the below (I hope this helps):
>>
>> > PViol.Type.Per.Case.Original$Primary.Viol.Type <-
>>     factor(Primary.Viol.Type, levels=PViol.Type, labels=PViol.Type)
>> > tmp <- split(PViol.Type.Per.Case.Original,
>>     PViol.Type.Per.Case.Original$CaseID)
>> > ans <- ifelse(do.call(rbind, lapply(tmp, function(x)
>>     table(x$Primary.Viol.Type))), 1, NA)
>> > ans
>>         CaseID BW.BackWages LD.Liquid_Damages MW.Minimum_Wage OT.Overtime
>> 1005317     NA           NA                NA              NA          NA
>> 1007183     NA           NA                NA              NA           1
>> 1008833     NA           NA                NA              NA           1
>> 1012281     NA           NA                NA              NA          NA
>> 1015285     NA           NA                NA              NA          NA
>> 1015315     NA           NA                NA              NA           1
>> 1015322     NA           NA                NA              NA          NA
>>         RK.Records_FLSA V.Poster_Other AS.Age BW.WHMIS_BackWages HS.Hours
>> 1005317              NA             NA     NA                 NA        1
>> 1007183              NA             NA     NA                 NA       NA
>> 1008833              NA             NA     NA                 NA       NA
>> 1012281              NA             NA     NA                 NA        1
>> 1015285              NA              1      1                 NA        1
>> 1015315              NA             NA     NA                 NA       NA
>> 1015322              NA              1     NA                 NA       NA
>>         OA.HazOccupationAg ON.HazOccupationNonAg R3.Reg3AgeOccupation
>> 1005317                 NA                    NA                   NA
>> 1007183                 NA                    NA                   NA
>> 1008833                 NA                    NA                   NA
>> 1012281                 NA                    NA                   NA
>> 1015285                 NA                    NA                   NA
>> 1015315                 NA                    NA                   NA
>> 1015322                 NA                    NA                   NA
>>         RK.Records_CL V.Other
>> 1005317            NA      NA
>> 1007183            NA      NA
>> 1008833            NA      NA
>> 1012281            NA      NA
>> 1015285             1      NA
>> 1015315            NA      NA
>> 1015322            NA      NA
>> >
>>
>> Chel Hee Lee
>>
>> On 12/18/2014 10:02 AM, Jeff Newmiller wrote:
>>>
>>> No guarantees on "best"...
>>> but one way using base R could be:
>>>
>>> # Note that "CaseID" is actually not a valid PViol.Type as you had it
>>> PViol.Type <- c( "BW.BackWages"
>>>                , "LD.Liquid_Damages"
>>>                , "MW.Minimum_Wage"
>>>                , "OT.Overtime"
>>>                , "RK.Records_FLSA"
>>>                , "V.Poster_Other"
>>>                , "AS.Age"
>>>                , "BW.WHMIS_BackWages"
>>>                , "HS.Hours"
>>>                , "OA.HazOccupationAg"
>>>                , "ON.HazOccupationNonAg"
>>>                , "R3.Reg3AgeOccupation"
>>>                , "RK.Records_CL"
>>>                , "V.Other" )
>>>
>>> # explicitly specifying all levels to the factor insures a complete
>>> # set of column outputs regardless of what is in the input
>>> PViol.Type.Per.Case.Original <-
>>>    data.frame( CaseID
>>>              , Primary.Viol.Type=factor( Primary.Viol.Type
>>>                                        , levels=PViol.Type ) )
>>>
>>> tmp <- table( PViol.Type.Per.Case.Original )
>>> ans <- data.frame( CaseID=rownames( tmp )
>>>                  , as.data.frame( ifelse( 0==tmp, NA, 1 ) )
>>>                  )
>>>
>>>
>>> On Wed, 17 Dec 2014, bcrombie wrote:
>>>
>>>> # I have a dataframe that contains 2 columns:
>>>> CaseID <- c('1015285',
>>>>             '1005317',
>>>>             '1012281',
>>>>             '1015285',
>>>>             '1015285',
>>>>             '1007183',
>>>>             '1008833',
>>>>             '1015315',
>>>>             '1015322',
>>>>             '1015285')
>>>>
>>>> Primary.Viol.Type <- c('AS.Age',
>>>>                        'HS.Hours',
>>>>                        'HS.Hours',
>>>>                        'HS.Hours',
>>>>                        'RK.Records_CL',
>>>>                        'OT.Overtime',
>>>>                        'OT.Overtime',
>>>>                        'OT.Overtime',
>>>>                        'V.Poster_Other',
>>>>                        'V.Poster_Other')
>>>>
>>>> PViol.Type.Per.Case.Original <- data.frame(CaseID,Primary.Viol.Type)
>>>>
>>>> # CaseID's can be repeated because there can be up to 14
>>>> # Primary.Viol.Type's per CaseID.
>>>>
>>>> # I want to transform this dataframe into one that has 15 columns,
>>>> # where the first column is CaseID, and the rest are the 14 primary
>>>> # viol. types.  The CaseID column will contain a list of the unique
>>>> # CaseID's (no replicates) and
>>>> # for each of their rows, there will be a "1" under a column
>>>> # corresponding to a primary violation type recorded for that CaseID.
>>>> # So, technically, there could be zero to 14 "1's" in a CaseID's row.
>>>>
>>>> # For example, the row for CaseID '1015285' above would have a "1"
>>>> # under "AS.Age", "HS.Hours", "RK.Records_CL", and "V.Poster_Other",
>>>> # but have "NA" under the rest of the columns.
>>>>
>>>> PViol.Type <- c("CaseID",
>>>>                 "BW.BackWages",
>>>>                 "LD.Liquid_Damages",
>>>>                 "MW.Minimum_Wage",
>>>>                 "OT.Overtime",
>>>>                 "RK.Records_FLSA",
>>>>                 "V.Poster_Other",
>>>>                 "AS.Age",
>>>>                 "BW.WHMIS_BackWages",
>>>>                 "HS.Hours",
>>>>                 "OA.HazOccupationAg",
>>>>                 "ON.HazOccupationNonAg",
>>>>                 "R3.Reg3AgeOccupation",
>>>>                 "RK.Records_CL",
>>>>                 "V.Other")
>>>>
>>>> PViol.Type.Columns <- t(data.frame(PViol.Type))
>>>>
>>>> # What is the best way to do this in R?
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://r.789695.n4.nabble.com/Make-2nd-col-of-2-col-df-into-header-row-of-same-df-then-adjust-col1-data-display-tp4700878.html
>>>>
>>>> Sent from the R help mailing list archive at Nabble.com.
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>> ---------------------------------------------------------------------------
>>> Jeff Newmiller                        The     .....       .....  Go Live...
>>> DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>>>                                       Live:   OO#.. Dead: OO#..  Playing
>>> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>>> /Software/Embedded Controllers)               .OO#.       .OO#.
>>> rocks...1k