Can you show the output of dput(x_data) and dput(y_data). On Fri, Mar 12, 2010 at 11:56 AM, Newbie19_02 <nvanzuy...@gmail.com> wrote: > > Dear R users, > > I have two data frames that were read from text files as follows: > > x_data <- read.table("x.txt", header = TRUE, sep = "|", quote = "\"'", > dec = ".",as.is = TRUE,na.strings = "NA",colClasses = NA, > nrows = 3864284, > skip = 0, check.names = TRUE,fill=TRUE, > strip.white = TRUE, blank.lines.skip = TRUE, > comment.char = "#", allowEscapes = FALSE, flush = FALSE, > fileEncoding = "", encoding = "unknown") > > x_data > > prochi prescribed_date dataMonth item_code res_seqno quantity directions > CAO0000713 22/06/2001 NULL 842752 NULL 60 1/D > CAO0000713 28/04/2000 NULL 7800 NULL 100G A/TD > CAO0000713 10/04/2000 NULL 842652 NULL 60 1/D > CAO0000713 03/07/2000 NULL 842652 NULL 60 1/D > CAO0000713 09/01/2001 NULL 842752 NULL 60 1/D > CAO0000713 16/10/2001 NULL 842752 NULL 60 1/D > CAO0000713 16/08/2001 NULL 842752 NULL 60 1/D > CAO0000713 17/09/1993 NULL 39620 NULL 5ML NIL > CAO0000713 01/05/2001 NULL 842752 NULL 60 1/D > CAO0000713 05/03/2001 NULL 842752 NULL 60 1/D > > > > y_data > > item_code name formulation_code strength > bnf_code > 100 NEONACLEX K TABS NULL 2.2.8 > 110 NEONACLEX TABS 5MG 2.2.1 > 50 MESORB DRESS 10CMX10CM 20.3.1 > 160 ABSORBENT CELLULOSE MESO DRESS 10CMX10CM 20.3.1 > 161 ABSORBENT CELLULOSE MESO DRESS 10CMX15CM 20.3.1 > 164 ABSORBENT CELLULOSE MESO DRESS 20CMX25CM 20.3.1 > 200 SEPTRIN TABS 480MG 5.1.8 > 210 SEPTRIN PAED SF SUSP 240MG/5ML 5.1.8 > 212 SEPTRIN ADULT SUSP 480MG/5ML 5.1.8 > 220 SEPTRIN FORTE TABS 960MG 5.1.8 > etc.... > > > contains all the information for the item codes > y was read in in the same way. > > I then used the following code: > > z <- sqldf("select * from x left join y using (code)") > > when I use this on my real data I get an output: > prochi prescribed_date dataMonth item_code res_seqno quantity directions > 1 CAO0000713 22/06/2001 NULL 842752 NULL 60 > 1/D > 2 CAO0000713 28/04/2000 NULL 7800 NULL 100G > A/TD > 3 CAO0000713 10/04/2000 NULL 842652 NULL 60 > 1/D > 4 CAO0000713 03/07/2000 NULL 842652 NULL 60 > 1/D > 5 CAO0000713 09/01/2001 NULL 842752 NULL 60 > 1/D > 6 CAO0000713 16/10/2001 NULL 842752 NULL 60 > 1/D > 7 CAO0000713 16/08/2001 NULL 842752 NULL 60 > 1/D > 8 CAO0000713 17/09/1993 NULL 39620 NULL 5ML > NIL > 9 CAO0000713 01/05/2001 NULL 842752 NULL 60 > 1/D > 10 CAO0000713 05/03/2001 NULL 842752 NULL 60 > 1/D > no_of_packs datasource scan_ref_no name formulation_code strength > 1 NULL TSF NULL <NA> <NA> <NA> > 2 NULL TSF NULL BETNOVATE RD OINT 0.025% > 3 NULL TSF NULL <NA> <NA> <NA> > 4 NULL TSF NULL <NA> <NA> <NA> > 5 NULL TSF NULL <NA> <NA> <NA> > 6 NULL TSF NULL <NA> <NA> <NA> > 7 NULL TSF NULL <NA> <NA> <NA> > 8 NULL TSF NULL GAMMABULIN INJ 320MG > 9 NULL TSF NULL <NA> <NA> <NA> > 10 NULL TSF NULL <NA> <NA> <NA> > bnf_code > 1 <NA> > 2 13.4.1.2 > 3 <NA> > 4 <NA> > 5 <NA> > 6 <NA> > 7 <NA> > 8 14.5 > 9 <NA> > 10 <NA> > > > There is absolutely no reason for there to be <NA> anywhere as the > information for both the tables is complete. > > Not sure what the problem is? > > Thanks, > Natalie > -- > View this message in context: > http://n4.nabble.com/sqldf-not-joining-all-the-fields-tp1590786p1590786.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.