Re: [R] Lahman Baseball Data Using R DBI Package
Hello, Às 17:26 de 08/10/20, Bill Dunlap escreveu: This is really a feature of SQL, not R. SQL requires that you double quote column names that start with numbers, include spaces, etc., or that are SQL key words. Right, but there's no need to escape the double quotes, just put the SQL statement between single quotes and it becomes more readable. sqldf::sqldf('select "Order","Where","From" from d WHERE "From"="me"') Hope this helps, Rui Barradas E.g., d <- data.frame(Order=c("sit","stay","heel"), Where=c("here","there","there"), From=c("me","me","you")) sqldf::sqldf("select Order,Where,From from d WHERE From=\"me\"") Error: near "Order": syntax error sqldf::sqldf("select \"Order\",\"Where\",\"From\" from d Where \"From\"=\"me\"") Order Where From 1 sit here me 2 stay there me You may as well double quote all column names in SQL queries. -Bill On Wed, Oct 7, 2020 at 9:57 PM William Michels wrote: Hi Philip, You've probably realized by now that R doesn't like column names that start with a number. If you try to access an R-dataframe column named 2B or 3B with the familiar "$" notation, you'll get an error: library(DBI) library(RSQLite) con2 <- dbConnect(SQLite(), "~/R_Dir/lahmansbaseballdb.sqlite") Hack12Batting <- dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600 ORDER BY AB DESC") Hack12Batting$AB [1] 664 661 639 632 632 632 626 623 620 618 617 613 606 605 602 Hack12Batting$3B Error: unexpected numeric constant in "Hack12Batting$3" How to handle? You can rename columns on-the-fly by piping. See reference [1] and use either library(magrittr) or library(dplyr) or a combination thereof: library(magrittr) dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600 ORDER BY AB DESC") %>% set_colnames(make.names(colnames(.))) #OR one of the following: library(dplyr) dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600 ORDER BY AB DESC") %>% rename(X2B = `2B`, X3B = `3B`) library(dplyr) dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600 ORDER BY AB DESC") %>% `colnames<-`(make.names(colnames(.))) library(dplyr) dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600 ORDER BY AB DESC") %>% magrittr::set_colnames(make.names(colnames(.))) Best, Bill. W. Michels, Ph.D. [1] https://stackoverflow.com/questions/28100780/use-with-replacement-functions-like-colnames On Fri, Oct 2, 2020 at 7:34 PM Bill Dunlap wrote: The double quotes are required by SQL if a name is not of the form letter-followed-by-any-number-of-letters-or-numbers or if the name is a SQL keyword like 'where' or 'select'. If you are doing this from a function, you may as well quote all the names. -Bill On Fri, Oct 2, 2020 at 6:18 PM Philip wrote: The \”2B\” worked. Have no idea why. Can you point me somewhere that can explain this to me. Thanks, Philip *From:* Bill Dunlap *Sent:* Friday, October 2, 2020 3:54 PM *To:* Philip *Cc:* r-help *Subject:* Re: [R] Lahman Baseball Data Using R DBI Package Have you tried putting double quotes around 2B and 3B: "...2B, 3B, ..." -> "...\"2B\",\"3B\",..."? -Bill On Fri, Oct 2, 2020 at 3:49 PM Philip wrote: I’m trying to pull data from one table (batting) in the Lahman Baseball database. Notice X2B for doubles and X3B for triples – fourth and fifth from the right. The dbGetQuery function runs fine when I leave there two out but I get error messages (in red) when I include 2B/3B or X2B/X3B. Can anyone give me some direction? Thanks, Philip Heinrich *** tail(dbReadTable(Lahman,"batting")) ID playerID yearIDstint teamID team_ID lgID GG_batting AB R H X2BX3B HR RBI SB 107414 107414 yastrmi01 2019 1 SFN 2920 NL 107NA 371 64 101 22 3 21 552 107416 107416 yelicch01 20191 MIL 2911 NL 130NA 489 100 161 29 3 44 97 30 107419 107419 youngal01 2019 1 ARI2896 NL 17NA25 1 10 0 0 0 0 107420 107420 zagunma01 20191 CHN 2901 NL 30NA 36 2 93 0 0 5 0 107422 107422 zavalse01 20191 CHA 2900 AL5NA 12 1
Re: [R] Lahman Baseball Data Using R DBI Package
This is really a feature of SQL, not R. SQL requires that you double quote column names that start with numbers, include spaces, etc., or that are SQL key words. E.g., > d <- data.frame(Order=c("sit","stay","heel"), Where=c("here","there","there"), From=c("me","me","you")) > sqldf::sqldf("select Order,Where,From from d WHERE From=\"me\"") Error: near "Order": syntax error > sqldf::sqldf("select \"Order\",\"Where\",\"From\" from d Where \"From\"=\"me\"") Order Where From 1 sit here me 2 stay there me You may as well double quote all column names in SQL queries. -Bill On Wed, Oct 7, 2020 at 9:57 PM William Michels wrote: > Hi Philip, > > You've probably realized by now that R doesn't like column names that > start with a number. If you try to access an R-dataframe column named > 2B or 3B with the familiar "$" notation, you'll get an error: > > > library(DBI) > > library(RSQLite) > > con2 <- dbConnect(SQLite(), "~/R_Dir/lahmansbaseballdb.sqlite") > > Hack12Batting <- dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = > 2018 AND AB >600 ORDER BY AB DESC") > > Hack12Batting$AB > [1] 664 661 639 632 632 632 626 623 620 618 617 613 606 605 602 > > Hack12Batting$3B > Error: unexpected numeric constant in "Hack12Batting$3" > > How to handle? You can rename columns on-the-fly by piping. See > reference [1] and use either library(magrittr) or library(dplyr) or a > combination thereof: > > library(magrittr) > dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600 > ORDER BY AB DESC") %>% set_colnames(make.names(colnames(.))) > > #OR one of the following: > > library(dplyr) > dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600 > ORDER BY AB DESC") %>% rename(X2B = `2B`, X3B = `3B`) > > library(dplyr) > dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600 > ORDER BY AB DESC") %>% `colnames<-`(make.names(colnames(.))) > > library(dplyr) > dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600 > ORDER BY AB DESC") %>% magrittr::set_colnames(make.names(colnames(.))) > > Best, Bill. > > W. Michels, Ph.D. > > [1] > https://stackoverflow.com/questions/28100780/use-with-replacement-functions-like-colnames > > > > > > > > > > > On Fri, Oct 2, 2020 at 7:34 PM Bill Dunlap > wrote: > > > > The double quotes are required by SQL if a name is not of the form > > letter-followed-by-any-number-of-letters-or-numbers or if the name is a > SQL > > keyword like 'where' or 'select'. If you are doing this from a function, > > you may as well quote all the names. > > > > -Bill > > > > On Fri, Oct 2, 2020 at 6:18 PM Philip wrote: > > > > > The \”2B\” worked. Have no idea why. Can you point me somewhere that > can > > > explain this to me. > > > > > > Thanks, > > > Philip > > > > > > *From:* Bill Dunlap > > > *Sent:* Friday, October 2, 2020 3:54 PM > > > *To:* Philip > > > *Cc:* r-help > > > *Subject:* Re: [R] Lahman Baseball Data Using R DBI Package > > > > > > Have you tried putting double quotes around 2B and 3B: "...2B, 3B, > ..." > > > -> "...\"2B\",\"3B\",..."? > > > > > > -Bill > > > > > > On Fri, Oct 2, 2020 at 3:49 PM Philip wrote: > > > > > >> I’m trying to pull data from one table (batting) in the Lahman > Baseball > > >> database. Notice X2B for doubles and X3B for triples – fourth and > fifth > > >> from the right. > > >> > > >> The dbGetQuery function runs fine when I leave there two out but I get > > >> error messages (in red) when I include 2B/3B or X2B/X3B. > > >> > > >> Can anyone give me some direction? > > >> > > >> Thanks, > > >> Philip Heinrich > > >> > > >> > *** > > >> tail(dbReadTable(Lahman,"batting")) > > >> > > >> ID playerID yearIDstint teamID team_ID > > >> lgID GG_batting AB R H X2BX3B HR > RBI SB > > >> 107
Re: [R] Lahman Baseball Data Using R DBI Package
Hi Philip, You've probably realized by now that R doesn't like column names that start with a number. If you try to access an R-dataframe column named 2B or 3B with the familiar "$" notation, you'll get an error: > library(DBI) > library(RSQLite) > con2 <- dbConnect(SQLite(), "~/R_Dir/lahmansbaseballdb.sqlite") > Hack12Batting <- dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 > AND AB >600 ORDER BY AB DESC") > Hack12Batting$AB [1] 664 661 639 632 632 632 626 623 620 618 617 613 606 605 602 > Hack12Batting$3B Error: unexpected numeric constant in "Hack12Batting$3" How to handle? You can rename columns on-the-fly by piping. See reference [1] and use either library(magrittr) or library(dplyr) or a combination thereof: library(magrittr) dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600 ORDER BY AB DESC") %>% set_colnames(make.names(colnames(.))) #OR one of the following: library(dplyr) dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600 ORDER BY AB DESC") %>% rename(X2B = `2B`, X3B = `3B`) library(dplyr) dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600 ORDER BY AB DESC") %>% `colnames<-`(make.names(colnames(.))) library(dplyr) dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600 ORDER BY AB DESC") %>% magrittr::set_colnames(make.names(colnames(.))) Best, Bill. W. Michels, Ph.D. [1] https://stackoverflow.com/questions/28100780/use-with-replacement-functions-like-colnames On Fri, Oct 2, 2020 at 7:34 PM Bill Dunlap wrote: > > The double quotes are required by SQL if a name is not of the form > letter-followed-by-any-number-of-letters-or-numbers or if the name is a SQL > keyword like 'where' or 'select'. If you are doing this from a function, > you may as well quote all the names. > > -Bill > > On Fri, Oct 2, 2020 at 6:18 PM Philip wrote: > > > The \”2B\” worked. Have no idea why. Can you point me somewhere that can > > explain this to me. > > > > Thanks, > > Philip > > > > *From:* Bill Dunlap > > *Sent:* Friday, October 2, 2020 3:54 PM > > *To:* Philip > > *Cc:* r-help > > *Subject:* Re: [R] Lahman Baseball Data Using R DBI Package > > > > Have you tried putting double quotes around 2B and 3B: "...2B, 3B, ..." > > -> "...\"2B\",\"3B\",..."? > > > > -Bill > > > > On Fri, Oct 2, 2020 at 3:49 PM Philip wrote: > > > >> I’m trying to pull data from one table (batting) in the Lahman Baseball > >> database. Notice X2B for doubles and X3B for triples – fourth and fifth > >> from the right. > >> > >> The dbGetQuery function runs fine when I leave there two out but I get > >> error messages (in red) when I include 2B/3B or X2B/X3B. > >> > >> Can anyone give me some direction? > >> > >> Thanks, > >> Philip Heinrich > >> > >> *** > >> tail(dbReadTable(Lahman,"batting")) > >> > >> ID playerID yearIDstint teamID team_ID > >> lgID GG_batting AB R H X2BX3B HR RBI SB > >> 107414 107414 yastrmi01 2019 1 SFN 2920 > >> NL 107NA 371 64 101 22 3 21 > >> 552 > >> 107416 107416 yelicch01 20191 MIL 2911 > >> NL 130NA 489 100 161 29 3 4497 > >> 30 > >> 107419 107419 youngal01 2019 1 ARI2896 > >> NL 17NA25 1 10 0 > >> 0 0 0 > >> 107420 107420 zagunma01 20191 CHN 2901 NL > >> 30NA 36 2 93 0 0 > >> 5 0 > >> 107422 107422 zavalse01 20191 CHA 2900 > >> AL5NA 12 1 10 0 > >> 0 0 0 > >> 107427 107427 zimmery01 20191 WAS 2925 NL > >> 52NA 171 20449 0 6 27 0 > >> 107428 107428 zobribe01 20191 CHN 2901 > >> NL 47NA 150 24 39 5 0 1 > >> 17 0 > >> 107429 107429 zuninmi01 20191 TBA 2922 > >> AL 90
Re: [R] Lahman Baseball Data Using R DBI Package
The double quotes are required by SQL if a name is not of the form letter-followed-by-any-number-of-letters-or-numbers or if the name is a SQL keyword like 'where' or 'select'. If you are doing this from a function, you may as well quote all the names. -Bill On Fri, Oct 2, 2020 at 6:18 PM Philip wrote: > The \”2B\” worked. Have no idea why. Can you point me somewhere that can > explain this to me. > > Thanks, > Philip > > *From:* Bill Dunlap > *Sent:* Friday, October 2, 2020 3:54 PM > *To:* Philip > *Cc:* r-help > *Subject:* Re: [R] Lahman Baseball Data Using R DBI Package > > Have you tried putting double quotes around 2B and 3B: "...2B, 3B, ..." > -> "...\"2B\",\"3B\",..."? > > -Bill > > On Fri, Oct 2, 2020 at 3:49 PM Philip wrote: > >> I’m trying to pull data from one table (batting) in the Lahman Baseball >> database. Notice X2B for doubles and X3B for triples – fourth and fifth >> from the right. >> >> The dbGetQuery function runs fine when I leave there two out but I get >> error messages (in red) when I include 2B/3B or X2B/X3B. >> >> Can anyone give me some direction? >> >> Thanks, >> Philip Heinrich >> >> *** >> tail(dbReadTable(Lahman,"batting")) >> >> ID playerID yearIDstint teamID team_ID >> lgID GG_batting AB R H X2BX3B HR RBI SB >> 107414 107414 yastrmi01 2019 1 SFN 2920 >> NL 107NA 371 64 101 22 3 21 >> 552 >> 107416 107416 yelicch01 20191 MIL 2911 >> NL 130NA 489 100 161 29 3 4497 30 >> 107419 107419 youngal01 2019 1 ARI2896 >> NL 17NA25 1 10 0 >> 0 0 0 >> 107420 107420 zagunma01 20191 CHN 2901 NL >> 30NA 36 2 93 0 0 >> 5 0 >> 107422 107422 zavalse01 20191 CHA 2900 >> AL5NA 12 1 10 0 >> 0 0 0 >> 107427 107427 zimmery01 20191 WAS 2925 NL >> 52NA 171 20449 0 6 27 0 >> 107428 107428 zobribe01 20191 CHN 2901 >> NL 47NA 150 24 39 5 0 1 >> 17 0 >> 107429 107429 zuninmi01 20191 TBA 2922 >> AL 90NA 26630 44 10 1 9 >> 32 0 >> >> >> Hack11Batting <- dbGetQuery(Lahman,"SELECT >> playerID,yearID,AB,R,H,2B,3B,HR, >> RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM >> batting >> WHERE yearID = 2018 AND AB >99") >> Error: unrecognized token: "2B" >> >> Hack11Batting <- dbGetQuery(Lahman,"SELECT >> playerID,yearID,AB,R,H,X2B,X3B,HR, >> RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM >> batting >> WHERE yearID = 2018 AND AB >99") >> Error: no such column: X2B >> >> [[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lahman Baseball Data Using R DBI Package
Have you tried putting double quotes around 2B and 3B: "...2B, 3B, ..." -> "...\"2B\",\"3B\",..."? -Bill On Fri, Oct 2, 2020 at 3:49 PM Philip wrote: > I’m trying to pull data from one table (batting) in the Lahman Baseball > database. Notice X2B for doubles and X3B for triples – fourth and fifth > from the right. > > The dbGetQuery function runs fine when I leave there two out but I get > error messages (in red) when I include 2B/3B or X2B/X3B. > > Can anyone give me some direction? > > Thanks, > Philip Heinrich > > *** > tail(dbReadTable(Lahman,"batting")) > > ID playerID yearIDstint teamID team_ID > lgID GG_batting AB R H X2BX3B HR RBI SB > 107414 107414 yastrmi01 2019 1 SFN 2920 NL > 107NA 371 64 101 22 3 21 552 > 107416 107416 yelicch01 20191 MIL 2911 NL > 130NA 489 100 161 29 3 4497 30 > 107419 107419 youngal01 2019 1 ARI2896 NL >17NA25 1 10 0 0 0 >0 > 107420 107420 zagunma01 20191 CHN 2901 NL > 30NA 36 2 93 0 0 5 > 0 > 107422 107422 zavalse01 20191 CHA 2900 AL > 5NA 12 1 10 0 0 > 0 0 > 107427 107427 zimmery01 20191 WAS 2925 NL > 52NA 171 20449 0 6 27 0 > 107428 107428 zobribe01 20191 CHN 2901 NL > 47NA 150 24 39 5 0 1 17 0 > 107429 107429 zuninmi01 20191 TBA 2922 AL > 90NA 26630 44 10 1 9 32 0 > > > Hack11Batting <- dbGetQuery(Lahman,"SELECT playerID,yearID,AB,R,H,2B,3B,HR, > RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM batting > WHERE yearID = 2018 AND AB >99") > Error: unrecognized token: "2B" > > Hack11Batting <- dbGetQuery(Lahman,"SELECT > playerID,yearID,AB,R,H,X2B,X3B,HR, > RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM batting > WHERE yearID = 2018 AND AB >99") > Error: no such column: X2B > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.