Re: [R] extracting pdf tables...
Dear Jeff,

Got the answer. I changed to output = "matrix" in extract_tables and used as.data.frame to coerce each output matrix to a data frame; rbind then works fine. Thanks anyway for your reply.

Thanking you,
Yours sincerely,
AKSHAY M KULKARNI

From: Jeff Newmiller
Sent: Monday, April 10, 2023 12:53 AM
To: akshay kulkarni ; r-help@r-project.org
Subject: Re: [R] extracting pdf tables...

I don't know... I have never used tabulizer (which is no longer on CRAN anyway). In general you would provide an argument to the data import function that would tell it to expect a header. I suspect you will have to set the names(IDT[[4]]) <- whatever it should be and remove the first row from the data frame.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
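For readers following along, the fix described above (matrix output, then as.data.frame, then rbind) can be sketched roughly as follows. This is a minimal illustration, not the poster's actual session: the matrices are hypothetical stand-ins for what extract_tables(..., output = "matrix") might return, and every value except the TATA MOTORS row is invented. The likely reason it works is that as.data.frame() gives each header-less matrix the same default names (V1, V2, ...), so rbind no longer sees mismatched names.

```r
# Hypothetical stand-ins for the list returned with output = "matrix":
# character matrices with no header row. Only the last row's values come
# from the thread; the others are invented for illustration.
IDT <- list(
  matrix(c("1", "EXAMPLE ONE LIMITED", "EXAMPLEONE", "4",
           "2", "EXAMPLE TWO LIMITED", "EXAMPLETWO", "4"),
         ncol = 4, byrow = TRUE),
  matrix(c("168", "TATA MOTORS LIMITED", "TATAMOTORS", "4"),
         ncol = 4, byrow = TRUE)
)

# Coerce each matrix to a data frame. Matrices without column names all
# receive the default names V1..V4, so the pieces agree with each other.
pieces <- lapply(IDT, as.data.frame, stringsAsFactors = FALSE)

# Stack all pieces row-wise in one call.
combined <- do.call(rbind, pieces)

# Everything arrives as character; convert numeric-looking columns by hand.
combined$V1 <- as.integer(combined$V1)
combined$V4 <- as.integer(combined$V4)

print(combined)
```

Using do.call(rbind, pieces) rather than listing IDT[[1]] through IDT[[4]] by hand keeps the code unchanged if the PDF yields a different number of tables.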
Re: [R] extracting pdf tables...
I don't know... I have never used tabulizer (which is no longer on CRAN anyway). In general you would provide an argument to the data import function that would tell it to expect a header. I suspect you will have to set the names(IDT[[4]]) <- whatever it should be and remove the first row from the data frame.

On April 9, 2023 12:03:36 PM PDT, akshay kulkarni wrote:
>Dear Jeff,
> Thanks for your reply.
>
>I have the following:
>
>> colnames(IDT[[4]])
>[1] "X168" "TATA.MOTORS.LIMITED" "TATAMOTORS" "X4"
>
>The above has to be the first row of IDT[[4]]. The first row is getting parsed
>as the column name. How do you make that the first row of IDT[[4]]?

--
Sent from my phone. Please excuse my brevity.
Re: [R] extracting pdf tables...
Dear Jeff,

Thanks for your reply. I have the following:

> colnames(IDT[[4]])
[1] "X168" "TATA.MOTORS.LIMITED" "TATAMOTORS" "X4"

The above has to be the first row of IDT[[4]]. The first row is getting parsed as the column names. How do you make that the first row of IDT[[4]]?

Thanking you,
Yours sincerely,
AKSHAY M KULKARNI

From: Jeff Newmiller
Sent: Monday, April 10, 2023 12:27 AM
To: akshay kulkarni ; r-help@r-project.org
Subject: Re: [R] extracting pdf tables...

Your code used cbind. My first answer was appropriate for rbind.

So you still need to figure out how to deal with the different columns in the tables, which requires more knowledge about their contents than we have.
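One way to approach the question above (turning wrongly parsed column names back into a first data row) is sketched below. It is a heuristic under stated assumptions, not a tabulizer feature: demote_header() is a hypothetical helper, the second row of the example is invented, and the clean-up only undoes the two most common make.names() changes (an "X" put in front of a leading digit, and dots substituted for spaces). A dot that originally replaced some other character cannot be recovered this way.

```r
# Hypothetical helper: push the current column names back into the data as
# the first row, then give the result neutral names.
demote_header <- function(df, new_names = paste0("V", seq_along(df))) {
  header <- names(df)
  # Heuristic reversal of make.names(): drop an "X" that precedes a digit,
  # and turn dots back into spaces.
  header <- sub("^X(?=[0-9])", "", header, perl = TRUE)
  header <- gsub(".", " ", header, fixed = TRUE)

  # One-row data frame built from the recovered header values.
  first_row <- as.data.frame(setNames(as.list(header), new_names),
                             stringsAsFactors = FALSE)

  # Convert the remaining rows to character so the column types agree.
  rest <- df
  rest[] <- lapply(rest, as.character)
  names(rest) <- new_names

  rbind(first_row, rest)
}

# Stand-in for IDT[[4]]: names as shown in the thread, one invented row.
piece <- data.frame(X168 = 169L,
                    TATA.MOTORS.LIMITED = "EXAMPLE LIMITED",
                    TATAMOTORS = "EXAMPLE",
                    X4 = 4L)

fixed <- demote_header(piece)
print(fixed)
```

All columns come back as character, so numeric columns need an explicit as.integer() or as.numeric() afterwards.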
Re: [R] extracting pdf tables...
Your code used cbind. My first answer was appropriate for rbind.

So you still need to figure out how to deal with the different columns in the tables, which requires more knowledge about their contents than we have.

On April 9, 2023 11:43:01 AM PDT, akshay kulkarni wrote:
>Dear Jeff,
> I want to rbind.
>
>Thanking you,
>Yours sincerely,
>AKSHAY M KULKARNI

--
Sent from my phone. Please excuse my brevity.
Re: [R] extracting pdf tables...
Dear Jeff,

I want to rbind.

Thanking you,
Yours sincerely,
AKSHAY M KULKARNI

From: R-help on behalf of Jeff Newmiller
Sent: Sunday, April 9, 2023 11:57 PM
To: r-help@r-project.org
Subject: Re: [R] extracting pdf tables...

Sorry, did not read closely enough.

Did you want rbind (which has no problem with different numbers of rows) or merge (which requires that there be key columns that can be aligned by repeating data)?
Re: [R] extracting pdf tables...
Sorry, did not read closely enough.

Did you want rbind (which has no problem with different numbers of rows) or merge (which requires that there be key columns that can be aligned by repeating data)?

On April 9, 2023 10:49:09 AM PDT, Jeff Newmiller wrote:
>Clearly the column names are different. You need to decide what to do about
>that. Choose the subset of dataframes where the column names are the same?
>Rename columns? Omit some columns? Add missing columns filled with NA?

--
Sent from my phone. Please excuse my brevity.
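The distinction drawn above between rbind and merge can be illustrated with a small base-R sketch. All data here are invented for the illustration and have nothing to do with the PDF in the thread: rbind stacks rows of data frames that share column names, while merge aligns rows from two data frames on a key column.

```r
# Two pieces with identical column names but different numbers of rows.
a <- data.frame(id = 1:3, name = c("alpha", "beta", "gamma"))
b <- data.frame(id = 4L, name = "delta")

# rbind stacks them: differing row counts are fine, names must match.
stacked <- rbind(a, b)

# A separate table that shares only the key column "id".
prices <- data.frame(id = c(1L, 4L), price = c(10.5, 7.25))

# merge aligns rows on the key; by default only ids present in both remain.
joined <- merge(stacked, prices, by = "id")

print(stacked)
print(joined)
```

Passing all.x = TRUE to merge would keep every row of stacked and fill the missing prices with NA.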
Re: [R] extracting pdf tables...
Clearly the column names are different. You need to decide what to do about that. Choose the subset of dataframes where the column names are the same? Rename columns? Omit some columns? Add missing columns filled with NA?

On April 9, 2023 10:22:32 AM PDT, akshay kulkarni wrote:
>[...]
>It returns 4 different data frames which I want to combine them and make one
>data frame. But when I run this:
>
>> rbind(IDT[[1]],IDT[[2]],IDT[[3]],IDT[[4]])
> Error in match.names(clabs, names(xi)) :
>names do not match previous names
>[...]

--
Sent from my phone. Please excuse my brevity.
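Of the options listed above, "add missing columns filled with NA" is the one that needs the most code, so here is a minimal base-R sketch of it. rbind_fill() is a hypothetical helper written for this illustration (it is not a base R function), and the two example data frames are invented.

```r
# Hypothetical helper: pad every data frame with the columns it lacks
# (filled with NA), put the columns in a common order, then stack the rows.
rbind_fill <- function(dfs) {
  all_names <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(df) {
    for (nm in setdiff(all_names, names(df))) {
      df[[nm]] <- NA            # add each missing column as NA
    }
    df[all_names]               # same column order in every piece
  })
  do.call(rbind, padded)
}

# Invented example: the pieces share "code" but differ otherwise.
x <- data.frame(code = c("A", "B"), leverage = c(4, 5))
y <- data.frame(code = "C", name = "Gamma")

result <- rbind_fill(list(x, y))
print(result)
```

This only makes sense when identically named columns really hold the same kind of data. If the pieces are one table split across pages, assigning the same names to every piece before calling rbind is usually the simpler route.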
[R] extracting pdf tables...
Dear members,

I am extracting a pdf table with the following code:

> library(tabulizer)
> IDT <- extract_tables("https://www.canmoney.in/pdf/INTRADAYLEVERAGE-20220531-latest.pdf",
+                       output = "data.frame")

It returns 4 different data frames, which I want to combine into one data frame. But when I run this:

> rbind(IDT[[1]],IDT[[2]],IDT[[3]],IDT[[4]])
Error in match.names(clabs, names(xi)) :
  names do not match previous names

Also:

> class(IDT[[1]])
[1] "data.frame"

> cbind(IDT[[1]],IDT[[2]],IDT[[3]],IDT[[4]],make.row.names = FALSE)
Error in data.frame(..., check.names = FALSE) :
  arguments imply differing number of rows: 55, 56, 30, 1

Can anyone please help me combine these 4 data frames?

Thanking you,
Yours sincerely,
AKSHAY M KULKARNI
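Before choosing a fix for errors like the ones above, it usually helps to look at what each piece actually contains. The sketch below is an assumption-laden illustration: the list IDT is a small invented stand-in (the real column names of the first three tables are not shown anywhere in the thread), and same_names() is a hypothetical helper. It reproduces the rbind failure in miniature and shows the checks that reveal its cause.

```r
# Invented stand-in for the extracted list: same number of columns,
# but the second piece had a data row parsed as its header.
IDT <- list(
  data.frame(Sr.No = 1:2, Symbol = c("AAA", "BBB")),
  data.frame(X168 = 169L, TATAMOTORS = "CCC")
)

# Inspect the names and the number of rows of every piece.
print(lapply(IDT, names))
print(vapply(IDT, nrow, integer(1)))

# Hypothetical helper: TRUE only if every piece has the first piece's names.
same_names <- function(dfs) {
  all(vapply(dfs, function(d) identical(names(d), names(dfs[[1]])),
             logical(1)))
}
print(same_names(IDT))

# rbind on mismatched names fails; capture the message instead of stopping.
err <- tryCatch(do.call(rbind, IDT),
                error = function(e) conditionMessage(e))
print(err)
```

The same inspection explains the cbind error: cbind places the pieces side by side and therefore needs equal row counts, which four tables of 55, 56, 30 and 1 rows cannot provide.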