[R] SQL query with Multicore option on R -linux
Hi all,

I have the following SQL query that I am executing on a machine with a single core. I want to know how I can execute the same query on a machine that has 4 cores. Please provide me the code.

    NEW_TABLE <- rhive.query("SELECT A, B, COUNT(C) FROM TABLE_A WHERE A='01-01-2012'")

Also, let me know how I can use only 2 or 3 of the machine's cores.

Regards,
Madana

--
View this message in context: http://r.789695.n4.nabble.com/SQL-query-with-Multicore-option-on-R-linux-tp4643771.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
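A note on the question above, offered as a sketch rather than a definitive answer: a single query sent through rhive.query() is executed by Hive, so extra cores on the R side are unlikely to speed it up. R-side parallelism helps mainly when there are several independent tasks. The example below assumes that situation and uses mclapply() from the parallel package with mc.cores to cap the number of cores. The function run_query() and the list of days are hypothetical stand-ins; run_query() only builds the query text here, so the example runs without a Hive connection.

```r
library(parallel)

# Hypothetical stand-in for a database call: it only builds the query text.
# GROUP BY is added in this sketch because A and B are selected next to COUNT(C).
run_query <- function(day) {
  sprintf("SELECT A, B, COUNT(C) FROM TABLE_A WHERE A='%s' GROUP BY A, B", day)
}

# Hypothetical list of independent tasks, one per day
days <- c("01-01-2012", "02-01-2012", "03-01-2012", "04-01-2012")

# Use at most 2 cores; forking is not available on Windows, so fall back to 1 there
n_cores <- if (.Platform$OS.type == "windows") 1L else min(2L, detectCores(), na.rm = TRUE)

results <- mclapply(days, run_query, mc.cores = n_cores)
print(results[[1]])
```

Changing the 2L to 3L would allow three worker processes instead of two.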
[R] Error in Installing RODBC in Linux R
Hi all,

I am trying to install the RODBC package in R on Linux. Please find the error message below. If anyone has a solution for this error, please let me know.

    configure: error: ODBC headers sql.h and sqlext.h not found

Regards,
Madana

--
View this message in context: http://r.789695.n4.nabble.com/Error-in-Installing-RODBC-in-Linux-R-tp4641488.html
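This configure error usually means the ODBC development headers are not installed on the system. RODBC is built against an ODBC driver manager such as unixODBC, whose headers ship in the unixODBC-devel package on Red Hat style systems and in unixodbc-dev on Debian or Ubuntu. A small check from within R, assuming the headers would live in one of the usual include directories (the directory list is an assumption, adjust it for your system):

```r
# Headers that the RODBC configure script looks for
headers <- c("sql.h", "sqlext.h")

# Assumed include directories; a custom ODBC install may use a different path
include_dirs <- c("/usr/include", "/usr/local/include")

# For each header, check whether it exists in any of the directories
found <- sapply(headers, function(h) any(file.exists(file.path(include_dirs, h))))
print(found)

if (all(found)) {
  message("ODBC headers found; install.packages(\"RODBC\") should get past this check.")
} else {
  message("Missing: ", paste(headers[!found], collapse = ", "),
          ". Install the unixODBC development package and retry.")
}
```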
[R] Joining R Local Dataset with Table from Database
Hi all,

I want to join a table (dataset) that is created in R with a table that is in an Oracle database. Can someone help me accomplish this in R?

Example code:

    library(RODBC)
    DB_CONNECT <- odbcConnect("DSN_NAME")
    # DATA_SET_R exists only in R, not in the database
    TABLE_JOIN <- sqlQuery(DB_CONNECT, "SELECT * FROM DB_TABLE WHERE COL_1 NOT IN (SELECT COL_1 FROM DATA_SET_R)")

Regards,
Madana

--
View this message in context: http://r.789695.n4.nabble.com/Joining-R-Local-Dataset-with-Table-from-Database-tp4638967.html
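The database cannot see a data frame that lives only in the R session, so one option is to fetch the database table into R and do the NOT IN filter locally. The sketch below assumes DB_TABLE is small enough to pull into memory; db_table is a made-up stand-in for the result of sqlQuery(DB_CONNECT, "SELECT * FROM DB_TABLE"), so the example runs without a database.

```r
# Stand-in for the table fetched from Oracle (made-up rows)
db_table <- data.frame(
  COL_1 = c(1, 2, 3, 4, 5),
  COL_2 = c("a", "b", "c", "d", "e"),
  stringsAsFactors = FALSE
)

# The local R dataset
data_set_r <- data.frame(COL_1 = c(2, 4))

# Equivalent of: WHERE COL_1 NOT IN (SELECT COL_1 FROM DATA_SET_R)
table_join <- db_table[!(db_table$COL_1 %in% data_set_r$COL_1), ]
print(table_join)
```

If the database table is too large to fetch, the reverse direction may work instead: RODBC's sqlSave() can write a data frame to a database table, after which the original query can run on the server, assuming you have permission to create tables there.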
Re: [R] Binding multiple data frames into single data frame
Hi Joshua,

This worked for me. Thanks for your help. Now I have another challenge. Since multicore is creating almost 70K data frames, do.call() takes a huge amount of time to bind them all, and I cannot see the benefit of multicore because of the time do.call() consumes. Can you help me get this done faster?

Regards,
Madana

From: Joshua Wiley-2 [via R]
Sent: Sunday, July 24, 2011 5:43 PM
To: Madana Mohana Babu
Subject: Re: Binding multiple data frames into single data frame

Hi Madana,

No example data, so untested, but I think this will do what you want:

    do.call(rbind, DF)

Cheers,
Josh

On Sun, Jul 24, 2011 at 5:38 PM, Madana_Babu wrote:
> Hi all,
>
> I have multiple data frames, each with the same number of columns, created by using mclapply() on a multicore processor. The data frames are like DF[[1]], DF[[2]], ... DF[[150]]. Now I want to bind all these data frames (similar to rbind()) and create one single data frame called DF so that I have the complete data for further analysis. Can someone help me do this through looping (without manually specifying each DF[[count]])?
>
> Thanks in advance for your help.
>
> Regards,
> Madana
>
> View this message in context: http://r.789695.n4.nabble.com/Binding-multiple-data-frames-into-single-data-frame-tp3691335p3691335.html

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
https://joshuawiley.com/

--
View this message in context: http://r.789695.n4.nabble.com/Binding-multiple-data-frames-into-single-data-frame-tp3691335p3693863.html
Re: [R] Binding multiple data frames into single data frame
This is working for me. However, when I use multicore, more than 60K data frames are created, and do.call() takes a huge amount of time to bind them. Is there any function that can perform this operation faster?

Regards,
Madana

--
View this message in context: http://r.789695.n4.nabble.com/Binding-multiple-data-frames-into-single-data-frame-tp3691335p3693871.html
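One possible way to speed this up, sketched in base R: do.call(rbind, DF) gets slow with tens of thousands of small data frames, and concatenating each column across the list and building a single data frame at the end usually avoids much of that overhead. The helper bind_by_column() is a hypothetical name, and the sketch assumes every data frame has the same column names and only plain (non-factor) atomic columns. The sample data is made up.

```r
# Hypothetical helper: concatenate each column across all data frames,
# then build one data frame from the resulting vectors.
bind_by_column <- function(dfs) {
  cols <- names(dfs[[1]])
  out <- lapply(cols, function(nm) {
    unlist(lapply(dfs, function(d) d[[nm]]), use.names = FALSE)
  })
  names(out) <- cols
  # check.names = FALSE keeps column names such as "MIN(V16)" unchanged
  data.frame(out, check.names = FALSE, stringsAsFactors = FALSE)
}

# Made-up list of 1000 one-row data frames standing in for the mclapply() output
DF <- lapply(1:1000, function(i) {
  data.frame(id = i, value = i * 2, label = paste0("row", i), stringsAsFactors = FALSE)
})

combined <- bind_by_column(DF)
print(dim(combined))
```

If installing packages is an option, rbindlist() from the data.table package is another commonly used route for binding a long list of data frames.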
Re: [R] R on Multicore for Linux
Hi Lei,

Thanks for your solution. It worked. Now I have another query. After creating the multiple DF[[i]] data frames, how do I combine them into one data frame, say DF? I have more than 1000 of them; how can I bind them all into one DF recursively?

Thanks for your help.

Regards,
Madana

From: Lei Jiang [via R]
Sent: Saturday, July 23, 2011 1:43 PM
To: Madana Mohana Babu
Subject: Re: R on Multicore for Linux

Madana,

The code below may work (untested though):

    # above is the same as you wrote
    require(multicore)
    read.data.example <- function(f) {
      dat <- read.csv(f, header=FALSE, sep="\t", na.strings="", dec=".", strip.white=TRUE, fill=TRUE)
      data_1 <- sqldf("SELECT V2, V14, MIN(V16) FROM dat WHERE V6=104 GROUP BY V2, V14")
      data_1
    }
    DF <- mclapply(a, read.data.example)
    # you can check the components of DF by DF[[1]], DF[[2]], ..., which is a bit different from rbind
    # feel free to add more arguments to read.data.example and pass them to mclapply accordingly

Hope this helps.

Regards,
Lei

On Fri, Jul 22, 2011 at 11:35 AM, Madana_Babu wrote:
> Hi,
>
> Can you please explain how I can perform this on a multicore processor? I have a machine with 16 cores, and I can do this much faster if I use all of them.
>
> Thanks in advance.
>
> Regards,
> Madana
>
> View this message in context: http://r.789695.n4.nabble.com/R-on-Multicore-for-Linux-tp3682318p3687483.html

--
Lei Jiang
Center for Computation and Technology / Department of Computer Science
Louisiana State University

--
View this message in context: http://r.789695.n4.nabble.com/R-on-Multicore-for-Linux-tp3682318p3690669.html
[R] Binding multiple data frames into single data frame
Hi all,

I have multiple data frames, each with the same number of columns, created by using mclapply() on a multicore processor. The data frames are like DF[[1]], DF[[2]], ... DF[[150]]. Now I want to bind all these data frames (similar to rbind()) and create one single data frame called DF so that I have the complete data for further analysis. Can someone help me do this through looping (without manually specifying each DF[[count]])?

Thanks in advance for your help.

Regards,
Madana

--
View this message in context: http://r.789695.n4.nabble.com/Binding-multiple-data-frames-into-single-data-frame-tp3691335p3691335.html
Re: [R] R on Multicore for Linux
Hi,

Can you please explain how I can perform this on a multicore processor? I have a machine with 16 cores, and I can do this much faster if I use all of them.

Thanks in advance.

Regards,
Madana

--
View this message in context: http://r.789695.n4.nabble.com/R-on-Multicore-for-Linux-tp3682318p3687483.html
Re: [R] R on Multicore for Linux
Hi all,

Currently I am trying to do this in R, which is running on a multicore processor. I am not sure how to use the mclapply() function for this task. Can anyone help me?

    # Setting up directory
    setwd("/XXX////2011/07/20")
    library(sqldf)

    # Data is available in the form of multiple structured log files (nearly 10K log files).
    # I am using the following syntax to get the required fields and aggregations from the logs
    # and to create a data frame called DF (with 3 columns: V2, V14 and MIN(V16)).
    a <- list.files(path = ".", pattern = "2011-07-20", all.files = FALSE, full.names = FALSE, recursive = FALSE, ignore.case = FALSE)
    DF <- NULL
    for (f in a) {
      dat <- read.csv(f, header=FALSE, sep="\t", na.strings="", dec=".", strip.white=TRUE, fill=TRUE)
      data_1 <- sqldf("SELECT V2, V14, MIN(V16) FROM dat WHERE V6=104 GROUP BY V2, V14")
      DF <- rbind(DF, data_1)
    }
    # Currently this process takes almost 3 hours for me.

Can anyone help me use mclapply() for this operation so that the process completes sooner? Please provide me the syntax.

Thanks in advance.

Regards,
Madana

--
View this message in context: http://r.789695.n4.nabble.com/R-on-Multicore-for-Linux-tp3682318p3684736.html
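A self-contained sketch of how the loop above could be restructured for mclapply(), under some stated assumptions. It uses the parallel package that ships with R (rather than the multicore package), swaps the sqldf() call for base aggregate() so that it runs without extra packages, and works on three small made-up log files with columns V1 to V3 instead of the real logs. The function name read_one_log() is hypothetical.

```r
library(parallel)

# Create three small tab-separated log files in a temporary directory (made-up data)
log_dir <- tempfile("logs")
dir.create(log_dir)
for (i in 1:3) {
  sample_rows <- data.frame(V1 = c("a", "a", "b"),
                            V2 = c(104, 104, 200),
                            V3 = c(i, i + 10, i + 20))
  write.table(sample_rows, file.path(log_dir, sprintf("2011-07-20-%d.log", i)),
              sep = "\t", col.names = FALSE, row.names = FALSE, quote = FALSE)
}

# Per-file work: read one log, keep rows with V2 == 104, take MIN(V3) per V1
read_one_log <- function(f) {
  dat <- read.csv(f, header = FALSE, sep = "\t", na.strings = "",
                  strip.white = TRUE, fill = TRUE)
  keep <- dat[dat$V2 == 104, ]
  aggregate(V3 ~ V1, data = keep, FUN = min)
}

files <- list.files(path = log_dir, pattern = "2011-07-20", full.names = TRUE)

# mc.cores caps the number of worker processes; forking is unavailable on Windows
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
pieces <- mclapply(files, read_one_log, mc.cores = n_cores)

# Bind once at the end instead of growing DF inside a loop
DF <- do.call(rbind, pieces)
print(DF)
```

Binding once after the parallel step also avoids the repeated rbind() inside the loop, which copies the growing data frame on every iteration.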
[R] R on Multicore for Linux
Hi all,

I have R installed on a machine with 16 cores running Red Hat Linux. I am handling a huge dataset (about 5 GB). Let's assume that my data is in the form of multiple structured log files. I access the data by using list.files(). Since the basic version of R uses a single core by default, my analysis code takes too much time to run. I learned that mclapply() can be used to spread work across all cores (processors) and make R much faster on multicore machines. Can anyone help me understand how to use the mclapply() function in this situation?

Thanks in advance.

Regards,
Madana

--
View this message in context: http://r.789695.n4.nabble.com/R-on-Multicore-for-Linux-tp3682318p3682318.html
Re: [R] Splitting one column value into multiple rows
Hi David,

Please find the details of my query below. I would appreciate your help in getting this resolved.

    # TESTING is my dataset with almost 40K rows. I am importing it from my local desktop.
    TESTING <- read.table("/Users/madana/Desktop/testing.txt", header=FALSE, sep="\t", na.strings="", dec=".", strip.white=TRUE)
    TESTING

    # I tried the following two ways. Let me know if I am using the right syntax.
    Lines <- readLines(textConnection(data.frame(TESTING$V1)))
    # Error message is: Error in textConnection(data.frame(TESTING$V1)) : invalid 'text' argument

    Lines <- readLines(textConnection(data.frame(TESTING, header=FALSE, sep="\t", na.strings="", dec=".", strip.white=TRUE)))
    # Error message is: Error in textConnection(data.frame(TESTING, header = FALSE, sep = "\t", : argument 'object' must deparse to a single character string

    closeAllConnections()
    newlines <- strsplit(Lines, ":")
    # Error message is: Error in strsplit(Lines, ":") : non-character argument
    newlines2 <- unlist(newlines)
    cleaned_data <- read.table(textConnection(newlines2), sep=",")
    # Error message is: Error in textConnection(newlines2) : invalid 'text' argument

My machine configuration is: dual core.

Thanks and regards,
Madana

--
View this message in context: http://r.789695.n4.nabble.com/Splitting-one-column-value-into-multiple-rows-tp3668835p3674386.html
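The errors above arise because textConnection() expects a character vector, while data.frame(...) passes it a data frame. A minimal sketch of one way around this, assuming the whole record sits in column V1 as text: take the column as a character vector, split at ":", and parse the pieces with read.table(text = ...). The TESTING data frame here is built from the six sample lines in the original question instead of being read from the file.

```r
# Stand-in for the imported dataset: one text column, V1 (sample rows from the thread)
TESTING <- data.frame(
  V1 = c("rent,100,1,common,674",
         "pipe,200,0,usual,864",
         "car,300,1,uncommon,392:jump,700,0,common,664",
         "car,200,1,uncommon,864:snap,900,1,usual,746",
         "stint,600,1,uncommon,257",
         "pull,800,0,usual,594"),
  stringsAsFactors = FALSE
)

# Work on the column itself, as a character vector
raw_lines <- as.character(TESTING$V1)

# Split every line at ":" and flatten, so each record becomes its own element
pieces <- unlist(strsplit(raw_lines, ":", fixed = TRUE))

# Parse the comma-separated records into columns
cleaned_data <- read.table(text = pieces, sep = ",", header = FALSE,
                           stringsAsFactors = FALSE)
print(cleaned_data)
```

The same three steps should apply to the full 40K-row dataset, since strsplit() and read.table() are vectorised over the whole column.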
Re: [R] Splitting one column value into multiple rows
Hi,

This works when I have only a few lines and type those input lines into the R window. But I want to apply this function to a variable that is part of a dataset, and the dataset is very large. Any help with this would be much appreciated.

Regards,
Madana

--
View this message in context: http://r.789695.n4.nabble.com/Splitting-one-column-value-into-multiple-rows-tp3668835p3671087.html
[R] Splitting one column value into multiple rows
Hi,

I have data in the following format:

    rent,100,1,common,674
    pipe,200,0,usual,864
    car,300,1,uncommon,392:jump,700,0,common,664
    car,200,1,uncommon,864:snap,900,1,usual,746
    stint,600,1,uncommon,257
    pull,800,0,usual,594

I want to turn the above 6 lines into the 8 lines below (splitting rows 3 and 4 at ":" and sending the second part to a new row):

    rent,100,1,common,674
    pipe,200,0,usual,864
    car,300,1,uncommon,392
    jump,700,0,common,664
    car,200,1,uncommon,864
    snap,900,1,usual,746
    stint,600,1,uncommon,257
    pull,800,0,usual,594

I would appreciate help from anyone who can show me how to get this done.

Regards,
Madana

--
View this message in context: http://r.789695.n4.nabble.com/Splitting-one-column-value-into-multiple-rows-tp3668835p3668835.html