[R] writing output directly to file; sink?
Dear all,

I am working with large matrices (19.6 million elements * 1000 simulations) and am trying to get around memory problems and vector length issues. I’ve split the inputs so that the output vector length will not exceed 2^31. Working on a 64-bit machine with 80GB RAM, I still get close to the memory limits when allowing output to be held in memory (as would be expected).

Originally, I planned to write the output to a file after it was produced in memory. It took 15 min for the output to be produced, but it has now been writing to a file for almost an hour (and is still going).

Is the recommended way to manage large output like this to write it directly to a file? Can that be done as the output is produced, so that memory usage does not build up (i.e., so it is not stored in memory)? Is that what sink() is designed to do? I’ve been trying to find information about this in the help archive as well as in the R Data Import/Export Manual (http://stat.ethz.ch/R-manual/R-devel/doc/manual/R-data.html), but it’s still not clear to me.

Many thanks for your guidance.

Some more info:

    dim(stPte801)
    NULL   #it's a vector
    length(stPte801)
    [1] 1965705000
    write(stPte801, "stPte801.txt", sep="\n")   #it's been writing for almost an hour...
    #eventually, I will need to pull it back into R to do the next step (but after the other variables are created)

--
View this message in context: http://r.789695.n4.nabble.com/writing-output-directly-to-file-sink-tp4506432p4506432.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
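On the question of writing output as it is produced: one possible approach, sketched below, is to open a binary file connection and append each chunk with writeBin() as soon as it is computed, so that only one chunk is in memory at a time. This is an illustration under assumptions, not the poster's actual code: simulate_chunk(), the chunk sizes, and the file path are all hypothetical stand-ins.

```r
# Sketch: stream results to a binary file chunk by chunk.
# All names and sizes here are hypothetical placeholders.
out_path  <- tempfile(fileext = ".bin")
n_chunks  <- 4L
chunk_len <- 1000L

# Stand-in for the real simulation step.
simulate_chunk <- function(i, n) {
  set.seed(i)
  runif(n)
}

con <- file(out_path, open = "wb")
for (i in seq_len(n_chunks)) {
  chunk <- simulate_chunk(i, chunk_len)
  writeBin(chunk, con)   # doubles are written as raw 8-byte values, no text conversion
  rm(chunk)              # drop the chunk before computing the next one
}
close(con)

# Read the values back later; 'n' is the maximum number of items to read.
con <- file(out_path, open = "rb")
restored <- readBin(con, what = "double", n = n_chunks * chunk_len)
close(con)

length(restored)
```

Because the values never pass through a character representation, this avoids the conversion cost that write() pays, at the price of a file that is not human-readable.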
Re: [R] writing output directly to file; sink?
Have you tried 'save' to save the output? 'write' is probably spending a lot of time converting to character. Are you just going to read it back into R for processing? If so, 'save' will probably be faster.

On Mon, Mar 26, 2012 at 12:53 PM, Diann Prosser dpros...@usgs.gov wrote:
> It took 15min for the output to be produced, but now it's been working on writing it to a file for almost an hour (and ongoing).

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
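The difference between the two approaches can be tried on a small stand-in vector. The sketch below is only illustrative: the vector size and file paths are made up, and timings will vary by machine. It writes the same data once as text with write() and once in R's binary format with save(), then checks the round trip.

```r
x <- runif(1e5)   # small stand-in for the real, much larger vector

txt_path <- tempfile(fileext = ".txt")
rda_path <- tempfile(fileext = ".RData")

# Text output: every number is converted to characters first.
t_text <- system.time(write(x, txt_path, sep = "\n"))

# Binary output: the object is serialized without text conversion.
t_save <- system.time(save(x, file = rda_path))

# Round trip through save()/load(): the object comes back under its own name.
original <- x
rm(x)
load(rda_path)
same_after_load <- identical(x, original)

# Round trip through text: values are printed with limited digits,
# so compare with a tolerance rather than expecting exact equality.
from_text <- scan(txt_path, quiet = TRUE)
close_after_text <- isTRUE(all.equal(from_text, original, tolerance = 1e-6))

rbind(text = t_text, save = t_save)
```

Note that load() restores the object under the name it had when saved, which is why 'x' reappears without an assignment.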
Re: [R] writing output directly to file; sink?
Thank you, Jim! I was just trying this as you wrote. Testing it on a small sample - it seems to work!

I am curious - I removed the data that was stored in memory (using rm()), and checked that it was gone using ls() - and it was; but I didn't see a corresponding reduction in the Memory Usage in my Performance window. Did the rm() actually free up memory (it didn't seem to)?

Many thanks for your response!

PS - for anyone interested, the sink function did not help. My vector was longer than what is printed, so when I checked the file, only part of the data was there. But save definitely seems to be the way to go here.
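On the rm() question: rm() only removes the name, and the memory is reclaimed when R next runs its garbage collector, which gc() can trigger explicitly. Whether the operating system's monitor then shows a drop depends on the platform. A small sketch (the object size is arbitrary and chosen only for illustration):

```r
big <- numeric(5e6)   # roughly 40 MB of doubles

before <- gc()        # memory-use summary, taken after a collection
rm(big)               # removes the binding only
after <- gc()         # collection reclaims the now-unreferenced vector

# Vector cells (8 bytes each) released between the two snapshots.
freed_vcells <- before["Vcells", "used"] - after["Vcells", "used"]
freed_vcells
```

With a small test object the change can be too small to notice in a system monitor, even though R has released the memory internally.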
Re: [R] writing output directly to file; sink?
And to answer your question, Jim:

> Are you just going to read it back into R for processing; if so 'save' will probably be faster.

Yes, that is my intent. This is a great solution - infinitely better than the writing I was trying to do.
Re: [R] writing output directly to file; sink?
'sink' would probably not help since it is just capturing output to the console and you are still doing the binary to character conversion. 'save' helps to avoid that. You might want to see if there is any difference with using compression (which I think is the default for 'save') as opposed to writing directly. This is a time vs. space tradeoff and with the size of your object, you may want compression, but try it and see if it makes a difference.

On Mon, Mar 26, 2012 at 4:08 PM, Diann Prosser dpros...@usgs.gov wrote:
> PS - for any interested, the sink function did not help. My vector was longer than what is printed, so when I checked the file, only part of the data was there. But save definitely seems to be the way to go here.

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
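The time vs. space tradeoff mentioned above can be measured directly with the 'compress' argument of save(). The sketch below uses made-up, deliberately compressible data; real simulation output may compress much less, so the numbers are only indicative.

```r
x <- round(runif(2e5), 2)   # hypothetical data; real compressibility will differ

f_comp <- tempfile(fileext = ".RData")
f_raw  <- tempfile(fileext = ".RData")

t_comp <- system.time(save(x, file = f_comp, compress = TRUE))
t_raw  <- system.time(save(x, file = f_raw,  compress = FALSE))

sizes <- c(compressed = file.size(f_comp), uncompressed = file.size(f_raw))

# Both files restore the same object; load into separate environments to compare.
e1 <- new.env(); load(f_comp, envir = e1)
e2 <- new.env(); load(f_raw,  envir = e2)
same_object <- identical(e1$x, e2$x)

sizes
rbind(compressed = t_comp, uncompressed = t_raw)
```

If disk space is plentiful and write time is the bottleneck, the uncompressed variant may be the better choice; otherwise compression usually pays for itself.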
Re: [R] writing output directly to file; sink?
Testing it on the full 1000 simulations:
- used 54GB RAM.
- after saving then removing - using rm() - the memory usage did decrease significantly.
- I guess my test run was too small to show a noticeable difference.
- thanks again.
Re: [R] writing output directly to file; sink?
Sorry - I was so excited it was working that I didn't see your reply:

> What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.

The answer is that I was trying to figure out the most efficient way (mainly in time, more so than physical space) to save my output so I could clear the memory and R could move on to processing the next variable. In the end, I will need to pull the variables back in and do further work on them.

Your posts were very helpful. I now understand what sink was doing, which was not accomplishing my goal. Save is working very well, and I'm able to clear the memory before moving to the next step. Thanks a million.
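The workflow described here (produce a variable, save it, clear memory, move on, reload later) might look roughly like the following. This is a sketch under assumptions: make_variable() is a hypothetical stand-in for the real computation, and the second variable name and the output directory are invented for illustration.

```r
out_dir <- tempfile("sim_out_")
dir.create(out_dir)

var_names <- c("stPte801", "stPte802")   # second name is hypothetical

# Stand-in for the real, expensive computation.
make_variable <- function(name) runif(1000)

for (nm in var_names) {
  assign(nm, make_variable(nm))                        # create the variable under its name
  save(list = nm, file = file.path(out_dir, paste0(nm, ".RData")))
  rm(list = nm)                                        # drop it from the workspace
  invisible(gc())                                      # reclaim memory before the next variable
}

# Later, once all variables have been written, pull them back in.
for (nm in var_names) {
  load(file.path(out_dir, paste0(nm, ".RData")))
}

sapply(var_names, function(nm) length(get(nm)))
```

save(list = ...) takes object names as strings, which is what makes the loop over names possible.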
Re: [R] Writing output into a file
Hi everyone,

I tried writing this data into a file using the save(myList, file="test1.bin") command, but unfortunately, the numerical values seem to get garbled when I do so.

The numbers in my RGui look like 0, 0.5, 0, 1 etc. But when I stored them in a .bin file and retrieved them using Java code, it returned data such as:

    2272919233031569408
    1701436416123530
    -2278152494445862686
    7161955281552955800

I also tried the second method (using a # Open a file connection). Unfortunately, here too the data gets extremely garbled.

Has anyone faced such a situation before? Any help / comments / useful links would be much appreciated.

Thanks and best regards,
Suranga

On Mon, Feb 13, 2012 at 10:37 AM, Suranga Kasthurirathne suranga...@gmail.com wrote:
> Thank you very much for sharing these ideas. I really appreciate them. Let me go try them out :-)
>
> On Mon, Feb 13, 2012 at 4:37 AM, Rui Barradas rui1...@sapo.pt wrote:
>> # Write the file
>> save(myList, file="test1.bin")

--
Best Regards,
Suranga
Re: [R] Writing output into a file
Hello,

> I tried writing this data into a file using the save(myList, file="test1.bin") command, but unfortunately, the numerical values seem to get garbled when I do so. The numbers in my RGui look like 0, 0.5, 0, 1 etc. etc. But when I stored it into a .bin file, and retrieved it using java code, it returns data such as,

The problem is most likely in the Java side: 'save' writes R's own format, RDA, which is not a plain array of numbers. You can use 'ascii=TRUE' and inspect the file with a text editor. Also see ?save

> I also tried the second method (using a # Open a file connection) Unfortunately, here too the data gets extremely garbled.

I don't understand why; check the output file with a text editor and let us know what is wrong. The only problem I've seen is that the data comes back from 'strsplit' as character rather than numeric, but this is easy to solve.

Does your list have sub-lists?

Rui Barradas
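Both suggestions in this reply can be tried on a toy list. The sketch below is illustrative only: 'myList' and its contents are made up, and the paths are temporary files. It saves with ascii=TRUE so the file can be opened in a text editor, then writes a plain comma-separated text file (the kind of file another language can parse) and converts the strsplit() result back to numeric.

```r
myList <- list(a = c(0, 0.5, 0, 1), b = c(1, 0.25))   # hypothetical example data

# 1. Same RDA format as before, but in a text encoding that an editor can display.
ascii_path <- tempfile(fileext = ".txt")
save(myList, file = ascii_path, ascii = TRUE)
head(readLines(ascii_path), 3)   # still R's own format, not a list of plain numbers

# 2. Plain text, one list element per line, values separated by commas.
txt_path <- tempfile(fileext = ".txt")
writeLines(vapply(myList, function(x) paste(x, collapse = ","), character(1)),
           txt_path)

# strsplit() returns character vectors; as.numeric() converts them back.
parts    <- strsplit(readLines(txt_path), split = ",", fixed = TRUE)
restored <- lapply(parts, as.numeric)
restored
```

The second file contains only digits, commas, and newlines, so a program in another language can read it line by line without knowing anything about R's serialization format.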
Re: [R] Writing output into a file
Hi, many thanks for the reply. I really appreciate it. Since I'm still very new to R, I think I should take some time to research what you suggested. (I don't want to keep posting basic questions to the list all the time) But still, thank you so much for being helpful...

On Mon, Feb 13, 2012 at 7:23 PM, Rui Barradas rui1...@sapo.pt wrote:
> The problem should be in the use of java, 'save' uses a R format , RDA. You can use 'ascii=TRUE'and see it with a text editor. Also see ?save

--
Best Regards,
Suranga
[R] Writing output into a file
Hi everyone,

I'm an R newbie working with the poLCA package. I achieved my target without having to bother anyone, but it seems that I've got stuck at the last minute.

My problem is simple. I need to write my results into a file. My results are in the shape of a list (unbalanced columns).

I've considered several methods (sink(), write.file, etc.). Unfortunately, I'm not the best brain in the market on this subject. I've also faced some difficulty in converting the list so that it can be written using write.file().

Therefore, I'm wondering if anyone can point me towards a good example that shows me how to write a list into a file safely.

--
Thanks and Best Regards,
Suranga
Re: [R] Writing output into a file
Hi, could you paste the results?

Alfredo

2012/2/12 Suranga Kasthurirathne suranga...@gmail.com:
> My problem is simple. I need to write my results into a file. My results are in the shape of a list (unbalanced columns)
Re: [R] Writing output into a file
Hello,

One way is

    # Write the file
    save(myList, file="test1.bin")
    # Reload the data, under the same name, 'myList'
    load(file="test1.bin")

Another way is a bit more complicated

    # Open a file connection and write the list to it (using comma as separator)
    fileCon <- file("test2.txt", open="wt")
    lapply(myList, function(x) writeLines(paste(x, collapse=","), con=fileCon))
    close(fileCon)
    # Load the data, maybe under another name
    strsplit(readLines(con="test2.txt"), split=",")

If you use the first method, the list is retrieved as it was. If you use the second, you lose the list's members' names.

Hope this helps,

Rui Barradas
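If the element names matter, the text-based method can be varied slightly so that each line starts with the element's name. The following is a sketch of that variation, not part of the original suggestion; it assumes every element is a numeric vector and that the names contain no commas, and 'myList' here is made-up example data.

```r
myList <- list(first = c(0, 0.5, 0, 1), second = c(1, 0.25, 0.75))   # hypothetical

path <- tempfile(fileext = ".txt")

# One line per element: name first, then the values, all comma-separated.
lines_out <- paste(names(myList),
                   vapply(myList, paste, character(1), collapse = ","),
                   sep = ",")
writeLines(lines_out, path)

# Read back: the first field of each line is the name, the rest are values.
fields   <- strsplit(readLines(path), split = ",", fixed = TRUE)
restored <- lapply(fields, function(f) as.numeric(f[-1]))
names(restored) <- vapply(fields, `[`, character(1), 1)

identical(restored, myList)
```

Elements of different lengths are handled naturally, since each line is split on its own.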
Re: [R] Writing output into a file
Hi,

Thank you very much for sharing these ideas. I really appreciate them. Let me go try them out :-)

On Mon, Feb 13, 2012 at 4:37 AM, Rui Barradas rui1...@sapo.pt wrote:
> If you use the first method, the list is retrieved as it was. If you use the second, you lose the list's members' names.

--
Best Regards,
Suranga