Hi Parag!

On Saturday 02 Jan 2010 19:56:02 Parag Kalra wrote:
> Hello All,
>
> Major part of my Perl scripting goes in processing text files. And most of
> the times I need huge sized text files ( 3 MB +) to perform benchmarking
> tests.
>
> So I am planing to write a Perl script which will create huge sized text
> file of the sample file which it will receive as first Input parameter. I
> have following algorithm in mind:
>
> 1. Provide 2 input parameters to the Perl script - (i) Sample file, (ii)
> Size of the new file
> EG: - To create a new file of size 3 MB -
> perl Create_Huge_File.pl Sample.txt 3
>
> 2. Read the input file and store the contents into an array.
>

Why an array? Reading the whole file into a single string would be faster, use less memory, and be simpler to write back out. See:

http://www.perl.com/pub/a/2003/11/21/slurp.html

> 3. Create a new file.
>
> 4. Dump the contents of the above array into the new file.
>

Again, use a string.

> 5. Check the length of the new file. If it is less than second input
> parameter, repeat step 4 or else goto step 6.
>

You can keep track of the current length in a variable, or use tell():

http://perldoc.perl.org/5.8.8/functions/tell.html

> 6. Close the new file.
>

OK.

> I have following questions:
>
> a.) What do I need to do to make sure that length of new file will increase
> every time the step 4 is executed.

Nothing. Just print to the output file-handle; each print appends to the file's contents and increases its size.

>
> b.) Since lot of I/O is involved is it the most optimised solution? If not,
> does any one has any better design to suffice my requirement.

It should be good enough. Perl does I/O quickly.

>
> c.) What are the likely bugs that may creep in with this algorithm.
>

Encoding problems, etc. Logistical problems.

I should note that, in general, your algorithm will produce repetitive text with very little entropy:

http://en.wikipedia.org/wiki/Entropy_%28information_theory%29

One option you may wish to take instead is to chain several different texts from sources of free online texts such as http://www.gutenberg.org/ or http://wikisource.org/ (and see also http://www.google.com/search?q=free%20online%20books ).

Regards,

	Shlomi Fish

-- 
Shlomi Fish       http://www.shlomifish.org/
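
P.S. Here is a minimal sketch of the steps discussed above (slurp the sample into one string, print it repeatedly, track the size in a variable). It is only an illustration, not a tested tool. The sub name create_huge_file, the output name "$sample.huge", and treating the size argument as 1 MB = 1024 * 1024 bytes are my own assumptions; none of them come from your message.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): repeat the sample's contents
# into $out_path until at least $target_bytes bytes have been written.
sub create_huge_file {
    my ($sample_path, $out_path, $target_bytes) = @_;

    # Slurp the whole sample into one string instead of an array of lines.
    open my $in, '<:raw', $sample_path
        or die "Cannot open '$sample_path': $!";
    my $text = do { local $/; <$in> };
    close $in;
    die "Sample file '$sample_path' is empty\n"
        unless defined $text && length $text;

    open my $out, '>:raw', $out_path
        or die "Cannot open '$out_path': $!";

    # Track the size in a variable; tell($out) would report the same number.
    my $written = 0;
    while ($written < $target_bytes) {
        print {$out} $text or die "Write to '$out_path' failed: $!";
        $written += length $text;
    }
    close $out or die "Cannot close '$out_path': $!";

    return $written;
}

# Usage: perl Create_Huge_File.pl Sample.txt 3
if (@ARGV == 2) {
    my ($sample, $size_mb) = @ARGV;
    # Assumption: the second argument is a size in megabytes.
    my $bytes = create_huge_file($sample, "$sample.huge",
                                 $size_mb * 1024 * 1024);
    print "Wrote $bytes bytes to $sample.huge\n";
}
```

Since whole copies of the sample are written, the result can overshoot the requested size by up to one sample length. The handles are opened with :raw so that length() counts bytes, which sidesteps the encoding problems mentioned above as long as you only care about file size.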