I wouldn't use a DBMS at all -- it is not necessary and I don't see what you would get in return. Instead I would split very large log files into a number of pieces so that each piece fits in memory (see below for an example), then process the pieces in a loop. See the list and the documentation if you have questions about how to read text files, count strings, etc.
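For the processing loop itself, something along these lines might do once the pieces exist. This is only a sketch: the function name count_in_pieces, the prefix "access" and the search string "ERROR" are made up for illustration, and it assumes split's default two-letter suffixes (aa, ab, ...). It counts matching lines per piece with grep and adds the counts up, so only one piece is read at a time.

```shell
#!/bin/sh
# Sketch: walk over the pieces written by split, one at a time, and add up
# per-piece counts. All names here are hypothetical, chosen for illustration.

# count_in_pieces PREFIX PATTERN
# Prints the total number of lines containing PATTERN (a fixed string)
# across all files named PREFIX followed by two lowercase letters, which is
# how split names its output by default (PREFIXaa, PREFIXab, ...).
count_in_pieces() {
    prefix=$1
    pattern=$2
    total=0
    for piece in "$prefix"[a-z][a-z]
    do
        # if the glob matched nothing, $piece is the literal pattern: skip it
        [ -f "$piece" ] || continue
        # grep -c prints the number of matching lines; it exits non-zero
        # when that number is 0, which is not an error here
        n=$(grep -c -F -- "$pattern" "$piece" || true)
        total=$((total + n))
    done
    echo "$total"
}

# Example call (assumes pieces accessaa, accessab, ... in the current directory):
#   count_in_pieces access "ERROR"
```

The same loop shape works if each piece is handed to R instead of grep; the point is only that the per-piece results are small and can be combined afterwards.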

#---split big files in two---
for F in *log
do
    fn=`echo "$F" | awk -F. '{print $1}'`   # file name up to the first dot
    ln=`wc -l < "$F"`                        # number of lines in the file
    forsplit=`expr $ln / 2 + 50`             # no. of lines in each chunk, tweak as needed
    echo "Splitting $F into pieces of $forsplit lines each........"
    split -l $forsplit "$F" "$fn"            # writes ${fn}aa, ${fn}ab, ...
done

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Gabriel Diaz
> Sent: Monday, June 12, 2006 9:52 AM
> To: Jean-Luc Fontaine
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] R usage for log analysis
>
> Hello
>
> Thanks all for the answers.
>
> I'm taking an overview to the project documentation, and seems the
> database is the way to go to handle log files of GB order (normally
> between 2 and 4 GB each 15 day dump).
>
> In this document http://cran.r-project.org/doc/manuals/R-data.html,
> says R will load all data into memory to process it when using
> read.table and such. Using a database will do the same? Well,
> currently i have no machine with > 2 GB of memory.
>
> The moodss thing looks nice, thanks for the link. But what i have to
> do now is an offline analysis of big log files :-). I will try to go
> with the mysql -> R way.
>
> gabi
>
> On 6/12/06, Jean-Luc Fontaine <[EMAIL PROTECTED]> wrote:
> > Allen S. Rout wrote:
> > >
> > > Don't expect a warm welcome. This community is like all open-source
> > > communities, sharply focused on its' own concerns and expertise. And,
> > > in an unusual experience for computer types, our core competencies
> > > hold little or no sway here; they don't even give us much of a leg up.
> > > Just wait 'till you want to do something nutso like produce a business
> > > graphic. :)
> > >
> > > I'm working on understanding enough of R packaging and documentation
> > > to begin a 'task view' focused on systems administration, for humble
> > > submission. That might end up being mostly "log analysis"; the term
> > > can describe much of what we do, if it's stretched a bit. I'm hoping
> > > the task view will attract the teeming masses of sysadmins trapped in
> > > the mire of Gnuplot and friends.
> >
> > Although not specifically solving the problem at hand, you might want
> > to take a look at moodss and moomps (http://moodss.sourceforge.net/),
> > modular monitoring applications, which uses R
> > (http://jfontain.free.fr/statistics.htm) and its log module
> > (http://jfontain.free.fr/log/log.htm).
> >
> > --
> > Jean-Luc Fontaine http://jfontain.free.fr/
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html