Hi Xiao!

On Thursday 03 Dec 2009 14:19:10 Orchid Fairy (兰花仙子) wrote:
> Thanks all.
> How about the files parsing with huge size (about 1T of each day)?
>
> The basic logic is:
>
> reach each line of every file (many files, each is gziped)
> look for special info (like IP, request url, session_id, datetime etc).
> count for them and write the result into a database
> generate the daily report and monthly report
>
> I'm afraid perl can't finish the daily job so I want to know the speed
> difference between perl and C for this case.
>
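To make the discussion concrete, here is a minimal sketch of what a Perl prototype of the first three steps might look like. It rests on assumptions that are not in your message: the log format is unknown to me, so the sketch simply treats the first whitespace-separated field of each line as the IP, the function name count_ips_in_file is made up for illustration, and the database step is replaced by printing the counts. It uses IO::Uncompress::Gunzip, which ships with recent versions of Perl.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use IO::Uncompress::Gunzip qw($GunzipError);

# Count how often each IP appears in one gzipped log file.
# Hypothetical helper: the name and the log format are assumptions.
sub count_ips_in_file
{
    my ($filename, $counts) = @_;

    my $in = IO::Uncompress::Gunzip->new($filename)
        or die "Cannot gunzip '$filename': $GunzipError";

    while (defined(my $line = $in->getline()))
    {
        # Assumption: the IP is the first whitespace-separated field.
        if (my ($ip) = $line =~ m{\A(\S+)})
        {
            $counts->{$ip}++;
        }
    }

    $in->close();

    return;
}

my %counts;

# The gzipped log files are given on the command line.
foreach my $filename (@ARGV)
{
    count_ips_in_file($filename, \%counts);
}

# A real job would write these to the database; this only prints them.
foreach my $ip (sort keys %counts)
{
    print "$ip\t$counts{$ip}\n";
}
```

Running something like this over a single day's file under time(1), and comparing it with the time that a plain "gzip -dc" of the same file takes, should give a rough idea of how much of the cost is Perl and how much is I/O and decompression.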
OK, I assume you've tested the Perl code. If not - try it, because writing it
in Perl would take much less time than writing it in C and would also serve as
a useful prototype.

I can imagine that Perl would be unable to handle such a load. However, with
1 TB of gzipped data, it's very possible that your problem is bound by I/O and
by the cost of gunzipping rather than by the language. If so, you may need to
throw better iron at the problem. I know many insurance companies, banks, etc.
are still using IBM mainframe machines (zSeries, iSeries, etc.) because they
have really good I/O which cannot easily be matched using commodity PC
hardware. (I'm not suggesting you buy something that excessive, but you may
have to buy better hardware along similar lines.)

That put aside, if you still want to try writing the C or C++ program, then I
suggest looking at some of the following abstraction libraries:

http://www.shlomifish.org/open-source/portability-libs/

They are helpful and provide APIs similar to the Perl built-ins and to some
CPAN modules. They allow you to write Perl-like code in C without having to
implement the lower-level details yourself (though the code will still be more
verbose and less idiomatic than in Perl, which is to be expected of C and
C++). Because they are generic and written with general-purpose use in mind,
they may incur a small run-time overhead, but I doubt it will break you in
most cases.

Regards,

	Shlomi Fish

> // Xiao lan
>

-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
Best Introductory Programming Language - http://shlom.in/intro-lang

Chuck Norris read the entire English Wikipedia in 24 hours. Twice.