a reprise

2009/3/23 Avishalom Shalit <[email protected]>:
> well, Matlab has a limited embedded perl interpreter, so
> theoretically you could do this from within,
> but i find it easier to work outside of matlab with perl.
>
> let me give you a use case or two.
>
> imagine a standard server log file that contains a url, a referrer and
> an ip address,
> now imagine it is 1GB in size,
> and you want to ask some questions.
>
> a- map the traffic inside the website, citing both unique visitations
> and total visitations.
> example question: find the most frequent cycle of pages larger than 5 links.
> b- plot the histograms for referrers, page hits and ip activity
>
> so. once you have the data in the right format it is easy with matlab.
> BUT matlab is a bit heavy and slow on text processing
> (especially if the delimiting character isn't a space)
>
> so in this case i would use perl to create two or three dictionaries.
> e.g.
>
> perl -F, -anle 'unless ($seen{$F[0]}++) { $urldict[$pagecounter]=$F[0];
> $urlnum{$F[0]}=$pagecounter++ } .... print $urlnum{$F[0]} .....}'
> and sometimes even -i
>
> (or i would have used a file with "strict" and "my" of course :-) )
>
> now i have a lookup file (because i printed it in the END to a file)
> 0 www.google.co
> 1 www.bbc.co.uk
> etc.
> and another
>
> 0 123.123.123.123
> 1 321.321.321.321
>
> etc.
>
>
> and the main log file looks like this
> 0 1 0
> 0 1 0
> 0 1 1
> 2 1 0
> 2 1 2
> 3 0 2
>
> ......
> this, matlab slurps in a second, and recognizes as numerical data
> (even if bitwise it is strings, i.e. a text file).
> if your urls contain some non-english characters (?query=שדג)
> you have no choice, even if you were willing to let matlab sweat some strings
>
> the dictionary files are much smaller now and pose no problem to read
> and use as labels.
>
> {then, using either accumarray or sparse, i get this into an
> adjacency matrix inside matlab.
> etc }
>
>
> ------
> another use case would be a matlab script that does a certain
> computation (that may take a few hours) over different parameters.
> this could be done internally, but there are some advantages to doing it
> from an outside script, e.g. multiple cores, unstable machines that drop
> your computation (because someone didn't plug the fan in and it got hot)
>
> 2009/3/23 Gabor Szabo <[email protected]>:
>> On Thu, Mar 12, 2009 at 2:29 PM, Avishalom Shalit <[email protected]>
>> wrote:
>>> to cover a different benefit.
>>>
>>> I have often found myself preformatting data files (for example to be
>>> used in matlab) with perl.
>>> i may have been able to do this with awk, but i am not fluent in awk.
>>
>> I never used Matlab but I often encounter people in my classes who are
>> talking about using Perl and Matlab together. So far I have not managed
>> to understand what this means. I'd really appreciate if you wrote a
>> couple of examples on how you used the two together (and why :-).
>>
>>
>> regards
>> Gabor
>>
>> --
>> Gabor Szabo http://szabgab.com/blog.html
>> Perl Training in Israel http://www.pti.co.il/
>> Test Automation Tips http://szabgab.com/test_automation_tips.html
>> _______________________________________________
>> Perl mailing list
>> [email protected]
>> http://mail.perl.org.il/mailman/listinfo/perl
>>
>
>
>
> --
> -- vish
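
for reference, a minimal sketch of the dictionary step described above, written as a small script with "strict" and "my" rather than as a one-liner. the sub names (id_for, encode_log), the sample log lines, and the choice to share one dictionary between urls and referrers are assumptions made for illustration; the original command elides those details.

```perl
use strict;
use warnings;

# Return the integer id for $key, assigning the next free id on first sight.
# %$ids maps string => id, @$names maps id => string (the lookup file).
sub id_for {
    my ($ids, $names, $key) = @_;
    unless (exists $ids->{$key}) {
        push @$names, $key;
        $ids->{$key} = $#$names;    # index of the element just pushed
    }
    return $ids->{$key};
}

# Encode comma-separated "url,referrer,ip" lines as rows of integer ids.
# Assumption: urls and referrers share one page dictionary, ips get their own.
sub encode_log {
    my (@lines) = @_;
    my (%page_id, @pages, %ip_id, @ips, @rows);
    for my $line (@lines) {
        chomp $line;
        my ($url, $ref, $ip) = split /,/, $line;
        my $u = id_for(\%page_id, \@pages, $url);
        my $r = id_for(\%page_id, \@pages, $ref);
        my $i = id_for(\%ip_id,   \@ips,   $ip);
        push @rows, "$u $r $i";
    }
    return (\@rows, \@pages, \@ips);
}

# Hypothetical sample input, one "url,referrer,ip" record per line.
my @sample = (
    "www.example.com/a,www.example.com/b,10.0.0.1",
    "www.example.com/a,www.example.com/b,10.0.0.1",
    "www.example.com/c,www.example.com/b,10.0.0.2",
);

my ($rows, $pages, $ips) = encode_log(@sample);

print "$_\n" for @$rows;                        # numeric main log
print "$_ $pages->[$_]\n" for 0 .. $#$pages;    # page lookup file
print "$_ $ips->[$_]\n"   for 0 .. $#$ips;      # ip lookup file
```

the numeric rows are what matlab would read, and the two lookup lists are what would be used as labels afterwards. on a real 1GB log the same subs could be fed line by line from a filehandle instead of from an in-memory list.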
--
-- vish
