Re: check html file size

2005-10-07 Thread Xah Lee
Xah Lee wrote: « would anyone like to translate the following perl
script to Python or Scheme (scsh)?»

Here's the Python version.

# -*- coding: utf-8 -*-
# Python


# Wed Oct  5 15:50:31 PDT 2005
# given a dir, report all html file's size. (counting inline images)
# XahLee.org

import re, os.path, sys

inpath= '/Users/t/web/'

# get rid of trailing slash
while inpath[-1] == '/': inpath = inpath[0:-1]

if (not os.path.exists(inpath)):
    print "dir " + inpath + " doesn't exist!"
    sys.exit(1)

##
# subroutines


def getInlineImg(file_full_path):
    '''getInlineImg(file_full_path) returns a list of inline image
    links. For example, it may return ['xx.jpg','../image.png']'''

    FF = open(file_full_path,'rb')
    txt_segs = re.split( r'src', unicode(FF.read(),'utf-8'))
    txt_segs.pop(0)
    FF.close()
    linx=[]
    for linkBlock in txt_segs:
        matchResult = re.search(r'\s*=\s*\"([^\"]+)\"', linkBlock)
        if matchResult: linx.append( matchResult.group(1) )
    return linx


def linkFullPath(dir,locallink):
    '''linkFullPath(dir, locallink) returns a string that is the full
    path to the local link. For example,
    linkFullPath('/Users/t/public_html/a/b', '../image/t.png') returns
    '/Users/t/public_html/a/image/t.png'. The returned result will not
    contain double slash or '../' string.'''
    result = dir + '/' + locallink
    result = re.sub(r'//+', r'/', result)
    while re.search(r'/[^\/]+\/\.\.', result):
        result = re.sub(r'/[^\/]+\/\.\.', '', result)
    return result

def listInlineImg(htmlfile):
    '''listInlineImg(html_file_full_path) returns a list where each
    element is a full path to an inline image in the html.'''
    dir=os.path.dirname(htmlfile)
    imgPaths = getInlineImg(htmlfile)
    result = []
    for aPath in imgPaths:
        result.append(linkFullPath( dir, aPath))
    return result


##
# main

fileSizeList=[]
def checkLink(dummy, dirPath, fileList):
    for fileName in fileList:
        if '.html' == os.path.splitext(fileName)[1] and os.path.isfile(dirPath+'/'+fileName):
            totalSize = os.path.getsize(dirPath+'/'+fileName)
            imagePathList = listInlineImg(dirPath+'/'+fileName)
            for imgPath in imagePathList:
                # skip images that are missing or not local files;
                # os.path.getsize would raise an error on them
                if os.path.isfile(imgPath): totalSize += os.path.getsize(imgPath)
            fileSizeList.append([totalSize, dirPath+'/'+fileName])


os.path.walk(inpath, checkLink, 'dummy')

fileSizeList.sort(key=lambda x:x[0],reverse=True)

for it in fileSizeList: print it
print "done reporting."



-
This Python version is a direct translation of the Perl version. They
match pretty much line by line.
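
The path cleanup that linkFullPath does with regular expressions can probably also be done with the standard library. The following is only a sketch in Python 3, not the code from the script above; the name link_full_path is made up for illustration. One difference to be aware of: posixpath.join discards the directory if the link is an absolute path, and normpath leaves a leading pair of slashes alone.

```python
import posixpath

def link_full_path(directory, local_link):
    """Join a directory and a relative link, collapsing '//' and '../'.

    Hypothetical stand-in for linkFullPath; uses posixpath so the
    result uses forward slashes on every platform.
    """
    return posixpath.normpath(posixpath.join(directory, local_link))

print(link_full_path('/Users/t/public_html/a/b', '../image/t.png'))
# prints /Users/t/public_html/a/image/t.png
```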

For both the Python version and the Perl version, see:
 http://xahlee.org/perl-python/check_html_size.html
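
For readers on Python 3, where os.path.walk and the print statement no longer exist, the same idea might look roughly like the sketch below. It is an assumption-laden rewrite rather than the script from this post: the names SRC_RE, inline_images and html_sizes are invented here, the regex is somewhat stricter than the split-on-'src' approach, and images that are not local files are skipped.

```python
import os
import re

# Matches src="..." attributes; like the original, it does not check the tag name.
SRC_RE = re.compile(r'src\s*=\s*"([^"]+)"', re.IGNORECASE)

def inline_images(html_path):
    """Return the src values found in html_path, resolved against its directory."""
    with open(html_path, encoding='utf-8', errors='replace') as f:
        text = f.read()
    base = os.path.dirname(html_path)
    return [os.path.normpath(os.path.join(base, src)) for src in SRC_RE.findall(text)]

def html_sizes(root):
    """Return [(total_bytes, path), ...] for every .html file under root, largest first."""
    report = []
    for dir_path, _dirs, files in os.walk(root):
        for name in files:
            if os.path.splitext(name)[1] != '.html':
                continue
            path = os.path.join(dir_path, name)
            total = os.path.getsize(path)
            for img in inline_images(path):
                if os.path.isfile(img):  # skip remote URLs and missing files
                    total += os.path.getsize(img)
            report.append((total, path))
    report.sort(key=lambda item: item[0], reverse=True)
    return report

if __name__ == '__main__':
    # Same starting directory as inpath in the script above.
    for total, path in html_sizes('/Users/t/web'):
        print(total, path)
```

Using os.walk avoids the callback and the module-level list that os.path.walk needs.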

Would any Lisper provide a Scheme version? I don't think I'll do a
Scheme version anytime soon. Please, Schemers, show us some fanfare.

 Xah
 [EMAIL PROTECTED]
∑ http://xahlee.org/

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: check html file size

2005-10-06 Thread Ulrich Hobelmann
Sherm Pendley wrote:
> I'm guessing you didn't get the joke then. I think Richard's response was a
> parody of Xah's style - a funny parody, at that.

If you take all the line noise in Perl as swearing ;)
I suppose I'm lucky I can't read it.

-- 
We're glad that graduates already know Java,
so we only have to teach them how to program.
somewhere in a German company
(credit to M. Felleisen and M. Sperber)


Re: check html file size

2005-10-06 Thread Richard Gration
On Wed, 05 Oct 2005 20:39:18 -0400, Sherm Pendley wrote:

> Richard Gration [EMAIL PROTECTED] writes:
>
>> Are you fucking seriously fucking expecting some fucking moron to
>> translate your tech geeking fucking code moronicity? Fucking try writing
>> it fucking properly in fucking Perl first.
>
> Good fucking job! That's the funniest fucking response I've ever fucking seen
> to Xah's fucking moronistic fucking nonsense.

Thanks, Sherm. I knew someone would get it. I think Bear and Ulrich
haven't yet been exposed to Xah in full effect ;-) They're probably
denizens of the Scheme group which seems to be a new entry on Xah's "this
newsgroup needs spamming" list ;-)

> Lenny Bruce would be so fucking proud.

LOL


Re: check html file size

2005-10-05 Thread Richard Gration
On Tue, 04 Oct 2005 17:44:02 -0700, Xah Lee wrote:

> would anyone like to translate the following perl script to Python or
> Scheme (scsh)?

Are you fucking seriously fucking expecting some fucking moron to
translate your tech geeking fucking code moronicity? Fucking try writing
it fucking properly in fucking Perl first.

-- 
I guess everybody's the same: Gotta be good at your job before you can enjoy 
the rest of your life
-- Cole Trickle



Re: check html file size

2005-10-05 Thread Sherm Pendley
Ulrich Hobelmann [EMAIL PROTECTED] writes:

> Richard Gration wrote:
>> Are you fucking seriously fucking expecting some fucking moron to
>> translate your tech geeking fucking code moronicity? Fucking try writing
>> it fucking properly in fucking Perl first.
>
> Fucking excuse me?
>
> Fucking maybe you should fucking go fucking fuck your fucking self...
>
> Seriously, Xah might be a troll, but this is just pathetic.

I'm guessing you didn't get the joke then. I think Richard's response was a
parody of Xah's style - a funny parody, at that.

sherm--

-- 
Cocoa programming in Perl: http://camelbones.sourceforge.net
Hire me! My resume: http://www.dot-app.org


Re: check html file size

2005-10-05 Thread Sherm Pendley
Richard Gration [EMAIL PROTECTED] writes:

> Are you fucking seriously fucking expecting some fucking moron to
> translate your tech geeking fucking code moronicity? Fucking try writing
> it fucking properly in fucking Perl first.

Good fucking job! That's the funniest fucking response I've ever fucking seen
to Xah's fucking moronistic fucking nonsense.

Lenny Bruce would be so fucking proud.

sherm--

-- 
Cocoa programming in Perl: http://camelbones.sourceforge.net
Hire me! My resume: http://www.dot-app.org


Re: check html file size

2005-10-05 Thread Ray Dillinger
Richard Gration wrote:

> ... fucking ... fucking ... fucking ... fucking ... Fucking ... fucking
> ... fucking

My friend, you can learn to use a far richer vocabulary of
obscenities.  If your creative flow is blocked by the fear
that you can't spell more dirty words correctly, you can
dispel this fear with a few evenings of study and preparation.

Amaze your friends!  Amuse your enemies!  Enrich your
vocabulary!  You can learn the joys of cussing seven
times in the same sentence without resorting to repetition!
For extra points, and with suitable study, you can even
learn to write entire paragraphs of _original_ obscenity!

Just imagine how much clearer your point would have been if
you'd called him a jizz-licking dogcock grabber!  Why insult
his code with a vague word like "moronicity" when you could
use "steaming pile of entrails" or better yet, "bucket of
fermented ballsweat"? Wouldn't that have made your technical
point much clearer?

Now go, and don't attempt obscenity in public again until
you learn how.

Bear




check html file size

2005-10-04 Thread Xah Lee
would anyone like to translate the following perl script to Python or
Scheme (scsh)?

The script takes an input path and reports all html files in it above a
certain size (counting inline images).
It also prints a sorted report of the html files and their sizes.
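
As a rough illustration of the two outputs described above (the over-the-limit warnings and the sorted report), here is a small Python 3 sketch. It assumes "above" means strictly greater than the limit; the function name report and the sample data are made up, and only the 800 * 1000 value comes from the Perl script below.

```python
SIZE_LIMIT = 800 * 1000  # bytes; same value as $sizeLimit in the Perl script

def report(sizes, limit=SIZE_LIMIT):
    """sizes: iterable of (total_bytes, path). Returns (oversized, ranked)."""
    ranked = sorted(sizes, key=lambda item: item[0], reverse=True)
    oversized = [(total, path) for total, path in ranked if total > limit]
    return oversized, ranked

# Hypothetical sizes, for illustration only.
sample = [(120000, 'a.html'), (950000, 'b.html'), (800000, 'c.html')]
oversized, ranked = report(sample)
for total, path in oversized:
    print('problem: file: %s, size: %d' % (path, total))
for total, path in ranked:
    print(total, path)
```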

(a copy of the script is here:
http://xahlee.org/_scripts/check_file_size.pl
)

 Xah
 [EMAIL PROTECTED]
∑ http://xahlee.org/


# perl

# Tue Oct  4 14:36:48 PDT 2005
# given a dir, report all html file's size. (counting inline images)
# XahLee.org

use Data::Dumper;
use File::Find;
use File::Basename;

$inpath = '/Users/t/web/mydirectory/';
$sizeLimit = 800 * 1000;

# $inpath = $ARGV[0]; # should give a full path; else the
# $File::Find::dir won't give full path.
while ($inpath =~ m@^(.+)/$@) { $inpath = $1;} # get rid of trailing slash

die "dir $inpath doesn't exist! $!" unless -e $inpath;


##
# subroutines


# getInlineImg($file_full_path) returns a array that is a list of
# inline images. For example, it may return ('xx.jpg','../image.png')
sub getInlineImg ($) { $full_file_name= $_[0];
    @linx =();
    open (FF, $full_file_name) or die "error: can not open $full_file_name $!";
    while (<FF>) { @txt_segs = split(m/img/, $_); shift @txt_segs;
        for $lin (@txt_segs) { if ($lin =~ m@ src\s*=\s*\"([^\"]+)\"@i) { push @linx, $1; }}
    } close FF;
    return @linx;
}

# linkFullPath($dir,$locallink) returns a string that is the full path
# to the local link. For example,
# linkFullPath('/Users/t/public_html/a/b', '../image/t.png') returns
# '/Users/t/public_html/a/image/t.png'. The returned result will not
# contain double slash or '../' string.
sub linkFullPath($$){ $result=$_[0] . $_[1];
    while ($result =~ s@\/\/@\/@) {};
    while ($result =~ s@/[^\/]+\/\.\.@@) {};
    return $result;}


# listLocalLinks($html_file_full_path) returns a array where each
# element is a full path of local links in the html.
# (not called in this script; getlinks is not defined here.)
sub listLocalLinks($) {
my $htmlfile= $_[0];

my ($name, $dir, $suffix) = fileparse($htmlfile, ('\.html') );
my @aa = getlinks($htmlfile);
@aa = grep(!m/\#/, @aa);
  @aa = grep (!m/^mailto:/, @aa);
  @aa = grep (!m/^http:/, @aa);

my @linkedFiles=();
foreach my $lix (@aa) { push @linkedFiles, linkFullPath($dir,$lix);}
return @linkedFiles;
}


# listInlineImg($html_file_full_path) returns a array where each
# element is a full path to inline images in the html.
sub listInlineImg($) {
my $htmlfile= $_[0];

my ($name, $dir, $suffix) = fileparse($htmlfile, ('\.html') );
my @aa = getInlineImg($htmlfile);

my @result=();
foreach my $ele (@aa) { push @result, linkFullPath($dir,$ele);}
return @result;
}

##
sub checkLink {
    if (
        -T $File::Find::name
        && $File::Find::name =~ m@\.html$@
    ) {
        $total= -s $File::Find::name;
        @h2 = listInlineImg($File::Find::name);
        for my $ln (@h2) {$total += -s $ln;};
        if ( $total > $sizeLimit) {print "problem: file: $File::Find::name, size: $total\n";}

        push (@result, [$total, $File::Find::name]);
    };
}

find(\&checkLink, $inpath);

@result = sort { $b->[0] <=> $a->[0]} @result;

print Dumper(\@result);
print "done reporting. (any file above size are printed above.)";

__END__


Re: check html file size

2005-10-04 Thread Matt Garrish

Xah Lee [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
> would anyone like to translate the following perl script to Python or
> Scheme (scsh)?

Even if you weren't an incredibly offensive and petulant poster, what makes 
you think anyone would write a script from you?

Matt 




Re: check html file size

2005-10-04 Thread Grant Edwards
On 2005-10-05, Xah Lee [EMAIL PROTECTED] wrote:

> would anyone like to translate the following perl script to
> Python or Scheme (scsh)?

Sure. It'll cost you $110/hour with a 2-hour minimum.  Where do
I send the invoice?

-- 
Grant Edwards   grante Yow!  I'll take ROAST BEEF
  at   if you're out of LAMB!!
   visi.com


Re: check html file size

2005-10-04 Thread Erik Max Francis
Matt Garrish wrote:

> Even if you weren't an incredibly offensive and petulant poster, what makes
> you think anyone would write a script from you?

Because in addition to being offensive and petulant, he's also an idiot.

-- 
Erik Max Francis  [EMAIL PROTECTED]  http://www.alcyone.com/max/
San Jose, CA, USA  37 20 N 121 53 W  AIM erikmaxfrancis
   There is no fate that cannot be surmounted by scorn.
   -- Albert Camus


Re: check html file size

2005-10-04 Thread Tad McClellan
Xah Lee [EMAIL PROTECTED] wrote:

> would anyone like to translate the following perl script to Python or
> Scheme (scsh)?


Yes, I would.


-- 
Tad McClellan  SGML consulting
[EMAIL PROTECTED]   Perl programming
Fort Worth, Texas