Hi Jookia, if you want us to include your patch (and it is welcome of course), you have to sign a copyright assignment.
Please email the following information to [email protected] with a CC to [email protected], [email protected] and [email protected], and we will send you the assignment form for your past and future changes. Please use your full legal name (in ASCII characters) as the subject line of the message.

----------------------------------------------------------------------
REQUEST: SEND FORM FOR PAST AND FUTURE CHANGES

[What is the name of the program or package you're contributing to?]

[Did you copy any files or text written by someone else in these changes?
Even if that material is free software, we need to know about it.]

[Do you have an employer who might have a basis to claim to own your
changes?  Do you attend a school which might make such a claim?]

[For the copyright registration, what country are you a citizen of?]

[What year were you born?]

[Please write your email address here.]

[Please write your postal address here.]

[Which files have you changed so far, and which new files have you
written so far?]

On Thursday, 7 May 2015 at 15:58:53, Jookia wrote:
> Follow-up Comment #5, bug #20398 (project wget):
>
> I've found myself in need of this feature. I'm trying to download a website
> recursively without pulling in every single ad and its HTML. I'd like to be
> able to find out which URLs were rejected, why, and information about the
> domains (host, port, etc.)
> I've patched my copy of Wget to dump all of this into a CSV file which I
> can then tool through to get my desired results:
>
> % grep "DOMAIN" rejected.csv | head -1
> DOMAIN,http://c0059637.cdn1.cloudfiles.rackspacecloud.com/flowplayer-3.2.6.min.js,SCHEME_HTTP,c0059637.cdn1.cloudfiles.rackspacecloud.com,80,flowplayer-3.2.6.min.js,(null),(null),(null),http://redated/,SCHEME_HTTP,redacted,80,,(null),(null),(null)
> % grep "DOMAIN" rejected.csv | cut -d"," -f4 | sort | uniq
> 0.gravatar.com
> 1.gravatar.com
> c0059637.cdn1.cloudfiles.rackspacecloud.com
> lh3.googleusercontent.com
> lh4.googleusercontent.com
> lh5.googleusercontent.com
> lh6.googleusercontent.com
>
> I've included a patch made in a few hours that does this.
>
> (file #33955)
> _______________________________________________________
>
> Additional Item Attachment:
>
> File name: 0001-rejected-log-Add-option-to-dump-URL-rejections-to-a-.patch
> Size: 14 KB
>
> _______________________________________________________
>
> Reply to this item at:
>
>   <http://savannah.gnu.org/bugs/?20398>
>
> _______________________________________________
> Message sent via/by Savannah
> http://savannah.gnu.org/
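As an aside, the grep/cut pipeline in the quoted message can be done more robustly with a proper CSV parser, since a URL field could in principle contain a comma. This is only an illustrative sketch, assuming the column layout shown in the quoted output above (rejection reason in column 1, the rejected URL's host in column 4); the actual format is whatever the patch emits.

```python
import csv

def rejected_hosts(path, reason="DOMAIN"):
    """Return the sorted, de-duplicated hosts of URLs rejected for `reason`,
    read from a rejection-log CSV as produced by the patched wget
    (assumed layout: reason, url, scheme, host, port, ...)."""
    hosts = set()
    with open(path, newline="") as f:
        for row in csv.reader(f):
            # Keep only rows matching the requested rejection reason,
            # and skip malformed rows that lack a host column.
            if row and row[0] == reason and len(row) > 3:
                hosts.add(row[3])
    return sorted(hosts)
```

Called as `rejected_hosts("rejected.csv")`, this would reproduce the `grep | cut | sort | uniq` result from the quoted example.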
