Updated SlashPluck
I noticed last night that SlashPluck was not including any comments with the articles. I've updated it to work with Slashdot's slightly changed format. The latest version is at

http://jasonday.home.att.net/code/slashpluck/

Or, you can apply the attached patch.

Jason

--
Jason Day    jasonday at worldnet dot att dot net

Of course I'm paranoid, everyone is trying to kill me.
    -- Weyoun-6, Star Trek: Deep Space 9

Index: slashpluck.pl
===
RCS file: /usr/local/cvsroot/slashpluck/slashpluck.pl,v
retrieving revision 1.12
diff -u -r1.12 slashpluck.pl
--- slashpluck.pl 25 Aug 2002 20:59:15 - 1.12
+++ slashpluck.pl 3 Dec 2002 16:01:01 -
@@ -57,7 +57,7 @@
 # End of configuration section
 #
-$VERSION = 0.21;
+$VERSION = 0.21.1;
 # The directory where the html files are stored.
 $directory = $pluckerdir/slashpluck;
@@ -262,7 +262,7 @@
 while ($handle) {
 if ($skip) {
-if (/a name=[^]*h4/i) {
+if (/a name=[^]*b/i) {
 $skip = 0;
 }
 elsif (/\/form/i) {
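For readers unfamiliar with the loop the patch touches, here is a minimal sketch of the skip-until-marker idea. It is not the code from slashpluck.pl: the marker pattern, the helper name `lines_from_first_comment`, and the sample page are all hypothetical, and the real pattern depends on whatever markup Slashdot happens to emit.

```perl
use strict;
use warnings;

# Hypothetical marker: assume the first comment starts with a named anchor.
# The real pattern in slashpluck.pl differs and tracks Slashdot's markup.
my $COMMENT_START = qr/<a name="\d+">/i;

# Return only the lines from the first comment marker onward.
sub lines_from_first_comment {
    my @lines = @_;
    my $skip  = 1;
    my @kept;
    for my $line (@lines) {
        if ($skip) {
            next unless $line =~ $COMMENT_START;
            $skip = 0;    # marker found: stop skipping and keep this line
        }
        push @kept, $line;
    }
    return @kept;
}

# Made-up sample page, for illustration only.
my @page = (
    '<form action="/search">',
    '</form>',
    '<a name="42"><b>First comment</b></a>',
    '<p>Comment body</p>',
);
print "$_\n" for lines_from_first_comment(@page);
```

With this sample the two lines before the anchor are dropped and the last two are printed. If the site's markup changes so that the marker never matches, the loop skips every line, which would be consistent with the comments going missing.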
Re: [plucker-list] New slashpluck available : Trouble on MSW
On Sat, Aug 24, 2002 at 05:13:00PM -0400, Edward Rayl wrote:

[...]

> I've tried various settings in my slashpluck.pl. I tried the defaults in the README and they did not work. Currently it has:
>
> $pluckerdir = \Program Files\Plucker\Default.DB;
> # The directory where the html files are stored.
> $directory = $pluckerdir/slashpluck;
> # The same directory as it appears to the browser.
> $web_directory = plucker:/Default.DB/slashpluck;

This looks OK. Personally, I put the slashpluck directory in the C:\Program Files\Plucker directory, not in the Default.DB directory, but I see no reason why this shouldn't work.

> What happens is that the html files are written to the C:\Program Files\Plucker\Default.DB\slashpluck directory and I can even open the headlines.html file with a browser and properly link to the <digit>.html files from it. So all is well with the slashpluck generation. The headlines.html file has lines like this:
>
> <p><a href="c:/Program Files/Plucker/Default.DB/slashpluck/1.html">Going Back To The Past of the Internet</a>

That's strange. It should instead read:

<p><a href="plucker:/Default.DB/slashpluck/1.html">Going Back To The Past of the Internet</a>

Check your linux drive. The links on the headline page should be a plucker: link, not a file: link. I wonder why it's behaving differently on Windows?

[...]

> The file(s) are sync'ed to my palm and I have an entry called Slashdot. On it appear ALL the headlines, but when I click on any one of them I get:
>
> Sorry the link you selected was not downloaded...
> URL: c:/Program Files/Plucker/Default.DB/slashpluck/1.html

Actually, this is what I would expect. For some reason, the URLs on the headline page are not being generated properly. It's no surprise that the viewer can't find them.

> I've tried setting the maxdepth to 3, the defaults in the slashpluck README, and any number of other settings. Can anyone notice what I'm doing wrong here? I'm suspecting a problem with the space in Program Files, though my browsers have no problem with this. Also all other documents sync fine in this environment.

I don't think it's a problem with the space, or with the parser; the generated html is bad before it ever gets to the parser. If you want to confirm this, manually edit the headlines.html file and change all the article links to look like the plucker: link above, then run plucker-build again.

Can you send me directly the slashpluck.pl file you are using on Windows?

> It was so easy on Linux, :-D

Jason

___
plucker-list mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-list
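The point about plucker: links versus local paths can be illustrated with a small sketch. This is not the link-generation code from slashpluck.pl: `headline_link` and `non_plucker_links` are hypothetical helpers, and only the `$web_directory` value and the sample headline come from the message above. The idea is that files are written under `$directory`, while the href in headlines.html is built from `$web_directory`.

```perl
use strict;
use warnings;

# Value taken from the configuration quoted in the message above.
my $web_directory = 'plucker:/Default.DB/slashpluck';

# Build one headline entry from a base URL, article number, and title.
# Hypothetical helper; a real script should also HTML-escape the title.
sub headline_link {
    my ($base, $number, $title) = @_;
    return sprintf '<p><a href="%s/%d.html">%s</a>', $base, $number, $title;
}

# Return every href in a chunk of HTML that is not a plucker: link,
# as a quick check of a generated headlines.html.
sub non_plucker_links {
    my ($html) = @_;
    return grep { !/^plucker:/i } $html =~ /href="([^"]*)"/gi;
}

my $line = headline_link($web_directory, 1,
    'Going Back To The Past of the Internet');
print "$line\n";
print "suspicious link: $_\n" for non_plucker_links($line);
```

Run against a headline line built from the local directory instead, `non_plucker_links` would report the `c:/Program Files/...` href, which is the symptom described in this thread.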
Re: [plucker-list] slashpluck works again (was slashpluck.pl broken (by change to slashdot.org?))
On Wed 21 August 2002 11:18 pm, Jason Day wrote:

> Yes, VA seems to be having some configuration issues. It looks like some of the servers answering to slashdot.org will accept HTTP/1.0 connections, and some will refuse HTTP/1.0 and only accept HTTP/1.1. I just tried it, and got 3 out of 10 articles.
>
> The easiest thing for me to do is to use libwww-perl for the URL connections. Of course, that will require everybody using slashpluck locally to have libwww-perl installed too. Does that seem reasonable?

Whatever works :)

Alastair

--
Alastair Scott (London, United Kingdom)
http://www.unmetered.org.uk/
[plucker-list] New slashpluck available
I've updated slashpluck to use HTTP/1.1 connections. Several test runs have now fetched 10 out of 10 articles. It now also requires libwww-perl; more details are in the README.

http://jasonday.home.att.net/code/slashpluck/slashpluck.html

If you find any bugs or problems, or have feature requests, please let me know.

Jason
Re: [plucker-list] slashpluck works again (was slashpluck.pl broken (by change to slashdot.org?))
On Wed 21 August 2002 3:16 am, Jason Day wrote:

> Looks like it was just a server configuration issue with slashdot. For a while, they weren't accepting HTTP 1.0 connections, only HTTP 1.1. But it seems to be fixed now, and slashpluck seems to be working fine again.

It appears to be broken again; 2 hours ago I got 3 out of 10 pages complete and, just now, 0 out of 10 pages complete :(

Alastair
Re: [plucker-list] slashpluck works again (was slashpluck.pl broken (by change to slashdot.org?))
On Wed, Aug 21, 2002 at 08:27:56PM +0100, Alastair Scott wrote:

> It appears to be broken again; 2 hours ago I got 3 out of 10 pages complete and, just now, 0 out of 10 pages complete :(

Yes, VA seems to be having some configuration issues. It looks like some of the servers answering to slashdot.org will accept HTTP/1.0 connections, and some will refuse HTTP/1.0 and only accept HTTP/1.1. I just tried it, and got 3 out of 10 articles.

The easiest thing for me to do is to use libwww-perl for the URL connections. Of course, that will require everybody using slashpluck locally to have libwww-perl installed too. Does that seem reasonable?

Jason
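As a rough idea of what fetching through libwww-perl looks like, here is a minimal sketch using `LWP::UserAgent`. It is not the code that went into slashpluck: the helper name `fetch_pages`, the timeout value, and taking the URLs from the command line are all assumptions made for illustration. The library handles the HTTP connection details, so the script no longer has to build requests by hand.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Try each URL and return a hash ref mapping URL to page body for
# the pages that came back successfully. Hypothetical helper.
sub fetch_pages {
    my ($ua, @urls) = @_;
    my %body;
    for my $url (@urls) {
        my $response = $ua->get($url);
        if ($response->is_success) {
            $body{$url} = $response->decoded_content;
        }
        else {
            warn "failed: $url (", $response->status_line, ")\n";
        }
    }
    return \%body;
}

# Run only when executed directly, so the sub can be loaded and
# exercised without network access.
unless (caller) {
    my $ua   = LWP::UserAgent->new(timeout => 30);    # timeout is an arbitrary choice
    my @urls = @ARGV;    # article URLs supplied on the command line
    my $got  = fetch_pages($ua, @urls);
    printf "%d out of %d pages complete\n", scalar(keys %$got), scalar(@urls);
}
```

Reporting the count of successful fetches mirrors the "3 out of 10" style of check used in this thread and makes partial failures easy to spot.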
Re: [plucker-list] slashpluck broken
Yes, I am having the same problem with my slashpluck as well. (I run my own)

--Wes

Alastair Scott said:

> Overnight the excellent slashpluck script
>
> http://jasonday.home.att.net/code/slashpluck/slashpluck.html
>
> seems to have become broken; although the slashdot title page is OK the individual stories (~/.plucker/slashpluck/1.html to 10.html) are coming out nearly blank (length 5 bytes, with the only text being a single horizontal rule '<hr>').
>
> As the site is up I presume the slashdot.org page format has changed slightly ...
>
> Can anyone else confirm?
>
> Alastair
[plucker-list] slashpluck works again (was slashpluck.pl broken (by change to slashdot.org?))
Looks like it was just a server configuration issue with slashdot. For a while, they weren't accepting HTTP 1.0 connections, only HTTP 1.1. But it seems to be fixed now, and slashpluck seems to be working fine again.

--
Jason Day
SlashPluck
Hi,

I've written a perl script that fetches Slashdot headlines, articles, and comments and formats them for plucker. It's available on my site here:

http://jasonday.home.att.net/code/slashpluck/

Feedback is welcome.

Regards,
Jason Day
Re: SlashPluck
> I've written a perl script that fetches Slashdot headlines, articles, and comments and formats them for plucker.

Awesome work, Jason!!

I've just incorporated it into the Hourly Plucks section on the Plucker server. You can grab the converted SlashPluck.pdb file here:

http://www.plkr.org/samples/plucks/SlashPluck.pdb

d.