Updated SlashPluck

2002-12-03 Thread Jason Day
I noticed last night that SlashPluck was not including any comments with
the articles.  I've updated it to work with Slashdot's slightly changed
format.  The latest version is at
  http://jasonday.home.att.net/code/slashpluck/
Or, you can apply the attached patch.

Jason
-- 
Jason Day                                       jasonday at
http://jasonday.home.att.net                    worldnet dot att dot net
 
Of course I'm paranoid, everyone is trying to kill me.
-- Weyoun-6, Star Trek: Deep Space 9

Index: slashpluck.pl
===================================================================
RCS file: /usr/local/cvsroot/slashpluck/slashpluck.pl,v
retrieving revision 1.12
diff -u -r1.12 slashpluck.pl
--- slashpluck.pl   25 Aug 2002 20:59:15 -0000  1.12
+++ slashpluck.pl   3 Dec 2002 16:01:01 -0000
@@ -57,7 +57,7 @@
 # End of configuration section
 #
 
-$VERSION = 0.21;
+$VERSION = 0.21.1;
 
 # The directory where the html files are stored.
 $directory = "$pluckerdir/slashpluck";
@@ -262,7 +262,7 @@
 
 while (<$handle>) {
     if ($skip) {
-        if (/<a name=[^>]*><h4>/i) {
+        if (/<a name=[^>]*><b>/i) {
             $skip = 0;
         }
         elsif (/<\/form>/i) {
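
For anyone who wants to see what kind of loop the second hunk touches, here is a minimal sketch of the skip-until-marker idea. It is not code from slashpluck.pl: the helper name extract_section, the sample lines, and the exact marker patterns are assumptions made for illustration, and real Slashdot markup will differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not taken from slashpluck.pl: drop every line of a
# page until a start marker matches, then keep lines until an end marker.
sub extract_section {
    my ($lines, $start_re, $end_re) = @_;
    my $skip = 1;    # begin in "skipping" mode
    my @kept;
    for my $line (@$lines) {
        if ($skip) {
            # Leave skipping mode at the first line matching the start marker.
            next unless $line =~ $start_re;
            $skip = 0;
        }
        elsif ($line =~ $end_re) {
            last;    # the end marker closes the section
        }
        push @kept, $line;
    }
    return @kept;
}

# Made-up sample input, for illustration only.
my @page = (
    '<table><tr><td>navigation</td></tr></table>',
    '<a name="1"><b>First comment</b></a>',
    '<p>Comment body</p>',
    '</form>',
    '<p>footer</p>',
);
my @comments = extract_section(\@page, qr/<a name=[^>]*><b>/i, qr/<\/form>/i);
print "$_\n" for @comments;
```

If the site changes its markup again, only the start pattern passed to the helper should need adjusting, which is roughly what the patch above does.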



Re: [plucker-list] New slashpluck available : Trouble on MSW

2002-08-24 Thread Jason Day

On Sat, Aug 24, 2002 at 05:13:00PM -0400, Edward Rayl wrote:
[...]
> I've tried various settings in my slashpluck.pl.  I tried the defaults in the
> README and they did not work.  Currently it has:
>
>   $pluckerdir = \Program Files\Plucker\Default.DB;
>
>   # The directory where the html files are stored.
>   $directory = "$pluckerdir/slashpluck";
>
>   # The same directory as it appears to the browser.
>   $web_directory = "plucker:/Default.DB/slashpluck";

This looks OK.  Personally, I put the slashpluck directory in the C:\Program
Files\Plucker directory, not in the Default.DB directory, but I see no
reason why this shouldn't work.
 
> What happens is that the html files are written to the C:\Program
> Files\Plucker\Default.DB\slashpluck directory and I can even open the
> headlines.html file with a browser and properly link to the digit.html files
> from it.  So all is well with the slashpluck generation.  The headlines.html
> file has lines like this:
>
>   <p><a href="c:/Program
>   Files/Plucker/Default.DB/slashpluck/1.html">Going Back To The Past of
>   the Internet</a>

That's strange.  It should instead read:

   <p><a href="plucker:/Default.DB/slashpluck/1.html">Going Back To The
   Past of the Internet</a>

Check your Linux drive.  The links on the headline page should be a
plucker: link, not a file: link.  I wonder why it's behaving differently
on Windows?

[...]
> The file(s) are sync'ed to my palm and I have an entry called Slashdot.  On it
> appear ALL the headlines, but when I click on any one of them I get:
>
>   Sorry the link you selected was not downloaded...
>   URL: c:/Program Files/Plucker/Default.DB/slashpluck/1.html

Actually, this is what I would expect.  For some reason, the URLs on the
headline page are not being generated properly.  It's no surprise that the
viewer can't find them.

 
> I've tried setting the maxdepth to 3, the defaults in the slashpluck README, and
> a number of other settings.  Can anyone notice what I'm doing wrong here?  I'm
> suspecting a problem with the space in Program Files, though my browsers have
> no problem with this.  Also all other documents sync fine in this environment.

I don't think it's a problem with the space, or with the parser; the
generated html is bad before it ever gets to the parser.  If you want to
confirm this, manually edit the headlines.html file and change all the
article links to look like the plucker: link above, then run
plucker-build again.
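
If editing every link by hand is tedious, something like the following could do the substitution. This is only a sketch under assumptions: rewrite_links is a made-up helper name, the href attributes are assumed to carry the local path as a plain prefix (with or without a double quote), and the two prefixes are the ones quoted in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: replace a local path prefix inside href attributes
# with the prefix the Plucker viewer expects.
sub rewrite_links {
    my ($html, $local_prefix, $web_prefix) = @_;
    # \Q...\E quotes regex metacharacters (".", spaces) in the path.
    # The optional double quote covers both quoted and unquoted attributes.
    $html =~ s/(href="?)\Q$local_prefix\E/$1$web_prefix/gi;
    return $html;
}

my $line = '<p><a href="c:/Program Files/Plucker/Default.DB/slashpluck/1.html">'
         . 'Going Back To The Past of the Internet</a>';

print rewrite_links(
    $line,
    'c:/Program Files/Plucker/Default.DB/slashpluck',
    'plucker:/Default.DB/slashpluck',
), "\n";
```

To apply it to the real file, read headlines.html into a string, pass it through the helper, and write the result back before running plucker-build.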

Can you send me directly the slashpluck.pl file you are using on Windows?

 
> It was so easy on Linux,

:-D

Jason
___
plucker-list mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-list



Re: [plucker-list] slashpluck works again (was slashpluck.pl broken (by change to slashdot.org?))

2002-08-22 Thread Alastair Scott

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Wed 21 August 2002 11:18 pm, Jason Day wrote:

> Yes, VA seems to be having some configuration issues.  It looks like some
> of the servers answering to slashdot.org will accept HTTP/1.0 connections,
> and some will refuse HTTP/1.0 and only accept HTTP/1.1.  I just tried it,
> and got 3 out of 10 articles.
>
> The easiest thing for me to do is to use libwww-perl for the URL
> connections.  Of course, that will require everybody using slashpluck
> locally to have libwww-perl installed too.  Does that seem reasonable?

Whatever works :)

Alastair
- -- 
Alastair Scott (London, United Kingdom)
http://www.unmetered.org.uk/
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE9ZJL4dasIDb/2nMwRAj17AJ0ZxecvLefhSuOlw4h6h33rlaqegACfY0Z2
PVFg6Ut64P0lEe8y7Y1bPKI=
=rS+r
-END PGP SIGNATURE-




[plucker-list] New slashpluck available

2002-08-22 Thread Jason Day

I've updated slashpluck to use HTTP/1.1 connections.  Several tests have now
fetched 10 out of 10 articles.  It now requires libwww-perl; more details
are in the README.
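
For reference, fetching a page through libwww-perl looks roughly like this. It is a minimal sketch, not the code in slashpluck.pl: the helper name fetch_page, the timeout, and the agent string are assumptions. Recent libwww-perl releases speak HTTP/1.1 by default, which is the point of the change.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: fetch one page with libwww-perl and return its body,
# or nothing on failure.
sub fetch_page {
    my ($url) = @_;
    # Loaded at call time, so a missing libwww-perl is reported here
    # rather than when the script is compiled.
    require LWP::UserAgent;
    my $ua = LWP::UserAgent->new(timeout => 30);
    $ua->agent('slashpluck-example/0.1');    # made-up agent string
    my $response = $ua->get($url);
    unless ($response->is_success) {
        warn "fetch of $url failed: ", $response->status_line, "\n";
        return;
    }
    return $response->decoded_content;
}

# Usage (needs network access):
#   my $html = fetch_page('http://slashdot.org/');
#   print length($html), " bytes\n" if defined $html;
```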

http://jasonday.home.att.net/code/slashpluck/slashpluck.html

Any bugs, problems, feature requests, please let me know.

Jason



Re: [plucker-list] slashpluck works again (was slashpluck.pl broken (by change to slashdot.org?))

2002-08-21 Thread Alastair Scott

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Wed 21 August 2002 3:16 am, Jason Day wrote:

> Looks like it was just a server configuration issue with slashdot.  For a
> while, they weren't accepting HTTP 1.0 connections, only HTTP 1.1.  But it
> seems to be fixed now, and slashpluck seems to be working fine again.

It appears to be broken again; 2 hours ago I got 3 out of 10 pages complete 
and, just now, 0 out of 10 pages complete :(

Alastair
- -- 
Alastair Scott (London, United Kingdom)
http://www.unmetered.org.uk/
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE9Y+nKdasIDb/2nMwRAjBuAKCBtdcUZxgAE6NCVChXQ5+BAPhyZgCeN2FA
FhF5YNhxdlg54YHFt7oyu08=
=LxpQ
-END PGP SIGNATURE-




Re: [plucker-list] slashpluck works again (was slashpluck.pl broken (by change to slashdot.org?))

2002-08-21 Thread Jason Day

On Wed, Aug 21, 2002 at 08:27:56PM +0100, Alastair Scott wrote:
> It appears to be broken again; 2 hours ago I got 3 out of 10 pages complete
> and, just now, 0 out of 10 pages complete :(

Yes, VA seems to be having some configuration issues.  It looks like some of
the servers answering to slashdot.org will accept HTTP/1.0 connections, and
some will refuse HTTP/1.0 and only accept HTTP/1.1.  I just tried it, and
got 3 out of 10 articles.
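
To make the difference concrete, here is a small sketch of what the two request styles look like on the wire. The helper build_request is made up for illustration and is not how slashpluck talks to the server; the one firm fact it relies on is that HTTP/1.1 requires a Host header while HTTP/1.0 does not.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build the text of a minimal GET request for the
# given protocol version ('1.0' or '1.1').
sub build_request {
    my ($host, $path, $version) = @_;
    my $request = "GET $path HTTP/$version\r\n";
    if ($version eq '1.1') {
        # HTTP/1.1 requires a Host header.
        $request .= "Host: $host\r\n";
        # Ask the server to close the connection after the reply.
        $request .= "Connection: close\r\n";
    }
    return $request . "\r\n";    # a blank line ends the header section
}

print build_request('slashdot.org', '/', '1.0');
print build_request('slashdot.org', '/', '1.1');
```

A client that sends its own 1.1 requests also has to cope with 1.1 replies (chunked transfer encoding, for instance), which is a good reason to leave the job to a library.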

The easiest thing for me to do is to use libwww-perl for the URL
connections.  Of course, that will require everybody using slashpluck
locally to have libwww-perl installed too.  Does that seem reasonable?

Jason



Re: [plucker-list] slashpluck broken

2002-08-20 Thread Wesley Mason

Yes, I am having the same problem with my slashpluck as well.  (I run my own)

--Wes

Alastair Scott said:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> Overnight the excellent slashpluck script
>
> http://jasonday.home.att.net/code/slashpluck/slashpluck.html
>
> seems to have become broken; although the slashdot title page is OK the
> individual stories (~/.plucker/slashpluck/1.html to 10.html) are coming
> out nearly blank (length 5 bytes, with the only text being a single
> horizontal rule '<hr>'). As the site is up I presume the slashdot.org
> page format has changed slightly ...
>
> Can anyone else confirm?
>
> Alastair
> - --
> Alastair Scott (London, United Kingdom)
> http://www.unmetered.org.uk/
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.0.7 (GNU/Linux)
>
> iD8DBQE9Yim5dasIDb/2nMwRAr3eAJ4ppF2NUenpewyCyAXJPMCVrNuejwCdF63t
> anmglFJRyVAgk+L+EsAbi3g=
> =Cg+5
> -END PGP SIGNATURE-




[plucker-list] slashpluck works again (was slashpluck.pl broken (by change to slashdot.org?))

2002-08-20 Thread Jason Day

Looks like it was just a server configuration issue with slashdot.  For a
while, they weren't accepting HTTP 1.0 connections, only HTTP 1.1.  But it
seems to be fixed now, and slashpluck seems to be working fine again.



SlashPluck

2002-04-27 Thread Jason Day

Hi,

I've written a perl script that fetches Slashdot headlines, articles, and
comments and formats them for plucker.  It's available on my site here:
  http://jasonday.home.att.net/code/slashpluck/
  
Feedback is welcome.

Regards,
Jason Day



Re: SlashPluck

2002-04-27 Thread David A. Desrosiers

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


> I've written a perl script that fetches Slashdot headlines, articles, and
> comments and formats them for plucker.

Awesome work, Jason!!

I've just incorporated it into the "Hourly Plucks" section on the
Plucker server. You can grab the converted SlashPluck.pdb file here:

http://www.plkr.org/samples/plucks/SlashPluck.pdb



d.


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE8yw/FkRQERnB1rkoRArFtAJ9qgJnJw4fu/Pwyd3FK9FbY7WAjlACglUt9
P7iqLIYjh/LhH1CFhorsCCM=
=azft
-END PGP SIGNATURE-