> PS I hope I'm not in violation of the new mailing list standards, i.e. I
> hope this doesn't come through as html mail.
Nope, just text.
> but now I'll wait for your update.
Here's where I'm at now. For 1.2 I'll put together a library (jar file)
with a class called org.plkr.distiller.InvokePluckerBuildFromJava,
which implements the following Java interface:
    package org.plkr.distiller;

    public interface InvokePluckerBuildFromJavaIfc {
        int invoke (java.io.OutputStream optional_output_channel,
                    java.lang.String[] arguments);
    }
That should give us all the basic functionality we need. I'd hope
that someone else would write an additional Java wrapper that would
present a more Java-ish API, and then just call the "invoke" method
properly.
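
To give an idea of what such a wrapper might look like, here is a minimal
sketch. Everything except the interface is hypothetical: the names
PluckerBuild, arg, run, runCaptured and Result are mine, the interface is
redeclared locally (without its package) only so the example compiles on its
own, and I'm assuming the int returned by "invoke" is a status code and that
the output channel carries UTF-8 text.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Local copy of the interface from this message, so the sketch is self-contained.
interface InvokePluckerBuildFromJavaIfc {
    int invoke(OutputStream optional_output_channel, String[] arguments);
}

// Hypothetical wrapper: collects arguments one at a time, then delegates to invoke().
class PluckerBuild {
    private final InvokePluckerBuildFromJavaIfc backend;
    private final List<String> args = new ArrayList<>();

    PluckerBuild(InvokePluckerBuildFromJavaIfc backend) {
        this.backend = backend;
    }

    // Append one raw plucker-build argument, exactly as it would be typed on the command line.
    PluckerBuild arg(String argument) {
        args.add(argument);
        return this;
    }

    // Run the build, sending whatever the backend writes to the given stream.
    int run(OutputStream out) {
        return backend.invoke(out, args.toArray(new String[0]));
    }

    // Run the build and capture its output in memory (assumed to be UTF-8 text).
    Result runCaptured() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        int status = run(buffer);
        return new Result(status, new String(buffer.toByteArray(), StandardCharsets.UTF_8));
    }

    // Simple value holder for the returned int and the captured output.
    static final class Result {
        final int status;
        final String output;

        Result(int status, String output) {
            this.status = status;
            this.output = output;
        }
    }
}
```

A caller would construct PluckerBuild around an instance of
org.plkr.distiller.InvokePluckerBuildFromJava and chain arg(...) calls
before run(...).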
The library can also be used as an app with the command
    java -jar <jarfile> <plucker-build-arguments>
If you want to pass in the home document instead of working from a
file, you can already do this by using the "data:" URL scheme for the
home URL -- I find that the Python urllib already supports this. See
RFC 2397 for details on how it works.
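
For the Java side, building such a URL from an in-memory document could look
roughly like this. It is only a sketch: the class and method names (DataUrls,
htmlDataUrl) are made up, and I'm assuming an HTML home document encoded as
UTF-8 and base64, which is one of the forms RFC 2397 allows.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for turning an in-memory home document into a "data:" URL.
class DataUrls {
    // RFC 2397 form: data:[<mediatype>][;base64],<data>
    static String htmlDataUrl(String html) {
        byte[] bytes = html.getBytes(StandardCharsets.UTF_8);
        String payload = Base64.getEncoder().encodeToString(bytes);
        return "data:text/html;charset=utf-8;base64," + payload;
    }
}
```

The resulting string would then be passed wherever the home URL normally goes.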
> The only thing I would suggest adding is a callback function that updates
> the caller with info on where the spider is at, i.e. total pages plucked,
> number of pages in queue. Maybe something like
>   public interface SpiderListener {
>       void update(int numberPagesPlucked, int numberPagesInQueue);
>       // maybe even something like?
>       void startPage(String currentURL);
>   }
>
> Then add this to your interface
> void addSpiderCallback(SpiderListener listener)
This is a great idea; I thought of it as well. However, I believe
that this can already be supported by using the existing
"--status-file" argument and some clever programming (by a more
Java-aware type). If not, we'll fix it!
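
As a rough illustration of the status-file approach, a Java caller could poll
the file and hand any new contents to a listener. This sketch deliberately
does not parse anything, since the layout of the status file isn't described
here; the class name StatusFileWatcher and the use of a plain
Consumer<String> as the listener are my own assumptions, as is reading the
file as UTF-8.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Consumer;

// Hypothetical poller: reports the raw status-file text whenever it changes.
class StatusFileWatcher {
    private final Path statusFile;
    private final Consumer<String> listener;
    private String lastSeen = "";

    StatusFileWatcher(Path statusFile, Consumer<String> listener) {
        this.statusFile = statusFile;
        this.listener = listener;
    }

    // Read the file once; notify the listener only if the text differs from last time.
    // Returns true when the listener was called.
    boolean pollOnce() {
        if (!Files.exists(statusFile)) {
            return false; // the build may not have written anything yet
        }
        String current;
        try {
            current = new String(Files.readAllBytes(statusFile), StandardCharsets.UTF_8).trim();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (current.equals(lastSeen)) {
            return false;
        }
        lastSeen = current;
        listener.accept(current);
        return true;
    }
}
```

A caller would run pollOnce() on a timer or a background thread while the
build is in progress, and turn the raw text into whatever update(...) style
callbacks it wants.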
Bill
_______________________________________________
plucker-dev mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-dev