I've only been using Plucker for a couple of weeks now (and 1.2 for just a week or so), and I 
have some questions. Hopefully they are as easy to answer as the one about blue 
anchors. (PS: I did RTFM, but couldn't find M :-) ).

1) Has the documentation been updated for version 1.2 (or for anything after 1.1.13, 
which is the latest I have)?

2) I'm using Windows and I've tried the different options for image_parser in 
plucker.ini. Most of them don't work; only pil2 and windowspil do.
But:

2a) windowspil doesn't make a link from small pictures (the maxwidth x maxheight ones) 
to the larger versions (the alt-maxwidth x alt-maxheight ones). I've tried to figure 
out the source code (I'm new to Python, so that was fun) and I found that some image 
parsers are derived from ImageParser, which seems to be the base class, but others aren't:
class ImageParser:
class ImageMagickImageParser:
class NetPBMImageParser:
class NewNetPBMImageParser(ImageParser):
class PythonImagingLibraryParser:
class NewPythonImagingLibraryParser(ImageParser):
class WindowsImageParser:
class WindowsPILImageParser:
Note that pil2 (NewPythonImagingLibraryParser) is derived from ImageParser.
I also found that at the bottom of ImageParser::get_plucker_doc() there is a piece of 
code that seems to gather the bigger picture by calling _related_images(). I also saw 
that these images are added by a call to PluckerImageDocument::add_related_image().
Then I looked at where add_related_image() is used and found two places: its 
definition and its use at the bottom of ImageParser::get_plucker_doc(). Now you can 
probably see how this relates to where I started: windowspil (WindowsPILImageParser) 
isn't derived from ImageParser AND it doesn't generate those related images.
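
To illustrate what I think is going on, here is a minimal sketch. It is NOT the 
real Plucker code: only the names ImageParser, get_plucker_doc and _related_images 
come from the source; DerivedParser, StandaloneParser, the "#alt" URL and the plain 
dict standing in for the document are made up for the example, and it is written for 
a current Python rather than the one bundled with Plucker.

```python
class ImageParser:
    """Simplified stand-in for the base class (assumption, not Plucker's code)."""

    def __init__(self, url):
        self._url = url

    def _related_images(self):
        # The base class knows of no alternative versions by default.
        return []

    def get_plucker_doc(self):
        # A plain dict stands in for the image document here.
        doc = {"url": self._url, "related": []}
        # The step that only parsers derived from the base class inherit.
        for related in self._related_images():
            doc["related"].append(related)
        return doc


class DerivedParser(ImageParser):
    """Hypothetical parser that subclasses the base, as pil2 appears to."""

    def _related_images(self):
        # Pretend a larger alternative version of the image exists.
        return [self._url + "#alt"]


class StandaloneParser:
    """Hypothetical parser that does not subclass the base, as windowspil appears not to."""

    def __init__(self, url):
        self._url = url

    def get_plucker_doc(self):
        # Own implementation: the related-images step is never reached.
        return {"url": self._url, "related": []}


print(DerivedParser("x.jpg").get_plucker_doc())
print(StandaloneParser("x.jpg").get_plucker_doc())
```

If this picture is right, a parser only gets the link to the larger version when it 
goes through the base class's get_plucker_doc(), or duplicates that step itself.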

2b) pil2 does call add_related_image(), but it has another problem: it crashes when 
I use bpp=16:

C:\>"c:\program files\plucker\python\python" "c:\program files\plucker\pyplucker
\spider.py" "--pluckerhome=c:\program files\plucker" --home-url=file:c:/x.html -
-doc-file=c:\x --bpp=16
Pluckerdir is 'c:\program files\plucker'...
---- 0 collected, 1 to do ----
Processing file:C:/x.html...
  Retrieved ok.
  Parsed ok; 1 image.
---- 1 collected, 1 to do ----
Processing file:C:\x.jpg...
  Retrieved ok.
Error:  Unknown error parsing document file:C:\x.jpg:
Traceback (innermost last):
  File "C:\Program Files\plucker\PyPlucker\Parser.py", line 45, in generic_parse
r
    return parsed.get_plucker_doc ()
  File "C:\Program Files\plucker\PyPlucker\ImageParser.py", line 218, in get_plu
cker_doc
    raise ImageSize ("Image data too large (%d bytes) for a Plucker image record
 "
ImageSize: Image data too large (209024 bytes) for a Plucker image record (max 6
1440 bytes) when plucked at 500x209x16!  Scale it down.
  Parsed ok.
---- all 3 pages retrieved and parsed ----

Writing out collected data...
Writing document 'x' to file c:\x.pdb
Traceback (innermost last):
  File "c:\program files\plucker\pyplucker\spider.py", line 1512, in ?
    sys.exit(realmain())
  File "c:\program files\plucker\pyplucker\spider.py", line 1505, in realmain
    retval = main (config, exclusion_lists)
  File "c:\program files\plucker\pyplucker\spider.py", line 1041, in main
    mapping = writer.write (verbose=verbosity, alias_list=alias_list)
  File "C:\Program Files\plucker\PyPlucker\Writer.py", line 518, in write
    result = Writer.write (self, verbose, alias_list=alias_list)
  File "C:\Program Files\plucker\PyPlucker\Writer.py", line 310, in write
    self._mapper = Mapper(self._collection, alias_list.as_dict())
  File "C:\Program Files\plucker\PyPlucker\Writer.py", line 102, in __init__
    self._get_id_for_doc(doc)
  File "C:\Program Files\plucker\PyPlucker\Writer.py", line 112, in _get_id_for_
doc
    id = self._url_to_id_mapping.get(doc.get_url())
AttributeError: 'None' object has no attribute 'get_url'
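
As a sanity check on the first error, the size can be roughly reproduced by hand: 
500 x 209 pixels at 16 bits each is 209000 bytes, far above the 61440 byte limit 
quoted in the message. The sketch below does that arithmetic and estimates a size 
that would fit. It assumes uncompressed pixel data with no header or row padding 
(the log reports 209024 bytes, so the real encoder presumably adds some overhead I 
have not accounted for), and the function names are my own, not Plucker's.

```python
import math

MAX_RECORD_BYTES = 61440  # limit quoted in the ImageSize error message


def raw_image_bytes(width, height, bpp):
    """Bytes of uncompressed pixel data (assumes no header or row padding)."""
    return (width * height * bpp + 7) // 8


def scale_to_fit(width, height, bpp, limit=MAX_RECORD_BYTES):
    """Return (width, height) scaled down uniformly until the data fits the limit."""
    size = raw_image_bytes(width, height, bpp)
    if size <= limit:
        return width, height
    # The byte count grows with the area, so scale each side by the square root.
    factor = math.sqrt(limit / size)
    new_w = max(1, int(width * factor))
    new_h = max(1, int(height * factor))
    # Guard against rounding leaving the result slightly over the limit.
    while raw_image_bytes(new_w, new_h, bpp) > limit and new_w > 1:
        new_w -= 1
    return new_w, new_h


print(raw_image_bytes(500, 209, 16))
print(scale_to_fit(500, 209, 16))
```

So at 16 bpp the picture would have to shrink to a bit over half its width and 
height, which suggests maxwidth and maxheight need to be lower for bpp=16.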

I think _get_id_for_doc should return early if doc is None, which would be a bit more graceful.
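
Something along these lines is what I have in mind. This is only a sketch under my 
own assumptions, not a patch against Writer.py: the names Mapper, _get_id_for_doc, 
_url_to_id_mapping and get_url appear in the traceback, but the id allocation, the 
_next_id counter and the FakeDoc class are invented for the example.

```python
class FakeDoc:
    """Hypothetical stand-in for a parsed document."""

    def __init__(self, url):
        self._url = url

    def get_url(self):
        return self._url


class Mapper:
    """Simplified mapper; the id allocation scheme here is an assumption."""

    def __init__(self, collection):
        self._url_to_id_mapping = {}
        self._next_id = 1
        for doc in collection:
            self._get_id_for_doc(doc)

    def _get_id_for_doc(self, doc):
        # Proposed guard: a document that failed to parse is skipped
        # instead of raising AttributeError on doc.get_url().
        if doc is None:
            return None
        url = doc.get_url()
        doc_id = self._url_to_id_mapping.get(url)
        if doc_id is None:
            doc_id = self._next_id
            self._next_id += 1
            self._url_to_id_mapping[url] = doc_id
        return doc_id


# One entry failed to parse and is None; the mapper skips it.
mapper = Mapper([FakeDoc("file:C:/x.html"), None, FakeDoc("file:C:/y.html")])
print(mapper._url_to_id_mapping)
```

Skipping the entry would only hide the symptom, of course; the image that was too 
large would still be missing from the resulting document.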

I hope someone can help me here.

agb
