> The main problem I had with translating the spec onto the Zaurus were that a
> number of "sizes" were in pixels. No problem for images but it was a bit of
> pain for things like indents when I zoomed in/out. The other main places
> that I recall that pixel sizes are used is in paragraph spacing (which I
> don't really use on the Zaurus yet) and in horizontal rule size.

Yes, this has bothered me too. But I don't think it's going to change
soon. One way to proceed is to think of one pixel as 1/50 of an inch
(guessing as to the DPI of the original Palm).

> For example, on
> the Z I use bold, italic etc flags to indicate the style of the font and an
> offset (which can be negative) into a size hierarchy to specify the size of
> the font (eg, one size bigger than base font). This would be too Z centric
> but something even more general would be better.

If you look at the GTK viewer code, you'll see something similar. H6 is
bold but the same size as regular text, H5 is 1.2 times regular, H4 is
1.4 times, etc. up to H1, which is twice the point size of regular.

> A current problem I have is figuring out how to get the URLs of the external
> links out. There is a slight mismatch in terminology in the version of the
> document I have where the descriptive text refers to URLs but the spec for
> the paragraph types refers to links when referring to the record types - or
> maybe these are different and that is why I haven't yet managed to get it
> working 8^).

It's tricky -- it took me a few weeks to figure it out. I'd be happy to
re-write the PluckerDB document with better wording if we can figure it
out.

Here's how it works: As URLs are encountered in the text, and as extra
records are generated by the distiller (when you have to break a page
into multiple parts), they are assigned ascending numbers, the record-ID
numbers. These numbers are assigned WHETHER OR NOT an actual record with
that ID is present in the document. Some record-ID numbers are phantom;
that is, they are assigned, but that record (not the page) is later
found to be non-existent or unnecessary. Phantom records have no URL.
Some internal records which are actually in the document, but are
derived or metadata, also have no URL.

When it's time to write the URL data record, we treat each non-existent
URL as a zero-length string, and each real URL, whether or not the page
it's for was included in the document, as a string. We concatenate all
the strings, using NUL characters between them. Thus there may be a run
of several NUL characters at the beginning of the first URL data record,
or anywhere in the record where a phantom record-ID occurred. When
processing the URL data record, you basically count NUL characters to
identify the URL (which may be zero-length) for a particular record-ID.

Because we only put a small number of URLs in each URL data record
(1-200), we have a second type of record, called the 'URL handling data
record', which is just an index into the set of URL data records. So to
find the URL for a particular record-ID, you first look at the URL
handling data record, and figure out from that which URL data record the
URL is in. You then look at that data record, scanning NUL chars to find
the ends of URLs, till you come to the right one -- which may be
zero-length.

Hope this clears it up.

> the unicode characters (ironic since unicode handling is
> already builtin - but the all the other document decoders are pure 8-bit so
> the unicode stuff comes too late in the chain).

It was fairly easy to implement in the GTK viewer, since we have to
process function codes anyway. Did you use the libunpluck library? It's
very vanilla plain C, but it does implement the owner-id decoding.

> won't I lose the "Indent first line of every
> paragraph" effect from the distiller - or that accomplished in a different
> way? (eg, If the distiller set the indent before the first character of the
> paragraph and then immediately set it back after the first character it
> would work on my reader - but I want to match what the distiller currently
> does).

The distiller only indents paragraph beginnings when the configuration
parameter "indent_paragraphs" is True.
Otherwise it puts extra spacing between them. Or you could look for the
indentation string (which is "\x0a\x0a\x0a\x0a\x0a\x0a" at the beginning
of a paragraph) and do the indent in the viewer.

What we need are stylesheets.

Bill
_______________________________________________
plucker-dev mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-dev
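
For what it's worth, here is a rough Python sketch of the URL lookup
described above. It is only an illustration, not the distiller's or
libunpluck's actual code. Several things in it are assumptions: the
URL handling data record is modelled as an already-parsed list of
(last_id, block) pairs rather than its real on-disk layout, each block
is assumed to be already decompressed, the first string is assumed to
belong to record-ID 1, and URLs are decoded as Latin-1. The names
url_in_block and lookup_url are made up for the example.

```python
from bisect import bisect_left


def url_in_block(block: bytes, first_id: int, record_id: int) -> str:
    """Return the URL for record_id from one URL data record.

    block holds one string per record-ID, starting at first_id, with NUL
    characters between them. Phantom IDs show up as zero-length strings.
    """
    strings = block.split(b"\x00")
    offset = record_id - first_id
    if offset < 0 or offset >= len(strings):
        return ""
    # Latin-1 is an assumption; it never fails on arbitrary bytes.
    return strings[offset].decode("latin-1")


def lookup_url(index, record_id: int, first_id: int = 1) -> str:
    """Find the URL for record_id, or "" if it has none.

    index is a hypothetical, already-parsed form of the URL handling data
    record: a list of (last_id, block) pairs sorted by last_id, where
    last_id is the highest record-ID covered by that URL data record.
    """
    last_ids = [last for last, _ in index]
    pos = bisect_left(last_ids, record_id)
    if pos == len(index):
        return ""  # record_id is beyond anything the index covers
    # A block starts right after the previous block's last record-ID.
    start = first_id if pos == 0 else index[pos - 1][0] + 1
    return url_in_block(index[pos][1], start, record_id)


# Record-IDs 1 and 2 are phantom, 3 and 4 have URLs, 5 has one, 6 does not.
example_index = [
    (4, b"\x00\x00http://a.example/\x00http://b.example/\x00"),
    (6, b"http://c.example/\x00"),
]
```

With the example data, lookup_url(example_index, 3) picks the first
block and skips two empty strings before reaching the URL, which is the
"count the NULs" step. Splitting the whole block is the lazy way to do
it; a viewer short on memory would scan for NULs instead.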
