I've been working on a PDF eBook reader activity for the Internet Archive. 

The goal is to make a lightweight, efficient PDF reader that can search for 
and download books from the Internet Archive. 

I'm basing it on the xbook activity, but adding support for browsing the 
Internet Archive, and changing the way it renders. 
The xbook activity renders PDF with evince, which runs in a separate process 
and supports other formats besides PDF. Instead, I want to integrate the 
poppler PDF library into Python, and use it to write a lightweight PDF book 
reader that draws with Cairo directly. 

(See Irvin Probst's goal "2/ write a minimal set of python bindings for 
libpoppler")
http://www.mail-archive.com/[email protected]/msg00710.html

I made a Python module named "poppler" (in a project named pypoppler) that can 
read PDF files and render them by calling Cairo directly (instead of making an 
intermediate bitmap with its own renderer or running a separate process). It 
can take a pycairo context as a parameter. So now I can render PDF files into 
Cairo contexts from Python, at any scale or rotation or clipping region! 
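
To make "any scale or rotation" concrete, here is a minimal sketch of the 
arithmetic for fitting a page into a viewport. It is not taken from pypoppler: 
the function names fit_page_matrix and apply_matrix are made up for 
illustration, and it only handles rotation in quarter turns. The six numbers 
it returns are in the same order that cairo.Matrix takes them.

```python
def fit_page_matrix(page_w, page_h, view_w, view_h, quarter_turns=0):
    """Return (xx, yx, xy, yy, x0, y0) mapping page points to viewport pixels.

    The page is rotated by quarter_turns * 90 degrees, scaled uniformly so it
    fits inside the viewport, and centered. Hypothetical helper, not part of
    the poppler module described above.
    """
    # Exact cosine/sine values for 0, 90, 180 and 270 degrees.
    cos_t, sin_t = [(1, 0), (0, 1), (-1, 0), (0, -1)][quarter_turns % 4]

    # After an odd number of quarter turns the page's width and height swap.
    if quarter_turns % 2:
        rot_w, rot_h = page_h, page_w
    else:
        rot_w, rot_h = page_w, page_h

    # Largest uniform scale that keeps the whole rotated page visible.
    scale = min(view_w / rot_w, view_h / rot_h)

    # Linear part of the affine transform (rotation followed by scale).
    xx, yx = scale * cos_t, scale * sin_t
    xy, yy = -scale * sin_t, scale * cos_t

    # Translation that sends the page center to the viewport center.
    cx, cy = page_w / 2.0, page_h / 2.0
    x0 = view_w / 2.0 - (xx * cx + xy * cy)
    y0 = view_h / 2.0 - (yx * cx + yy * cy)
    return (xx, yx, xy, yy, x0, y0)


def apply_matrix(matrix, x, y):
    """Transform one point, using the same convention as Cairo."""
    xx, yx, xy, yy, x0, y0 = matrix
    return (xx * x + xy * y + x0, yx * x + yy * y + y0)


# US Letter page (612 x 792 points) shown sideways in a 1200 x 900 viewport.
matrix = fit_page_matrix(612.0, 792.0, 1200.0, 900.0, quarter_turns=1)
corners = [apply_matrix(matrix, x, y)
           for x in (0.0, 612.0) for y in (0.0, 792.0)]
```

With pycairo, the tuple could be handed to the context with something like 
ctx.transform(cairo.Matrix(*matrix)) before asking the page to render itself.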

I used SWIG to define a "poppler" Python module with wrappers around C++ 
classes for Document and Page. They wrap the corresponding poppler glib objects.

I put in the most important methods that will be useful to us now (count pages, 
get a page, measure the page, render the page through Cairo), but haven't 
fleshed it out with support for all the other things you can do with PDF files 
(annotations, fonts, index, form fields, etc).

I found some m4 macros for configuring Python extensions with SWIG, which 
helped make the configuration process easier.
The header file popplerwrappers.h is a SWIG-friendly, valid C++ header defining 
Document and Page classes with inline function definitions, and it gets 
processed by SWIG as well as included in the generated wrapper.

The "poppler.i" SWIG file defines some headers and initialization code, defines 
some typemaps, and loads some typemaps from "pycairo.i".
It defines some initialization code that imports the pycairo interface (so we 
can pass pycairo.Context in and get the cairo_t to draw with), and initializes 
the glib object system.
It defines some typemaps for out parameters (so functions like size and 
getCropBox can return multiple numbers).
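
From the Python side, the effect of those out-parameter typemaps is that a C++ 
method which fills in several doubles through pointers looks like a method 
returning a tuple. The sketch below uses a pure-Python stand-in class, not the 
real binding: StandInPage is hypothetical, and the order of the returned 
numbers (width, height and x1, y1, x2, y2) is an assumption.

```python
class StandInPage:
    """Pure-Python stand-in for the wrapped Page class (illustration only)."""

    def __init__(self, width, height, crop_box=None):
        self._width = width
        self._height = height
        # Assumption: the crop box defaults to the full page rectangle.
        self._crop_box = crop_box or (0.0, 0.0, width, height)

    def size(self):
        # On the C++ side this would fill in two double* out parameters;
        # an out-parameter typemap packs them into one Python tuple.
        return (self._width, self._height)

    def getCropBox(self):
        # Four out parameters on the C++ side, one 4-tuple in Python.
        return self._crop_box


page = StandInPage(612.0, 792.0)
width, height = page.size()
x1, y1, x2, y2 = page.getCropBox()
```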

The "pycairo.i" SWIG library defines typemaps that convert between C cairo_t 
pointers and Python pycairo.Context objects. It is independent of poppler, so 
other projects could use it too.

Typemaps let you tell SWIG how to pass and convert parameters in and out of 
functions (like the Cairo context).

I've found some cool Python libraries for generating charts and reports in PDF 
(PyChart and ReportLab). Is anyone considering including stuff like that in the 
standard distribution, and are there any favorites? SimCity could use PyChart 
and ReportLab to display its history graphs and statistics. Here are some ideas 
I wrote about the eBook reader and related projects that people could work on:

http://wiki.laptop.org/go/Summer_of_Code/2006

ebook reader

    Mentor: Don Hopkins 

Work with a crossmark/html book reader, or produce tools for converting to/from 
this format, to give children annotatable access to the world's digitized books.

Don Hopkins is developing a PDF based eBook reader for the Internet Archive, 
using the "poppler" library to draw with Cairo. It will have a simple book 
reading user interface to search, page, zoom, pan, rotate, arrange pages in 
various configurations, follow links, navigate the index, etc. It should be 
fully usable in "book mode" with the game controller. It will be able to browse 
and search the Internet Archive eBook library, and download eBooks to read. It 
can use the Internet Archive RSS feeds and web services to get lists and 
descriptions of books, and search the archive, and download XML meta-data and 
PDF documents.
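
As a rough sketch of the search and download step, the snippet below only 
builds URLs and makes no network requests. The endpoints (advancedsearch.php 
and /download/) are the present-day archive.org ones and are an assumption 
here, since the text above does not name the web services it has in mind. The 
helper names, and the guess that a book's PDF is named after its identifier, 
are also hypothetical.

```python
from urllib.parse import urlencode, quote

# Assumption: present-day archive.org endpoints, not named in the post.
SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"
DOWNLOAD_BASE = "https://archive.org/download"


def build_search_url(title_terms, rows=10):
    """Build a search URL for text items whose title matches title_terms."""
    query = "mediatype:texts AND title:(%s)" % title_terms
    params = [
        ("q", query),
        ("fl[]", "identifier"),   # fields to return for each hit
        ("fl[]", "title"),
        ("rows", str(rows)),
        ("output", "json"),
    ]
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def build_pdf_url(identifier, filename=None):
    """Build a download URL for one file of an item.

    Assumption: when no filename is given, guess "<identifier>.pdf". Real
    items may name their files differently, so a reader would normally look
    the filename up in the item's metadata first.
    """
    if filename is None:
        filename = identifier + ".pdf"
    return "%s/%s/%s" % (DOWNLOAD_BASE, quote(identifier), quote(filename))


search_url = build_search_url("alice")
pdf_url = build_pdf_url("some-book")
```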

Other interesting eBook related projects:

Optimizing the eBook activity and libraries for low power and memory 
consumption. Optimizing Cairo library image rendering. Reusing the "poppler" 
PDF rendering module for other purposes. Integrating useful PDF generation 
modules (e.g. PyGraph, ReportLab). Writing some useful components and 
applications using PDF generation and rendering modules. Extending Poppler's 
API to support editing PDF documents. Developing a simple PDF editor component 
(for annotating eBooks and editing graphics).

Collaborative shared eBook reading activity: synchronize the document, page and 
a cursor over the network, so kids can take turns reading an eBook out loud 
together, with special support for plays and scripts. Each child chooses one or 
more characters to read, and the eBook parses the text to know who speaks each 
line, and prompts each child to read their lines by zooming and highlighting 
the text to read.
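
The "who speaks each line" step could be sketched roughly as follows. This 
assumes a very simple script layout, where a speech starts with the 
character's name in capitals followed by a colon or period, and any following 
lines continue that speech. Real eBook text will be messier, and the function 
names here are hypothetical.

```python
import re

# Assumption: a speech starts with an upper-case name, then ":" or ".".
SPEAKER_LINE = re.compile(r"^([A-Z][A-Z' ]*[A-Z])[.:]\s+(.+)$")


def parse_script(text):
    """Split script text into a list of (speaker, speech) tuples."""
    speeches = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        match = SPEAKER_LINE.match(line)
        if match:
            # A new speech begins with this line.
            speeches.append([match.group(1), match.group(2)])
        elif speeches:
            # Continuation line: append it to the current speech.
            speeches[-1][1] += " " + line
    return [tuple(speech) for speech in speeches]


def lines_for_reader(speeches, characters):
    """Return (index, speech) pairs for the characters one child is reading."""
    return [(index, speech)
            for index, (speaker, speech) in enumerate(speeches)
            if speaker in characters]


SAMPLE = """\
WOLF: Little pig, little pig, let me come in.
PIG: Not by the hair
of my chinny chin chin.
WOLF: Then I'll huff and I'll puff.
"""

speeches = parse_script(SAMPLE)
wolf_lines = lines_for_reader(speeches, {"WOLF"})
```

The index in each pair is the position of the speech in the script, which is 
what a shared reader would need in order to prompt the right child at the 
right time.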



_______________________________________________
Evince-list mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/evince-list
