> The first thing I did was a Web Robot, that crawls all pages for each
> student and gets the necessary information. This significantly saves time,
> but again requires human interference and time. PDFs that are regularly
> sent automatically by email, for each student, contain all the necessary
> information, that the Web Robot collects.

This thread is drifting off topic. You would do better to move to StackOverflow and give a detailed description of the whole process there. If it is properly tagged, your question can attract more people with broader knowledge of this particular topic than I have.

Just a note: parsing data from PDF is always harder than from a database or plain-text formats (XML/JSON/CSV). If an engine can export data to PDF, it can potentially export the same data to formats better suited for bulk processing.

Jan

