Re: [Dspace-tech] filter-media problem - question on size limit

2008-10-27 Thread Thornton, Susan M. (LARC-B702)[NCI INFORMATION SYSTEMS]
Sue Walker-Thornton NASA Langley Research Center (757) 224-4074 -----Original Message----- From: Graham Triggs [mailto:[EMAIL PROTECTED] Sent: Friday, October 24, 2008 3:13 PM To: dspace-tech@lists.sourceforge.net Subject: Re: [Dspace-tech] filter-media problem - question on size limit

Re: [Dspace-tech] filter-media problem - question on size limit

2008-10-27 Thread stuart yeates
Sent: Friday, October 24, 2008 3:13 PM To: dspace-tech@lists.sourceforge.net Subject: Re: [Dspace-tech] filter-media problem - question on size limit If anyone has example PDFs that cause the text extraction to fail (smaller PDFs preferably!) that they are able to share, please

Re: [Dspace-tech] filter-media problem - question on size limit

2008-10-24 Thread Mark H. Wood
I found this: http://java-source.net/open-source/pdf-libraries
PJX and PDF Jester look, at first glance, as though they might be worth considering. OTOH, it looks like PDFBox might be getting more attention in its new home, and if so, then it makes sense to stick with it and help to improve

Re: [Dspace-tech] filter-media problem - question on size limit

2008-10-23 Thread Tim Donohue
Susan, Although you never mentioned it specifically, it sounds like you are talking about filtering of *PDF* documents. Before I get to some of the reasons for your current problems, it's worth mentioning a bit of background (you may have already heard this before, but I want to be sure).

Re: [Dspace-tech] filter-media problem - question on size limit

2008-10-23 Thread Graham Triggs
Tim Donohue wrote: (2) If you look closely at the PDFBox site, you'll notice that software has *not* had an updated release since Oct 2006. And if you keep looking at that site, you aren't likely to see much activity in the future. You should try looking here: