[jira] Updated: (PDFBOX-655) Default character width should be used if width of a character is not defined

2010-03-09 Thread Atsuo Ishimoto (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Atsuo Ishimoto updated PDFBOX-655: -- Attachment: defaultfontwidth.patch > Default character width should be used if width of a chara

[jira] Created: (PDFBOX-655) Default character width should be used if width of a character is not defined

2010-03-09 Thread Atsuo Ishimoto (JIRA)
Default character width should be used if width of a character is not defined - Key: PDFBOX-655 URL: https://issues.apache.org/jira/browse/PDFBOX-655 Project: PDFBox

[jira] Updated: (PDFBOX-654) Extracting CJK text

2010-03-09 Thread Atsuo Ishimoto (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Atsuo Ishimoto updated PDFBOX-654: -- Attachment: identity-h.patch > Extracting CJK text > --- > > Ke

[jira] Created: (PDFBOX-654) Extracting CJK text

2010-03-09 Thread Atsuo Ishimoto (JIRA)
Extracting CJK text --- Key: PDFBOX-654 URL: https://issues.apache.org/jira/browse/PDFBOX-654 Project: PDFBox Issue Type: Improvement Components: Text extraction Reporter: Atsuo Ishimoto This is an upd

Re: Reopen PDFBOX-483?

2010-03-09 Thread steve poling
I said "Linux" but I was thinking "Unix." We work in a field where nits can bite you in the butt. Martinez, Mel - 1004 - MITLL wrote: Minor correction - OSX is based on Darwin which is based on BSD, not Linux. Your basic point is not about that, so I know this is just a nit. :-)

RE: pdfbox development / documentation

2010-03-09 Thread Martinez, Mel - 1004 - MITLL
My vote: (e) all of them are peers Ultimately, it will be what the community makes it to be. -Original Message- From: Michael Müller [mailto:michael.muel...@mueller-bruehl.de] Sent: Tuesday, March 09, 2010 4:02 PM To: dev@pdfbox.apache.org Subject: pdfbox development / documentation Hi,

Re: pdfbox development / documentation

2010-03-09 Thread Daniel Wilson
Others may see it differently, but from my perspective, PDFBox is a library for evaluating existing PDFs. It includes some command-line tools that make its functionality more accessible, but those are not the core of PDFBox. It has some capabilities for editing existing PDFs, but those, afaik, ar

pdfbox development / documentation

2010-03-09 Thread Michael Müller
Hi, It seems my question / annotation to pdfbox development started something. Yes, and as I stated before, I would like to support this project (even though I only have a small amount of time). As Andreas and others suggested, documentation will be a good starting point. To get a bird's-eye view, please let me sta

Re: Reopen PDFBOX-483?

2010-03-09 Thread Andreas Lehmkuehler
Hi, steve poling wrote: Andreas Lehmkuehler wrote: If you go to PDFBOX-490, you'll find attached file filled.pdf that manifests this error, but I've been seeing this with a lot of different PDFs: display looks good, print looks bad. I can

[jira] Updated: (PDFBOX-651) Team list should be filled out or deleted ... it confuses users now

2010-03-09 Thread JIRA
[ https://issues.apache.org/jira/browse/PDFBOX-651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler updated PDFBOX-651: -- Issue Type: Improvement (was: Bug) > Team list should be filled out or deleted ... it

[jira] Commented: (PDFBOX-651) Team list should be filled out or deleted ... it confuses users now

2010-03-09 Thread JIRA
[ https://issues.apache.org/jira/browse/PDFBOX-651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843265#action_12843265 ] Andreas Lehmkühler commented on PDFBOX-651: --- As a first step I've added a list wi

Re: pdfbox develpment

2010-03-09 Thread Andreas Lehmkuehler
Hi, Maruan Sahyoun wrote: Hi, I started with the documentation of some tools and opened an issue in JIRA for that (PDFBOX-653). Please let me know if that workflow is OK for you or if I should use a different approach. The workflow is quite perfect. BR Andreas Lehmkühler

[jira] Commented: (PDFBOX-7) extract information from tagged PDF

2010-03-09 Thread JIRA
[ https://issues.apache.org/jira/browse/PDFBOX-7?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843253#action_12843253 ] Andreas Lehmkühler commented on PDFBOX-7: - That one works even better and I've applie

RE: Reopen PDFBOX-483?

2010-03-09 Thread Martinez, Mel - 1004 - MITLL
Minor correction - OSX is based on Darwin which is based on BSD, not Linux. Your basic point is not about that, so I know this is just a nit. :-) -Original Message- From: steve poling [mailto:s...@i2k.com] Sent: Monday, March 08, 2010 11:57 PM To: dev@pdfbox.apache.org Subject: Re: Reope

RE: pdfbox develpment

2010-03-09 Thread Martinez, Mel - 1004 - MITLL
Michael, A lot of us 'commit' by simply opening issues in the Jira system and posting patches there. Andreas, Jukka and the other committers do a fine job of harvesting those patches as we get close to releases. You thus do not need to be a direct svn committer to participate in the developme

Re: pdfbox develpment

2010-03-09 Thread Maruan Sahyoun
Hi, I started with the documentation of some tools and opened an issue in JIRA for that (PDFBOX-653). Please let me know if that workflow is OK for you or if I should use a different approach. Kind regards Maruan Sahyoun On 09.03.2010 at 09:37, Andreas Lehmkühler wrote: > Hi, > > Betref

Re: Reopen PDFBOX-483?

2010-03-09 Thread Maruan Sahyoun
Hi, FYI - using PDFReader the PDF is displayed OK, but when printed the same results are produced as with PrintPDF. The printed output contains the variable data only (and some lines); boilerplate text is not printed. Maruan Sahyoun On 09.03.2010 at 13:58, Andreas Lehmkühler wrote: > Hi, >

[jira] Updated: (PDFBOX-653) Document the command line tools Overly, PDFMerge and PDFSplit

2010-03-09 Thread Maruan Sahyoun (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maruan Sahyoun updated PDFBOX-653: -- Attachment: tools.patch Patch to the PDFBox documentation > Document the command line tools Ov

Re: Reopen PDFBOX-483?

2010-03-09 Thread Maruan Sahyoun
Hi, please find enclosed the text extracted from the printed PDF. Extraction was done using Adobe Acrobat 8. X0X0X0 X0X0X05 X0X0X0 X0X0X05 X0X0X0 X0X

Re: Re: Reopen PDFBOX-483?

2010-03-09 Thread Andreas Lehmkühler
Hi, Subject: Re: Reopen PDFBOX-483? Sent: Tue, 09 Mar 2010 From: Maruan Sahyoun > Hi, > > please find enclosed the result of the printing test conducted on > > Windows 2003 Server SP2 32 bit, Java 1.5 using a fresh build from trunk. The > test was done using the Adobe PDF printer driver as

[jira] Created: (PDFBOX-653) Document the command line tools Overly, PDFMerge and PDFSplit

2010-03-09 Thread Maruan Sahyoun (JIRA)
Document the command line tools Overly, PDFMerge and PDFSplit - Key: PDFBOX-653 URL: https://issues.apache.org/jira/browse/PDFBOX-653 Project: PDFBox Issue Type: Improvement

[jira] Updated: (PDFBOX-653) Document the command line tools Overly, PDFMerge and PDFSplit

2010-03-09 Thread Maruan Sahyoun (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maruan Sahyoun updated PDFBOX-653: -- Description: The following patch adds some documentation for the command line tools Overlay, PD

Re: Re: Reopen PDFBOX-483?

2010-03-09 Thread Andreas Lehmkühler
Hi, Subject: Re: Reopen PDFBOX-483? Sent: Tue, 09 Mar 2010 From: Maruan Sahyoun > Hi, > > please find enclosed the text extracted from the printed PDF. Extraction was > done using Adobe Acrobat 8. > > X0X0X0 X0X0X05 > X0X0X

Re: Reopen PDFBOX-483?

2010-03-09 Thread Maruan Sahyoun
Hi Andreas, yes, the results are similar BUT most of the text and some of the lines are missing. Converting to Image output using PDFToImage provides a different and much better result where all text and lines are included and only some misplacement occurs. Is there a way to submit the attachme

Odd characters from text extraction

2010-03-09 Thread Ian Smith
Hi Folks, I have linked to a PDF (~2MB) that produces unprintable characters in the extracted text output. These characters seem to be associated with the first two pages of the document. http://www.yourphp.org.uk/media/pdf/g/4/Annual_Report_0809.pdf I believe the problem is caused by at least

[jira] Created: (PDFBOX-652) ResourceLoader returns NULL on missing Resources

2010-03-09 Thread Erik Scholtz (JIRA)
ResourceLoader returns NULL on missing Resources Key: PDFBOX-652 URL: https://issues.apache.org/jira/browse/PDFBOX-652 Project: PDFBox Issue Type: Bug Components: Utilities Affec

Re: Reopen PDFBOX-483?

2010-03-09 Thread Maruan Sahyoun
Hi, please find enclosed the result of the printing test conducted on Windows 2003 Server SP2 32 bit, Java 1.5 using a fresh build from trunk. The test was done using the Adobe PDF printer driver as well as Apple and HP Postscript printers with similar results. Kind regards Maruan Sahyoun

Re: Re: Reopen PDFBOX-483?

2010-03-09 Thread Andreas Lehmkühler
Hi, Subject: Re: Reopen PDFBOX-483? Sent: Tue, 09 Mar 2010 From: Maruan Sahyoun > Hi Andreas, > > I can do a test on our Windows test server (Windows 2003, 32bit) and let you > know the results around lunch time (German time) if that helps Yeah, that would be great. BR Andreas Lehmkühler >

Re: Re: pdfbox develpment

2010-03-09 Thread Andreas Lehmkühler
Hi, Subject: Re: pdfbox develpment Sent: Tue, 09 Mar 2010 From: Maruan Sahyoun > Hi, > > we were looking to start fixing some of the open issues but can instead > develop some small tutorials for common tasks like text extraction, forms > handling and highlighting. > > WDYT Sounds good to me