Welcome to PDFBox, ASF and the open source community as a whole. The quickest way to get acquainted with the software is to dig into bug reports which sound interesting and start reading the PDF specification, ISO 32000-1:2008 (which you can get for free from Adobe[1]). It's a rather large document, but I can attest that you don't need to read all 756 pages to understand enough to help. I'd suggest starting with sections 7.3 and 7.5, as they define what objects look like and what the overall file looks like. Where you go from there depends on what specific issue you're looking into.
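To give a rough flavor of what section 7.5 covers: a PDF file begins with a "%PDF-" header and ends with a "startxref" keyword, the byte offset of the cross-reference section, and a "%%EOF" marker. Below is a minimal sketch of checking for those pieces using only the Java standard library. It is not PDFBox code; the class and method names (PdfSkeleton, hasPdfHeader, findStartXref) are made up for illustration, and the sample bytes are a toy stand-in rather than a valid PDF.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; this is not part of PDFBox.
class PdfSkeleton {

    /** Returns true if the data begins with the "%PDF-" header marker. */
    public static boolean hasPdfHeader(byte[] data) {
        byte[] marker = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        if (data.length < marker.length) {
            return false;
        }
        for (int i = 0; i < marker.length; i++) {
            if (data[i] != marker[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Returns the number that follows the last "startxref" keyword,
     * or -1 if the keyword or the number is missing.
     */
    public static long findStartXref(byte[] data) {
        // ISO-8859-1 maps each byte to exactly one char, so indexes line up.
        String text = new String(data, StandardCharsets.ISO_8859_1);
        int idx = text.lastIndexOf("startxref");
        if (idx < 0) {
            return -1;
        }
        String rest = text.substring(idx + "startxref".length()).trim();
        int end = 0;
        while (end < rest.length() && Character.isDigit(rest.charAt(end))) {
            end++;
        }
        if (end == 0) {
            return -1;
        }
        return Long.parseLong(rest.substring(0, end));
    }

    public static void main(String[] args) {
        // Toy data: header, one object, a stub where the xref section would
        // start, and the trailing startxref / %%EOF lines.
        String sample = "%PDF-1.4\n"
                + "1 0 obj\n"
                + "<< /Type /Catalog >>\n"
                + "endobj\n"
                + "xref\n"
                + "startxref\n"
                + "45\n"
                + "%%EOF\n";
        byte[] bytes = sample.getBytes(StandardCharsets.US_ASCII);
        System.out.println("header present: " + hasPdfHeader(bytes));
        System.out.println("startxref offset: " + findStartXref(bytes));
    }
}
```

A real parser has to deal with a lot more than this (incremental updates, cross-reference streams, non-conforming files), which is where reading the spec alongside the actual PDFBox parser code pays off.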
The best way to report bugs is through Jira, opening a separate bug report for each individual item (you can link them if fixing one thing depends on another thing being fixed, or if two bugs are related in some other way). If possible, please include a sample PDF and a few lines of code so the issue can be easily demonstrated. Including JUnit tests along with patches is also encouraged, so that we find out if the same bug reappears down the road.

The website is not as good as it could, or perhaps should, be. I remember feeling a bit lost when I first joined up, having had no experience contributing code to an open source project or digging into the internals of PDFs. Luckily the people on the mailing list were very helpful in pointing me to examples (speaking of which, you should check out the examples which, if I recall correctly, are in org.apache.pdfbox.examples, or something similar).

I originally signed up because my employer wanted to use PDFBox, but when encrypting files with indexed bitmaps of PNGs, the decrypted version was messed up (not quite inverted, but the color palette was all wacky). Others had seen this, but nobody had looked into it, so I started digging and eventually found the issue and resolved it. My employer was then able to adopt PDFBox as a solution, ditch iText, and add a lot of cool features. I then went on to make some small enhancements to bookmarks and some small changes to the parser to handle non-conforming documents.

Had I not seen people on the mailing list answering questions and helping people on a daily basis, I probably would have given up and just gone with another library. My point is, while the documentation isn't always great, the community is, and I'm rather thankful for that.
[1] http://www.adobe.com/devnet/pdf/pdf_reference.html

On Wed, Feb 29, 2012 at 3:40 AM, George Kalpakas <[email protected]> wrote:
> Hello everybody,
>
> I am all new to the idea of Open Source Development let alone ASF. It just
> seems a very nice idea to get involved in the development of an ASF project
> (and PDFBox is my chosen one), but I am not so sure what is the best way
> (or in fact any way) to get started.
>
> So, I would much appreciate it, if someone who knows his way around would
> point me to some direction or give me some tips/guidelines on how to
> proceed with getting involved.
> I.e. in order to get more familiar with the project, I tried to look into
> issue PDFBOX-1228. Along the way I stumbled upon some (minor) things I
> believe could be improved:
> Typos or broken links or other errors in the JavaDocs or even a
> permission-bit being set the wrong way (according to my understanding)
> etc. I wonder what is the best way for me to report those things (users/dev
> mailing list, Jira, one e-mail or a separate for each issue etc) ?
>
> One last thing:
> While poking around ASF-projects in order to pick one for me to start
> getting involved, I noticed that other projects had "Get involved"
> sections on their website, with clear instructions and suggestions on the
> various ways one can get involved (for example Derby and Hadoop), which I
> found very useful. PDFBox is lacking such a sections (apart from a little
> remark on the "Mailing Lists" page: "If you like to participate in the
> development of Apache PDFBox, the Developer Mailing List is the place to
> be." - one has to try in order to find it and it is not very informative
> either).
> Considering the fact, I got really close to choosing another project,
> because of me not being able to find some "Get involved"-directions and I
> finally tried to apply the other projects' suggestions to PDFBox, I believe
> it would be nice to incorporate such a section (it doesn't have to be on
> the front-page, just somewhere anyone interested could find fairly easy).
>
> Thanks for all the responses (if any (-;) !
>
> GK
