Found bug in PSDParser.java

2012-10-11 Thread Andrew Stepanov
Hi. When I parse my PSD file, I get an exception: org.apache.tika.exception.TikaException: Invalid Image Resource Block Signature Found, got 3686985 0x384249 but the spec defines 943868237. I compared the specification (

Re: Found bug in PSDParser.java

2012-10-11 Thread Nick Burch
On Thu, 11 Oct 2012, Andrew Stepanov wrote: org.apache.tika.exception.TikaException: Invalid Image Resource Block Signature Found, got 3686985 0x384249 but the spec defines 943868237 I compared specification ( http://www.adobe.com/devnet-apps/photoshop/fileformatashtml/PhotoshopFileFormats.htm)
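For readers following the numbers in the exception, here is a minimal sketch (not taken from PSDParser.java; the class and method names are hypothetical) that renders both values as big-endian ASCII bytes. The expected value 943868237 is 0x3842494D, i.e. "8BIM", while the reported 3686985 is 0x00384249, i.e. a NUL byte followed by "8BI". That pattern would be consistent with the signature being read one byte too early, although the truncated messages above do not confirm the actual cause.

```java
import java.nio.ByteBuffer;

class SignatureDecode {
    /**
     * Renders a 32-bit value as its four big-endian bytes in ASCII,
     * printing a NUL byte as the two characters "\0" so it stays visible.
     */
    public static String asAscii(int value) {
        // ByteBuffer is big-endian by default, matching the PSD byte order.
        byte[] bytes = ByteBuffer.allocate(4).putInt(value).array();
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(b == 0 ? "\\0" : String.valueOf((char) (b & 0xFF)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(asAscii(943868237)); // value the spec defines
        System.out.println(asAscii(3686985));   // value from the exception
    }
}
```

Comparing the two printed strings side by side makes the one-byte shift easy to see without opening the PSD file in a hex editor.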

[jira] [Created] (TIKA-1005) In Microsoft Office Word 2010 documents, text inside a textbox is not extracted/parsed out.

2012-10-11 Thread David A. Patterson (JIRA)
David A. Patterson created TIKA-1005:

Summary: In Microsoft Office Word 2010 documents, text inside a textbox is not extracted/parsed out.
Key: TIKA-1005
URL: https://issues.apache.org/jira/browse/TIKA-1005

[jira] [Commented] (TIKA-1005) In Microsoft Office Word 2010 documents, text inside a textbox is not extracted/parsed out.

2012-10-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474250#comment-13474250 ]

Michael McCandless commented on TIKA-1005:

Could you attach an example showing