RE: Stack Overflow Question

2014-07-01 Thread Allison, Timothy B.
Good to hear. Let us know if you have any other questions or when you run into surprises. From: yeshwanth kumar [mailto:yeshwant...@gmail.com] Sent: Tuesday, July 01, 2014 10:23 AM To: Allison, Timothy B. Subject: Re: Stack Overflow Question hi tim, i forgot to change the BodyContentHandler

RE: Stack Overflow Question

2014-07-01 Thread Allison, Timothy B.
kumar [mailto:yeshwant...@gmail.com] Sent: Tuesday, July 01, 2014 9:00 AM To: Allison, Timothy B. Subject: Re: Stack Overflow Question output is same even with ToXMLHandler On Tue, Jul 1, 2014 at 5:59 PM, Allison, Timothy B. [mailto:talli...@mitre.org] wrote: Did you try the ToXMLHandler?

RE: Stack Overflow Question

2014-07-01 Thread Allison, Timothy B.
Did you try the ToXMLHandler? From: yeshwanth kumar [mailto:yeshwant...@gmail.com] Sent: Monday, June 30, 2014 4:50 PM To: Allison, Timothy B. Subject: Re: Stack Overflow Question hi tim, i tried in all possible ways, instead of reading entire zip file i parsed individual zipentries, but even

RE: Stack Overflow Question

2014-06-30 Thread Allison, Timothy B.
Or use the ToXMLHandler and parse the XML? From: Allison, Timothy B. [mailto:talli...@mitre.org] Sent: Monday, June 30, 2014 3:55 PM To: yeshwanth kumar Cc: user@tika.apache.org Subject: RE: Stack Overflow Question Might want to look into RecursiveMetadata Parser http://wiki.apache.org/tika

RE: Stack Overflow Question

2014-06-30 Thread Allison, Timothy B.
ison, Timothy B. Subject: Re: Stack Overflow Question hi tim, thanks for the quick reply. i changed the content handler to BodyContentHandler and got an exception for the maximum word limit, so i used -1 in the BodyContentHandler constructor. now there is another problem: filenames and content are present in st

RE: Stack Overflow Question

2014-06-30 Thread Allison, Timothy B.
DefaultHandler is effectively a NullHandler; it doesn't store or do anything. Try BodyContentHandler or ToXMLHandler or maybe WriteoutHandler. If you want to write out each embedded file as a binary, try subclassing EmbeddedResourceHandler.
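
The point about DefaultHandler can be sketched with the JDK's own SAX classes, without Tika on the classpath. This is only an illustration under assumptions: CollectingHandler, HandlerDemo and the sample XML are hypothetical names made up for this example, and CollectingHandler is a simplified stand-in for a text-collecting handler, not Tika's BodyContentHandler. It shows that org.xml.sax.helpers.DefaultHandler receives the character events and drops them, while a handler that overrides characters() keeps the text.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

public class HandlerDemo {

    /** Hypothetical stand-in for a text-collecting handler (not a Tika class). */
    static class CollectingHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            // DefaultHandler's version of this method is empty; here the text is kept.
            text.append(ch, start, length);
        }

        String getText() {
            return text.toString();
        }
    }

    /** Feeds the XML string to the given handler as SAX events. */
    static void parse(String xml, DefaultHandler handler) throws Exception {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(bytes), handler);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc><p>hello </p><p>world</p></doc>";

        // Parses without error, but DefaultHandler stores nothing to read back.
        parse(xml, new DefaultHandler());

        // Same events, but this handler accumulates the character data.
        CollectingHandler collecting = new CollectingHandler();
        parse(xml, collecting);
        System.out.println(collecting.getText());
    }
}
```

With a plain DefaultHandler there is simply nothing to retrieve after the parse, which matches the empty output described in the thread; swapping in a handler that stores what it receives is the whole fix.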