Thank you, Jukka,
I upgraded to the latest Tika, and now I get the correct type:
Content-Length: 486
Content-Type: text/calendar
resourceName: SCL%200101_1.ics
However, when I try to extract the text, I get nothing:
    InputStream is = new FileInputStream(f);
    Parser parser = new AutoDetectParser();
    ContentHandler handler = new BodyContentHandler();
    boolean nekoWorked = true;
    try {
        ParseContext parseContext = new ParseContext(); // nothing passed yet
        parser.parse(is, handler, metadata, parseContext);
        text = handler.toString();
    } catch (TikaException e) {
Of course, since I already know the type is text/calendar, I can parse
the file myself. Should I?
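
If I do parse it myself, a minimal sketch might look like the one below. It
uses only the JDK, not Tika. The names IcsText, unfold and extractText are
made up for illustration, and the choice of SUMMARY, DESCRIPTION and LOCATION
as the "text" of a calendar entry is my own assumption. It unfolds continuation
lines (a leading space or tab continues the previous line, as in the iCalendar
format) but does not decode backslash escapes or handle colons inside quoted
parameter values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Tika: pulls readable text out of an .ics string.
public class IcsText {

    // Join folded lines: a line starting with a space or tab continues the previous one.
    static List<String> unfold(String ics) {
        List<String> lines = new ArrayList<>();
        for (String raw : ics.split("\r?\n")) {
            if (raw.isEmpty()) {
                continue;
            }
            boolean continuation = raw.charAt(0) == ' ' || raw.charAt(0) == '\t';
            if (continuation && !lines.isEmpty()) {
                int last = lines.size() - 1;
                lines.set(last, lines.get(last) + raw.substring(1));
            } else {
                lines.add(raw);
            }
        }
        return lines;
    }

    // Collect the values of a few human-readable properties, one per line.
    // Simplification: the first ':' is treated as the name/value separator.
    static String extractText(String ics) {
        StringBuilder text = new StringBuilder();
        for (String line : unfold(ics)) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String name = line.substring(0, colon);
            int semi = name.indexOf(';'); // drop parameters such as ;LANGUAGE=en
            if (semi >= 0) {
                name = name.substring(0, semi);
            }
            name = name.toUpperCase(Locale.ROOT);
            if (name.equals("SUMMARY") || name.equals("DESCRIPTION") || name.equals("LOCATION")) {
                if (text.length() > 0) {
                    text.append('\n');
                }
                text.append(line.substring(colon + 1));
            }
        }
        return text.toString();
    }
}
```

Calling IcsText.extractText(...) on the file contents read as a string would
then give me the text to index, at the cost of maintaining this code myself.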
Thank you,
Mark
On Thu, Aug 5, 2010 at 2:54 AM, Jukka Zitting <[email protected]> wrote:
> Hi,
>
> On Thu, Aug 5, 2010 at 1:00 AM, Mark Kerzner <[email protected]>
> wrote:
> > I am trying to parse ICS (MS Contact) files with Tika, and I get no text.
> > Should it work at all, or should I just parse text, because that is what
> > essentially the ICS files are?
>
> All text-based formats should fall back to plain text parsing unless a
> more specific parser is available. What's the content type you get
> from:
>
> java -jar tika-app-0.7.jar --metadata document.ics
>
> BR,
>
> Jukka Zitting
>