I'm calling a parser directly: this C# code posts the file to the Solr extract handler's URL, which returns JSON:

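// Solr's extract handler, which runs the uploaded file through Tika
var url = "http://localhost:8983/solr/update/extract";

var client = new WebClient();
client.QueryString.Add("extractOnly", "true"); // extract only, do not index the document
client.QueryString.Add("wt", "json");          // ask for a JSON response
var data = client.UploadFile(url, "input.txt");
var json = Encoding.UTF8.GetString(data);      // Solr responds in UTF-8; ASCII decoding would mangle non-ASCII text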
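
If only the text is wanted, one possible approach is to add Solr's extractFormat=text parameter and pull the extracted string out of the JSON response. The following is a minimal sketch, not tested against my setup. It assumes the response is a JSON object in which the extracted text is the one string-valued top-level entry (alongside "responseHeader" and a "..._metadata" entry), and it uses JavaScriptSerializer, which needs a reference to System.Web.Extensions. The class name and the key-filtering logic are my own illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text;
using System.Web.Script.Serialization; // requires a reference to System.Web.Extensions

class ExtractRawText
{
    static void Main()
    {
        var url = "http://localhost:8983/solr/update/extract";

        using (var client = new WebClient())
        {
            client.QueryString.Add("extractOnly", "true");
            client.QueryString.Add("extractFormat", "text"); // plain text instead of the default XML
            client.QueryString.Add("wt", "json");

            byte[] data = client.UploadFile(url, "input.txt");
            string json = Encoding.UTF8.GetString(data);

            var response = new JavaScriptSerializer()
                .Deserialize<Dictionary<string, object>>(json);

            // Assumption: the extracted text is the only top-level string value;
            // "responseHeader" is an object and the "*_metadata" entry is a list.
            string text = response
                .Where(kv => kv.Key != "responseHeader" && !kv.Key.EndsWith("_metadata"))
                .Select(kv => kv.Value as string)
                .FirstOrDefault(v => v != null);

            Console.WriteLine(text ?? "(no extracted text found in the response)");
        }
    }
}
```

Filtering by value type rather than by a fixed key name avoids guessing which name Solr gives the uploaded stream. For very large documents the serializer's MaxJsonLength may need to be raised.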




Sincerely,
Alex 


-----Original Message-----
From: Nick Burch [mailto:[email protected]] 
Sent: 16 August 2012 6:36 PM
To: [email protected]
Subject: Re: Return raw text from document

On Thu, 16 Aug 2012, Alexander Cougarman wrote:
> Is it possible to return just the raw text of the document extracted 
> by Tika? In other words, we don't want it in XML or JSON, just the 
> text in it.

Yes. Are you using the TikaApp jar, calling the Tika facade class, or calling a 
parser directly?

Nick
