Re: how to input .txt or .html to the server in Solrj

Darius Miliauskas Thu, 26 Sep 2013 08:15:23 -0700

Thanks Shawn,

So, as far as I understand, "fieldname" is the title of my .txt file and
"fieldValue" is the entire text parsed into one very long string, is that
right?


I guess I just need to parse the content of the .txt file into a string,
either a) with Apache Tika, which is one of the choices recommended
online, or b) by reading the file line by line. Which one is better? I
also wonder how that would help me later to analyze the text word by word
or to get the text back in later searches.
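
For option b), a minimal sketch using only the standard Java library
might look like the following. The class name TxtFolderReader and the
method readTxtFiles are hypothetical names made up for illustration, and
the sketch assumes the files are UTF-8 encoded and small enough to hold
in memory. It reads each file whole rather than line by line, which gives
the same string with less code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; it is not part of SolrJ.
class TxtFolderReader {

    // Returns a map from file name to full file content for every
    // .txt file directly inside the given folder (no recursion).
    // Assumption: the files are UTF-8 encoded.
    static Map<String, String> readTxtFiles(Path folder) throws IOException {
        Map<String, String> contents = new TreeMap<String, String>();
        // The "*.txt" glob filters on the file name, so .html files are skipped.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(folder, "*.txt")) {
            for (Path file : stream) {
                byte[] bytes = Files.readAllBytes(file);
                String text = new String(bytes, StandardCharsets.UTF_8);
                contents.put(file.getFileName().toString(), text);
            }
        }
        return contents;
    }
}
```

Each map value could then be passed as the second argument of
doc.addField(...), one SolrInputDocument per file. HTML articles would
still need their markup stripped first, which is where a parser such as
Apache Tika comes in.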

The field in the schema looks like this: <field name="name" type="text"
indexed="true" stored="true"/>. So is the text indexed and stored by
default with "doc.addField("fieldname", "fieldValue");"?


Thanks,

Darius


2013/9/26 Shawn Heisey <[email protected]>

> On 9/26/2013 5:33 AM, Darius Miliauskas wrote:
> > Dear All,
> >
> > I am trying to use Solr (actually Solrj) to make a simple app which will
> > give me the recommended texts according to the similarity to the
> > history of reading other texts. Firstly, I need to input these texts
> > to the server.
> > Let's say I have 1000 .txt files in one folder or 1000 html articles
> > online. How can I input these texts into server with Java? What words
> > should I use instead of the question marks in .addField(? ?)? It would be
> > awesome if somebody would give me any samples of code in Java.
>
> Here's a very simplistic example of how to add documents with SolrJ.
> The "server" variable is a SolrServer object that you must define.
> Usually it will be either HttpSolrServer or CloudSolrServer.
>
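> import java.util.ArrayList;
> import java.util.Collection;
> import org.apache.solr.common.SolrInputDocument;
>
> // Build one document, collect it in a batch, and send the batch.
> Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
> SolrInputDocument doc = new SolrInputDocument();
> doc.addField("fieldname", "fieldValue");
> docs.add(doc);
> server.add(docs);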
>
> Thanks,
> Shawn
>
>
