There's also, of course, tika-server.
No matter the method, it is always best to isolate Tika in its own JVM, VM or machine.
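
For anyone who has not used tika-server before, here is a minimal sketch of calling it over HTTP from a separate process. It assumes a tika-server instance on its default port 9998 and uses the `PUT /tika` endpoint with `Accept: text/plain`. The function names are made up for illustration, and Python is used only to keep the example short.

```python
import urllib.error
import urllib.request

# Assumption: tika-server is running locally on its default port.
TIKA_URL = "http://localhost:9998/tika"


def build_tika_request(data: bytes, url: str = TIKA_URL) -> urllib.request.Request:
    """Build a PUT request asking tika-server for plain-text output."""
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Accept": "text/plain"},
    )


def extract_text(data: bytes, url: str = TIKA_URL, timeout: float = 30.0):
    """Return the extracted text, or None if the request fails."""
    try:
        request = build_tika_request(data, url)
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError):
        # An HTTP error status, a refused connection, or a timeout lands here.
        # The calling process keeps running and can log or skip the document.
        return None
```

Because the parsing happens in the server's JVM, a document that blows up the parser costs you one failed request rather than the process doing the indexing.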
-----Original Message-----
From: Charlie Hull [mailto:char...@flax.co.uk]
Sent: Monday, April 9, 2018 4:15 PM
To: solr-user@lucene.apache.org
Subject: Re: How to use Tika
I actually used Solr 5.x, the More Like This feature, and a subset of
human-tagged data (about 10%) to apply subject coding with around a 95%
accuracy rate to over 2 million documents, so it is definitely doable.
On Tue, Apr 10, 2018 at 10:40 AM, Alexandre Rafalovitch
wrote:
I know it was a joke, but I've been thinking of something like that.
Not a chatbot per se, but perhaps something that uses Machine
Learning/topic clustering on the past discussions and matches them to
the new questions. Still would need to be rechecked by a human for
final response, but could be
Oh this is great! Saves me a whole bunch of manual work.
Thanks!
-----Original Message-----
From: Charlie Hull [mailto:char...@flax.co.uk]
Sent: Monday, April 09, 2018 2:15 PM
To: solr-user@lucene.apache.org
Subject: [EXT] Re: How to use Tika (Solr Cell) to extract content from HTML
document
As a bonus here's a Dropwizard Tika wrapper that gives you a Tika web
service https://github.com/mattflax/dropwizard-tika-server written by a
colleague of mine at Flax. Hope this is useful.
Cheers
Charlie
On 9 April 2018 at 19:26, Hanjan, Harinder
wrote:
> Thank
Thank you Charlie, Tim.
I will integrate Tika in my Java app and use SolrJ to send data to Solr.
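
As a rough illustration of the "send data to Solr" half, the sketch below posts already-extracted text to Solr's JSON update handler over plain HTTP. It is not SolrJ, just a short Python stand-in for the same step. The collection name `docs`, the field names `id` and `content`, and the function names are all hypothetical, so adjust them to your own schema.

```python
import json
import urllib.request

# Assumptions: Solr on its default port 8983 and a hypothetical collection "docs".
SOLR_UPDATE_URL = "http://localhost:8983/solr/docs/update?commit=true"


def build_solr_update(doc_id: str, text: str,
                      url: str = SOLR_UPDATE_URL) -> urllib.request.Request:
    """Build a POST request carrying one document as a JSON array."""
    # "id" and "content" are hypothetical field names for this sketch.
    body = json.dumps([{"id": doc_id, "content": text}]).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def index_document(doc_id: str, text: str,
                   url: str = SOLR_UPDATE_URL, timeout: float = 30.0) -> int:
    """Send the document to Solr and return the HTTP status code."""
    request = build_solr_update(doc_id, text, url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status
```

Committing on every request is only sensible for a demo. For bulk loads you would batch documents and commit less often.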
-----Original Message-----
From: Allison, Timothy B. [mailto:talli...@mitre.org]
Sent: Monday, April 09, 2018 11:24 AM
To: solr-user@lucene.apache.org
Subject: [EXT] RE: How to use Tika (Solr Cell)
+1
https://lucidworks.com/2012/02/14/indexing-with-solrj/
We should add a chatbot to the list that posts Charlie's advice and the link
to Erick's blog post whenever Tika comes up.
-----Original Message-----
From: Charlie Hull [mailto:char...@flax.co.uk]
Sent: Monday, April 9, 2018 12:44
I'd recommend you run Tika externally to Solr, which will allow you to
catch this kind of problem and prevent it bringing down your Solr
installation.
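
The isolation idea can be sketched in a few lines: do the risky work in a child process with a timeout, and treat a crash or a hang as one failed document rather than a failed service. In the sketch below a child Python interpreter stands in for the external Tika process. `run_isolated` is a hypothetical helper, not part of Tika or Solr.

```python
import subprocess
import sys


def run_isolated(code: str, timeout: float = 10.0):
    """Run Python source in a child interpreter.

    Returns the child's stdout, or None if it crashed or timed out.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # The runaway child has been killed; the parent carries on.
        return None
    if result.returncode != 0:
        # The child died (uncaught exception, out of memory, ...).
        return None
    return result.stdout


if __name__ == "__main__":
    print(run_isolated("print('extracted text')"))
    print(run_isolated("raise MemoryError('simulated bad document')"))
```

The same shape applies whether the external piece is a child process, a separate JVM, or a web service: the caller only ever sees a result or a clean failure.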
Cheers
Charlie
On 9 April 2018 at 16:59, Hanjan, Harinder
wrote:
> Hello!
>
> Solr (i.e. Tika) throws a "zip bomb"