Solr or SolrJ Atomic Update

2019-03-15 Thread THIERRY BOUCHENY
Hello, I have spent a few hours trying to understand why I get this error: RunUpdateProcessor has received an AddUpdateCommand containing a document that appears to still contain Atomic document update operations, most likely because DistributedUpdateProcessorFactory was explicitly disabled
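
For context on the error quoted above: an atomic update is a document whose field values are small maps from an operation name (such as set, add, or inc) to a value, and the message says such a document reached RunUpdateProcessor with those operations still in it. The following is a minimal sketch of what such a payload looks like, not the poster's actual request. It assumes the unique key field is called id; the field names title and copies and the document ID book-1 are hypothetical.

```python
import json


def atomic_update(doc_id, set_fields=None, add_fields=None, inc_fields=None):
    """Build one atomic-update document as a plain dict.

    Each updated field maps to {operation: value}. If the same field name
    appears under two operations, the later one wins in this sketch.
    """
    doc = {"id": doc_id}  # assumption: the schema's unique key is "id"
    for op, fields in (("set", set_fields), ("add", add_fields), ("inc", inc_fields)):
        for name, value in (fields or {}).items():
            doc[name] = {op: value}
    return doc


# Hypothetical document and field names, for illustration only.
doc = atomic_update("book-1", set_fields={"title": "New title"}, inc_fields={"copies": 1})

# The JSON body that would be sent to the collection's update handler.
payload = json.dumps([doc])
```

The sketch only builds the request body. Whether the operations are applied depends on the update processor chain on the server, which is what the error message points at.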

Re: Help with a DIH config file

2019-03-15 Thread wclarke
One last question. I have everything running as it should finally. However, when I pull out of testing to do the entire directory it's just cycling through. The directory is full of folders that have the documents in them. Do I need an html or other file sitting in there randomly to get it to

Re: Help with a DIH config file

2019-03-15 Thread wclarke
Thanks! that fixed it.

Re: How to split a merged index which is more than 2GB in size into same size multiple shard

2019-03-15 Thread Erick Erickson
There’s the Collections API command SPLITSHARD. But is this for functional reasons, or are you just experimenting for background information? 2G is a tiny index by recent standards; I routinely see 200G indexes on a replica. And merging indexes in SolrCloud is a bit tricky: you have to be sure
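
As a rough illustration of the SPLITSHARD command mentioned above, the sketch below builds the Collections API request URL. The base URL, collection name, and shard name are hypothetical placeholders. Note that SPLITSHARD divides one existing shard into sub-shards; it is not a tool for distributing a merged index directory across a new collection.

```python
from urllib.parse import urlencode


def splitshard_url(base_url, collection, shard, async_id=None):
    """Return the Collections API URL that asks Solr to split one shard."""
    params = {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    if async_id is not None:
        # Optional request ID so the split runs asynchronously.
        params["async"] = async_id
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


# Hypothetical host, collection, and shard names.
url = splitshard_url("http://localhost:8983/solr", "mycollection", "shard1")
```

The function only constructs the URL; sending it (for example with a browser or an HTTP client) is left out so the sketch has no side effects.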

Re: Tika Error work around?

2019-03-15 Thread Erick Erickson
Assuming here that you’re using DIH or the extracting request handler. There are quite a number of reasons to run Tika outside Solr so you can handle exceptional cases as you see fit, see: https://lucidworks.com/2012/02/14/indexing-with-solrj/ Best, Erick > On Mar 14, 2019, at 7:39 PM, wclarke

Re: Re: Authorization fails but api still renders

2019-03-15 Thread Branham, Jeremy (Experis)
// Adding the dev DL, as this may be a bug Solr v7.7.0 I’m expecting the 401 on all the servers in all 3 clusters using the security configuration. For example, when I access the core or collection APIs without authentication, it should return a 401. On one of the servers, in one of the

Re: Help with a DIH config file

2019-03-15 Thread Tim Allison
Haha, looks like Jörn just answered this... onError="skip|continue" >greatly preferable if the indexing process could ignore exceptions Please, no. I'm 100% behind the sentiment that DIH should gracefully handle Tika exceptions, but the better option is to log the exceptions, store the

Re: Help with a DIH config file

2019-03-15 Thread Jörn Franke
In the Tika entity processor, use the option onError="skip". The alternatives are abort (the default) and continue (behave as if nothing had happened); skip skips the current document. > On 15.03.2019 at 12:44, Demian Katz wrote: > > Jörn (and anyone else with more experience with this than I
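
To make the onError option above concrete, here is a small sketch that generates a DIH entity element with the attribute set. It is an illustration under assumptions, not the poster's configuration: the entity name tika and the URL expression ${files.fileAbsolutePath} (which presumes an outer file-listing entity named files) are hypothetical.

```python
import xml.etree.ElementTree as ET

# The three onError values named in the thread.
VALID_ON_ERROR = {"abort", "skip", "continue"}


def tika_entity(name, url_expr, on_error="skip"):
    """Build an <entity> element for TikaEntityProcessor with onError set."""
    if on_error not in VALID_ON_ERROR:
        raise ValueError(f"onError must be one of {sorted(VALID_ON_ERROR)}")
    return ET.Element(
        "entity",
        {
            "name": name,
            "processor": "TikaEntityProcessor",
            "url": url_expr,
            "format": "text",
            "onError": on_error,
        },
    )


# Hypothetical entity name and URL expression.
entity = tika_entity("tika", "${files.fileAbsolutePath}")
xml_text = ET.tostring(entity, encoding="unicode")
```

The resulting xml_text can be pasted into the document section of a DIH config file in place of an existing Tika entity.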

RE: Help with a DIH config file

2019-03-15 Thread Demian Katz
Jörn (and anyone else with more experience with this than I have), I've been working with Whitney on this issue. It is a PDF file, and it can be opened successfully in a PDF reader. Interestingly, if I try to extract data from it on the command line, Tika version 1.3 throws a lot of warnings

How to split a merged index which is more than 2GB in size into same size multiple shard

2019-03-15 Thread arghya.it87
I am experimenting with SolrCloud version 5.2.1. I have 3 shards, and I used IndexMergeTool to merge my index into a single directory. Now I have created a new collection with 4 shards, and I want to split my index across these 4 shards. Is there an IndexSplit tool available, or how else can I do it?

Re: Help with a DIH config file

2019-03-15 Thread Jörn Franke
Do you have an exception? It could be that the PDF is broken - can you open it on your computer with a PDF reader? If the exception is related to Tika and PDF then file an issue with the PDFBox project. If there is an issue with Tika and MS Office documents then Apache POI is the right project

Re: Help with a DIH config file

2019-03-15 Thread wclarke
Thank you so much. You helped a great deal. I am running into one last issue where the Tika DIH stops at a specific language (Malayalam) and fails there. Do you know of a workaround?

Tika Error work around?

2019-03-15 Thread wclarke
I am getting an error that stops Tika from fetching, processing, and committing when it reaches a specific language (Malayalam). Is there a workaround?