Hello,
I have spent a few hours trying to understand why I get this error.
RunUpdateProcessor has received an AddUpdateCommand containing a document that
appears to still contain Atomic document update operations, most likely because
DistributedUpdateProcessorFactory was explicitly disabled
One last question.
I finally have everything running as it should. However, when I move from
testing to processing the entire directory, it just cycles through. The directory
is full of folders that have the documents in them. Do I need an HTML or
other file sitting in there randomly to get it to
Thanks! That fixed it.
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
There’s the collections API command SPLITSHARD.
But is this for functional reasons or are you just experimenting for background
information? 2G is a tiny index by recent standards; I routinely see 200G
indexes on a replica.
And merging indexes in SolrCloud is a bit tricky; you have to be sure
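
For reference, a minimal sketch of how a SPLITSHARD request to the Collections API could be assembled. The host, port, collection name, and shard name below are placeholders rather than values from this thread, and the helper name `splitshard_url` is made up for illustration. Note that SPLITSHARD splits one existing shard into two; it does not redistribute a merged index across a new collection.

```python
from urllib.parse import urlencode


def splitshard_url(base_url, collection, shard):
    """Build a Collections API SPLITSHARD request URL.

    base_url is the Solr root, e.g. "http://localhost:8983/solr" (placeholder).
    """
    params = {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


# Placeholder values; substitute your own host, collection, and shard.
print(splitshard_url("http://localhost:8983/solr", "mycollection", "shard1"))
```

The resulting URL can be issued with any HTTP client once the collection is running in SolrCloud mode.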
Assuming here that you’re using DIH or the extracting request handler.
There are quite a number of reasons to run Tika outside Solr so you
can handle exceptional cases as you see fit, see:
https://lucidworks.com/2012/02/14/indexing-with-solrj/
Best,
Erick
> On Mar 14, 2019, at 7:39 PM, wclarke
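
A rough sketch of the "run Tika outside Solr" idea, written in Python rather than the SolrJ approach the linked article describes. It walks a directory tree (including nested folders), extracts text per file, and records failures instead of stopping. The `extract` and `send` callables and the `id`/`text` field names are assumptions for illustration: `extract` would wrap whatever Tika client you use, and `send` would post the document to Solr.

```python
import os


def index_directory(root, extract, send):
    """Walk root recursively, extract text from each file, and pass it to send.

    extract(path) -> str and send(doc) are supplied by the caller (hypothetical
    hooks). Returns (indexed_paths, failures), where failures is a list of
    (path, error message) pairs for files that could not be extracted.
    """
    indexed, failures = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                text = extract(path)
            except Exception as exc:
                # One unreadable file is recorded and skipped, not fatal.
                failures.append((path, str(exc)))
                continue
            send({"id": path, "text": text})
            indexed.append(path)
    return indexed, failures
```

Keeping the failures list makes it possible to review or retry the problem files afterwards, which is the kind of control that is hard to get when extraction runs inside Solr.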
// Adding the dev DL, as this may be a bug
Solr v7.7.0
I’m expecting a 401 from all the servers in all 3 clusters using the security
configuration.
For example, when I access the core or collection APIs without authentication,
it should return a 401.
On one of the servers, in one of the
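
One way to narrow down which servers behave differently is to probe each one without credentials and list any that do not answer 401. This is only a sketch: the host list is whatever you pass in, the two admin paths are the standard collection LIST and core STATUS calls, and the function names are made up for illustration.

```python
import urllib.error
import urllib.request

# Admin endpoints expected to require authentication.
ADMIN_PATHS = (
    "/solr/admin/collections?action=LIST",
    "/solr/admin/cores?action=STATUS",
)


def http_status(url, timeout=5):
    """Return the HTTP status code of an unauthenticated GET request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Error statuses such as 401 or 403 are raised as HTTPError.
        return err.code


def unprotected_endpoints(base_urls, get_status=http_status):
    """List (url, status) pairs that did not return 401 without credentials."""
    findings = []
    for base in base_urls:
        for path in ADMIN_PATHS:
            url = base.rstrip("/") + path
            status = get_status(url)
            if status != 401:
                findings.append((url, status))
    return findings
```

Calling `unprotected_endpoints(["http://host1:8983", "http://host2:8983"])` with your own host names returns an empty list when every server rejects anonymous access as expected.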
Haha, looks like Jörn just answered this... onError="skip|continue"
>greatly preferable if the indexing process could ignore exceptions
Please, no. I'm 100% behind the sentiment that DIH should gracefully
handle Tika exceptions, but the better option is to log the
exceptions, store the
In the Tika entity processor, use the option onError="skip".
The alternatives are abort (the default) and continue (behave as if nothing had
happened).
The skip option skips the current document.
> On 15.03.2019 at 12:44, Demian Katz wrote:
>
> Jörn (and anyone else with more experience with this than I
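
To make the onError option concrete, here is a small sketch that generates the DIH entity element with Python's standard XML library. The entity name and the `${files.fileAbsolutePath}` URL are placeholders that assume an outer file-listing entity named `files`; in practice you would simply write the attribute into your data-config.xml by hand.

```python
import xml.etree.ElementTree as ET

# The three onError values mentioned in the thread.
VALID_ON_ERROR = ("abort", "skip", "continue")


def tika_entity(name, url, on_error="skip"):
    """Build a DIH <entity> element that uses TikaEntityProcessor."""
    if on_error not in VALID_ON_ERROR:
        raise ValueError(f"onError must be one of {VALID_ON_ERROR}")
    return ET.Element(
        "entity",
        {
            "name": name,
            "processor": "TikaEntityProcessor",
            "url": url,
            "format": "text",
            "onError": on_error,
        },
    )


# Placeholder name and URL; adjust to match your own data-config.xml.
element = tika_entity("tika", "${files.fileAbsolutePath}")
print(ET.tostring(element, encoding="unicode"))
```

With "skip", a document that Tika cannot parse is dropped and the import moves on; with the default "abort", the whole import stops at the first failure.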
Jörn (and anyone else with more experience with this than I have),
I've been working with Whitney on this issue. It is a PDF file, and it can be
opened successfully in a PDF reader. Interestingly, if I try to extract data
from it on the command line, Tika version 1.3 throws a lot of warnings
I am experimenting with SolrCloud version 5.2.1. I have 3 shards, and I used
IndexMergeTool to merge my index into a single directory. Now I have created a
new collection with 4 shards, and I want to split my index across these 4
shards.
Is there an IndexSplit tool available, or some other way to do this?
--
Do you have an exception?
It could be that the PDF is broken. Can you open it on your computer with a
PDF reader?
If the exception is related to Tika and PDF, then file an issue with the PDFBox
project. If there is an issue with Tika and MS Office documents, then Apache POI
is the right project
Thank you so much. You helped a great deal. I am running into one last
issue where the Tika DIH is stopping at a specific language and fails there
(Malayalam). Do you know of a workaround?
I am getting an error that stops Tika from fetching, processing, and committing
when it reaches a specific language (Malayalam). Is there a workaround?