Re: Derived Field Solr Schema

2019-06-21 Thread Muaawia Bin Arshad
Thank you so much! This is very helpful. On 6/21/19, 12:35 PM, "Alexandre Rafalovitch" wrote: The easiest way is to do that with Update Request Processors: https://lucene.apache.org/solr/guide/7_7/update-request-processors.html Usually, you would clone a field and then do your

Re: Derived Field Solr Schema

2019-06-21 Thread Alexandre Rafalovitch
The easiest way is to do that with Update Request Processors: https://lucene.apache.org/solr/guide/7_7/update-request-processors.html Usually, you would clone a field and then do your transformations. For your specific example, you could use: *) FieldLengthUpdateProcessorFactory - int rather than
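A minimal sketch of such a chain in solrconfig.xml, assuming the fieldX name from the original question; the chain name and the cloned field are made up for illustration, and (as noted above) FieldLengthUpdateProcessorFactory yields an int length rather than a true Boolean:

    <updateRequestProcessorChain name="derive-from-fieldX">
      <!-- Copy fieldX so the original value stays untouched -->
      <processor class="solr.CloneFieldUpdateProcessorFactory">
        <str name="source">fieldX</str>
        <str name="dest">fieldX_length</str>
      </processor>
      <!-- Replace the cloned value with its character length -->
      <processor class="solr.FieldLengthUpdateProcessorFactory">
        <arr name="fieldName">
          <str>fieldX_length</str>
        </arr>
      </processor>
      <processor class="solr.LogUpdateProcessorFactory"/>
      <processor class="solr.RunUpdateProcessorFactory"/>
    </updateRequestProcessorChain>

The chain is then selected per request with update.chain=derive-from-fieldX, or set as the default chain on the update handler.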

Derived Field Solr Schema

2019-06-21 Thread Muaawia Bin Arshad
Hi Everyone, I am fairly new to Solr and I was wondering if there is a way in Solr 7.7 to populate fields based on some pre-processing of another field. So let’s say I have a field called fieldX defined in the schema; I want to define another field called isFieldXgood which is just a Boolean

Re: Is Solr can do that ?

2019-06-21 Thread Erick Erickson
What Sam said. Here’s something to get you started on how and why it’s better to be using Tika rather than shipping the docs to Solr and having ExtractingRequestHandler do it on Solr: https://lucidworks.com/2012/02/14/indexing-with-solrj/ Best, Erick > On Jun 21, 2019, at 9:56 AM, Samuel
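For reference, a minimal sketch of what the article describes, i.e. parsing locally with Tika and sending only plain text to Solr through SolrJ; the Solr URL, file path and field names below are placeholders, not from the thread:

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.InputStream;

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.parser.AutoDetectParser;
    import org.apache.tika.parser.ParseContext;
    import org.apache.tika.sax.BodyContentHandler;

    public class TikaIndexer {
        public static void main(String[] args) throws Exception {
            SolrClient solr = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/mycollection").build(); // placeholder URL
            AutoDetectParser parser = new AutoDetectParser();

            File file = new File("/path/to/document.pdf"); // placeholder path
            BodyContentHandler text = new BodyContentHandler(-1); // -1 = no write limit
            Metadata metadata = new Metadata();
            try (InputStream in = new FileInputStream(file)) {
                // Tika detects the format and extracts text + metadata on the client side
                parser.parse(in, text, metadata, new ParseContext());
            }

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", file.getAbsolutePath());
            doc.addField("title_txt", metadata.get("title"));
            doc.addField("content_txt", text.toString());
            solr.add(doc);   // Solr receives plain text only, not the raw binary
            solr.commit();
            solr.close();
        }
    }

Doing the extraction on the client side keeps Tika crashes and memory spikes out of the Solr JVM, which is the main point of the article.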

Re: Is Solr can do that ?

2019-06-21 Thread Shawn Heisey
On 6/21/2019 10:32 AM, Matheo Software Info wrote: My question is very simple: I would like to know if Solr can process around 30To of data (PDF, Text, Word, etc.)? What is the best way to index this huge data set? Several servers? Several shards? Other? Sure, Solr can do that. Whether

Re: Is Solr can do that ?

2019-06-21 Thread Samuel Kasimalla
Hi Bruno, Assuming you meant 30TB, the first step is to use the Tika parser and convert the rich documents into plain text. We need the number of documents; the unofficial word on the street is about 50 million documents per shard. Of course, a lot of parameters are involved in this; it's a simple
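Purely as an illustration of that rule of thumb (the real answer depends on the actual document count and on sizing tests): if the 30TB corpus turned out to be roughly 200 million extracted documents, 200M at about 50M per shard would suggest around 4 shards, before adding replicas.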

Is Solr can do that ?

2019-06-21 Thread Matheo Software Info
Dear Solr User, My question is very simple: I would like to know if Solr can process around 30To of data (PDF, Text, Word, etc.)? What is the best way to index this huge data set? Several servers? Several shards? Other? Many thanks for your information, Cordialement, Best Regards

Re: Solr 6.5.1 SpellCheckCollator StringIndexOutOfBoundsException

2019-06-21 Thread Erick Erickson
Possibly https://issues.apache.org/jira/browse/SOLR-13360 > On Jun 21, 2019, at 4:44 AM, Gonzalo Carracedo > wrote: > > StringIndexOutOfBoundsException

Error 500 with update extract handler on Solr 7.4.0

2019-06-21 Thread julien massiera
Hi all, I recently experienced some problems with the update extract handler on a Solr 7.4.0 instance. When sending a document via a multipart POST update request, if a doc parameter name contains too many characters, the POST fails with a 500 error code and I can see the following exception
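For anyone trying to reproduce this, a minimal SolrJ sketch of such a multipart POST to /update/extract (collection name, file path and literal fields are placeholders; the failure described here shows up when one of the parameter/part names is very long):

    import java.io.File;

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
    import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

    public class ExtractExample {
        public static void main(String[] args) throws Exception {
            SolrClient solr = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/mycollection").build(); // placeholder URL

            // Multipart POST to the extracting request handler
            ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
            req.addFile(new File("/path/to/document.pdf"), "application/pdf"); // placeholder file
            req.setParam("literal.id", "doc-1");      // placeholder literal field
            req.setParam("fmap.content", "content");  // map extracted text to a schema field
            req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);

            solr.request(req);
            solr.close();
        }
    }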

Solr 6.5.1 SpellCheckCollator StringIndexOutOfBoundsException

2019-06-21 Thread Gonzalo Carracedo
Hello, I'm using version 6.5.1 and I'm getting the following error when running this query: q=coordinates qt=dismax spellcheck=true spellcheck.collate=true String index out of range: -6 java.lang.StringIndexOutOfBoundsException: String index out of range: -6 at

NullPointerException leads to msg=null.

2019-06-21 Thread Nic Rodgers
We've recently upgraded from Solr 6.6 to 7.7.2 and are getting an NPE when attempting to index content in our development environment. The error in the log is: 2019-06-20 13:18:29.609 ERROR (qtp363988129-21) [ x:govwalesd8_fb] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: ERROR:

Problems with long named parameters with update extract handler

2019-06-21 Thread Julien
Hi all, I recently experienced some problems with the update extract handler on a Solr 7.4.0 instance. When sending a document via a multipart POST update request, if a doc parameter name contains too many characters, the POST method fails and I can see the following exception in the Solr logs: