Re: Solr 5.0 - uniqueKey case insensitive ?

Bruno Mannina Wed, 06 May 2015 08:09:31 -0700

Yes thanks it's now for me too.

Daniel, my pn is always in uppercase and I index them always in uppercase.

the problem (solved now after all your answers, thanks) was the request,if users

requests with lowercase then solr reply no result and it was not good.

but now the problem is solved, I changed in my source file the name pnfield to id

and in my schema I use a copy field named pn and it works perfectly.

Thanks a lot !!!

Le 06/05/2015 09:44, Daniel Collins a écrit :

Ah, I remember seeing this when we first started using Solr (which was 4.0
because we needed Solr Cloud), I never got around to filing an issue for it
(oops!), but we have a note in our schema to leave the key field a normal
string (like Bruno we had tried to lowercase it which failed).
We didn't really know Solr in those days, and hadn't really thought about
it since then, but Hoss' and Erick's explanations make perfect sense now!

Since shard routing is (basically) done on hashes of the unique key, if I
have 2 documents which are the "same", but have values "HELLO" and "hello",
they might well hash to completely different shards, so the update
logistics would be horrible.

Bruno, why do you need to lowercase at all then?  You said in your example,
that your client application always supplies "pn" and it is always
uppercase, so presumably all adds/updates could be done directly on that
field (as a normal string with no lowercasing).  Where does the case
insensitivity come in, is that only for searching?  If so couldn't you add
a search field (called id), and update your app to search using that (or
make that your default search field, I guess it depends if your calling app
explicitly uses the pn field name in its searches).


On 6 May 2015 at 01:55, Erick Erickson <erickerick...@gmail.com> wrote:

Well, "working fine" may be a bit of an overstatement. That has never
been officially supported, so it "just happened" to work in 3.6.

As Chris points out, if you're using SolrCloud then this will _not_
work as routing happens early in the process, i.e. before the analysis
chain gets the token so various copies of the doc will exist on
different shards.

Best,
Erick

On Mon, May 4, 2015 at 4:19 PM, Bruno Mannina <bmann...@free.fr> wrote:

Hello Chris,

yes I confirm on my SOLR3.6 it works fine since several years, and each

doc

added with same code is updated not added.

To be more clear, I receive docs with a field name "pn" and it's the
uniqueKey, and it always in uppercase

so I must define in my schema.xml

     <field name="id" type="string" multiValued="false" indexed="true"
required="true" stored="true"/>
     <field name="pn" type="text_general" multiValued="true"

indexed="true"

stored="false"/>
...
    <uniqueKey>id</uniqueKey>
...
   <copyField source="id" dest="pn"/>

but the application that use solr already exists so it requests with pn
field not id, i cannot change that.
and in each docs I receive, there is not id field, just pn field, and  i
cannot also change that.

so there is a problem no ? I must import a id field and request a pn

field,

but I have a pn field only for import...



Le 05/05/2015 01:00, Chris Hostetter a écrit :

: On SOLR3.6, I defined a string_ci field like this:
:
: <fieldType name="string_ci" class="solr.TextField"
: sortMissingLast="true" omitNorms="true">
:     <analyzer>
:       <tokenizer class="solr.KeywordTokenizerFactory"/>
:       <filter class="solr.LowerCaseFilterFactory"/>
:     </analyzer>
:     </fieldType>
:
: <field name="pn" type="string_ci" multiValued="false" indexed="true"
: required="true" stored="true"/>


I'm really suprised that field would have worked for you (reliably) as a
uniqueKey field even in Solr 3.6.

the best practice for something like what you describe has always (going
back to Solr 1.x) been to use a copyField to create a case insensitive
copy of your uniqueKey for searching.

if, for some reason, you really want case insensitve *updates* (so a doc
with id "foo" overwrites a doc with id "FOO" then the only reliable way

to

make something like that work is to do the lowercassing in an
UpdateProcessor to ensure it happens *before* the docs are distributed

to

the correct shard, and so the correct existing doc is overwritten (even

if

you aren't using solr cloud)



-Hoss
http://www.lucidworks.com/


---
Ce courrier électronique ne contient aucun virus ou logiciel malveillant
parce que la protection avast! Antivirus est active.
http://www.avast.com



---
Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce 
que la protection avast! Antivirus est active.
http://www.avast.com

Re: Solr 5.0 - uniqueKey case insensitive ?

Reply via email to