1. Generally, any schema change requires a full reindex. Sure, a lot of times you can squeak by, but with Solr and Lucene there are no guarantees. If it works for you, great. If not, don't complain - just reindex. And even if it does work in the current release, there is no guarantee that a similar change won't require a reindex in a future release.

2. Make up your mind whether a field is a number or a string, and stick with that import format.

General rule: clean up your data before you send it to Solr. But... you can do some amount of cleanup using update processors, including white space trimming and limited regex editing. You can also develop custom update processors, or write them in a scripting language such as JavaScript. For example, you could parse a string of numbers and then send them to other fields.
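As a rough illustration of the "clean up before you send" route, here is a minimal Python sketch that splits a space-separated string of ids into a list of integers on the client side, so each id could be sent as a separate value of a multi-valued numeric field. The function names and the field names "id" and "user_ids" are made up for the example, not taken from your schema, and the sketch does not talk to Solr at all.

```python
def split_user_ids(raw):
    """Turn a whitespace-separated string of ids into a list of ints.

    Raises ValueError on any token that is not a plain run of ASCII
    digits, so bad data is caught before it reaches the index.
    """
    ids = []
    for token in raw.split():  # split() with no args handles spaces, tabs, repeats
        if not (token.isascii() and token.isdigit()):
            raise ValueError(f"not a numeric user id: {token!r}")
        ids.append(int(token))
    return ids


def build_doc(record_id, raw_user_ids):
    """Build a document dict with one list entry per user id.

    "id" and "user_ids" are hypothetical field names; "user_ids" is
    assumed to be a multi-valued integer field in the schema.
    """
    return {"id": record_id, "user_ids": split_user_ids(raw_user_ids)}
```

With the ids kept as separate numeric values, no tokenizer is involved on that field, which is one way to avoid the TextField question entirely.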

3. Too hard to say from the way you have described it. Show us some sample input.

In general, TextField is for text, not numbers. If you intend to query data as numbers, don't use TextField.

-- Jack Krupansky

-----Original Message-----
From: TwoFirst TwoLast
Sent: Thursday, June 06, 2013 1:25 AM
To: solr-user@lucene.apache.org
Subject: Schema Change: Int -> String

1) If I change one field's type in my schema, will that cause problems with
the index or searching?  My data is pulled in chunks off of a mysql server
so one field in the currently indexed data is simply an "int" type field in
solr.  I would like to change this to a string moving forward, but still
expect to search across the int/string field.  Will this be ok?

2) My motivation for #1 is that I have thousands of records that are
exactly the same in mysql aside from a user_id column.  Prior to inserting
into mysql I am thinking that I can concatenate the user_ids together into
a space separated string and let solr just parse the string.  So the
database and my data import handler would change a bit.

3) If #2 is an appropriate approach, will a solr.TextField with
a solr.WhitespaceTokenizerFactory be an ok way to approach this?  This does
produce words where I would expect integers. I tried using a
solr.TrieIntField with the solr.WhitespaceTokenizerFactory, but it throws
an error.

Finally I need to make sure that exact matches will be performed on
user_ids in the string when searching.

Much appreciated!
