On Mar 12, 2012, at 5:32 PM, Yonik Seeley wrote:
> On Mon, Mar 12, 2012 at 8:15 AM, Mark Miller <[email protected]> wrote:
>> Currently it doesn't send directly to the leader, but this is planned - it's
>> a little tricky due to lack of access to the Schema for hashing
>
> Hmmm, why is this? Identification of the "uniqueKey" field? Maybe we
> just assume "id", or let the user configure it if it's something
> different. That should really be "best practice" along with sticking
> to normal java identifiers for field names.
Yeah, for id my plan was to just let the user supply the field, perhaps
defaulting to id. The other issue is that we hash on the indexed value, though,
which we get through a customizable field type method impl, last I recall. I
think this tends to be the same as the raw text for the types we care about.
But we have to make some assumptions - it's not really arbitrary support -
though it should easily cover the current common types of numeric or string. I
think most impls end up using UnicodeUtil.UTF16toUTF8, and hopefully most
toInternal methods simply return what is passed in (i.e. use the base class
impl)...

  /** Given an indexed term, append the human readable representation */
  public CharsRef indexedToReadable(BytesRef input, CharsRef output) {
    UnicodeUtil.UTF8toUTF16(input, output);
    return output;
  }

  public String toInternal(String val) {
    // - used in delete when a Term needs to be created.
    // - used by the default getTokenizer() and createField()
    return val;
  }

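To make those assumptions concrete, here is a rough sketch of what the client
side could do without the Schema: treat the raw id string as the internal value
(toInternal as identity), take its UTF-8 bytes as the indexed form, and hash
those. The class and method names are made up for illustration, CRC32 is only a
stand-in for whatever hash the server actually uses, and the modulo shard
mapping is an assumption too. The client would have to match the server's hash
and shard ranges exactly for this to route correctly.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical client-side helper; not part of Solr or SolrJ. */
public class ClientSideIdHash {

    /** Assumes toInternal() is the identity and the indexed form is plain UTF-8. */
    static byte[] indexedBytes(String id) {
        return id.getBytes(StandardCharsets.UTF_8);
    }

    /** Stand-in hash (CRC32); a real client must use the same hash as the server. */
    static long hash(String id) {
        CRC32 crc = new CRC32();
        crc.update(indexedBytes(id));
        return crc.getValue(); // always in [0, 2^32)
    }

    /** Assumed mapping: simple modulo over a fixed number of shards. */
    static int shardFor(String id, int numShards) {
        return (int) (hash(id) % numShards);
    }

    public static void main(String[] args) {
        String id = "doc-42"; // value of the user-configured unique key field
        System.out.println(id + " -> shard " + shardFor(id, 4));
    }
}
```

This only holds for field types where the indexed bytes really are the UTF-8 of
the raw text; any type that overrides toInternal or encodes terms differently
would hash to a different value on the server.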
>
> -Yonik
> lucenerevolution.com - Lucene/Solr Open Source Search Conference.
> Boston May 7-10
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
- Mark Miller
lucidimagination.com