I wonder what would happen if the DistributedUpdateProcessorFactory were
manually added to the chain and the LangDetect definition moved
AFTER it, as per
https://wiki.apache.org/solr/UpdateRequestProcessor#Distributed_Updates

This would mean that the detection code would be executed on each
node, but with the record expanded to include those other fields
(assuming they were stored). This may do the trick, though a custom
URP would probably be a better solution anyway.
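
For illustration, here is an untested sketch of that reordering, reusing
the chain from Shani's message quoted below. As far as I know, Solr adds
the distributed processor implicitly just before RunUpdateProcessorFactory
when it is not listed, so listing it explicitly is what moves it ahead of
language detection. Treat that as an assumption to verify on 4.10.

```xml
<!-- Untested sketch: same chain as in the original question, reordered. -->
<updateRequestProcessorChain name="aa_chain">
  <!-- Assumption: declared explicitly so that everything below it runs
       after the atomic update has been expanded into a full document. -->
  <processor class="solr.DistributedUpdateProcessorFactory" />
  <processor
      class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <str name="langid.fl">title,content,text</str>
    <str name="langid.langField">language_t</str>
    <str name="langid.langsField">language_all_t</str>
    <str name="langid.fallback">generic</str>
    <str name="langid.overwrite">false</str>
    <str name="langid.threshold">0.8</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

Whether the detector then sees the stored title/content values depends on
those fields being stored, as noted above.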
----
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 3 November 2015 at 05:13, Upayavira <u...@odoko.co.uk> wrote:
> Looking at the code, this is not going to work without modifications to
> Solr (or at least a custom component).
>
> The atomic update code is closely embedded into the Solr
> DistributedUpdateProcessor, which expands the atomic update into a full
> document and then posts it to the shards.
>
> You need to do the update expansion before your lang detect processor,
> but there is no gap between them.
>
> From my reading of the code, you could create an AtomicUpdateProcessor
> that simply expands updates, and insert that before the
> LangDetectUpdateProcessor.
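>
> A rough sketch of how that might be wired up, assuming such a processor
> existed. The class name com.example.AtomicUpdateExpandingProcessorFactory
> is hypothetical (nothing like it ships with Solr); the rest is the chain
> from the original question.
>
> ```xml
> <updateRequestProcessorChain name="aa_chain">
>   <!-- Hypothetical custom factory: would expand the atomic update into
>        a full document before language detection sees it. -->
>   <processor class="com.example.AtomicUpdateExpandingProcessorFactory" />
>   <processor
>       class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
>     <str name="langid.fl">title,content,text</str>
>     <str name="langid.langField">language_t</str>
>     <str name="langid.langsField">language_all_t</str>
>     <str name="langid.fallback">generic</str>
>     <str name="langid.overwrite">false</str>
>     <str name="langid.threshold">0.8</str>
>   </processor>
>   <processor class="solr.LogUpdateProcessorFactory" />
>   <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>
> ```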
>
> Upayavira
>
> On Tue, Nov 3, 2015, at 06:38 AM, Chaushu, Shani wrote:
>> Hi
>> When I make an atomic update (set field), whether on the content field
>> or on another field, the language field becomes generic. In other words,
>> detection doesn’t work on a set-field update, only on the first insert.
>> Even if the language was detected the first time, it becomes generic
>> after the update.
>> Any idea?
>>
>> The chain is
>>
>> <updateRequestProcessorChain name="aa_chain">
>>   <processor
>>       class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
>>     <str name="langid.fl">title,content,text</str>
>>     <str name="langid.langField">language_t</str>
>>     <str name="langid.langsField">language_all_t</str>
>>     <str name="langid.fallback">generic</str>
>>     <str name="langid.overwrite">false</str>
>>     <str name="langid.threshold">0.8</str>
>>   </processor>
>>   <processor class="solr.LogUpdateProcessorFactory" />
>>   <processor class="solr.RunUpdateProcessorFactory" />
>> </updateRequestProcessorChain>
>>
>>
>> Thanks,
>> Shani
>>
>>
>>
>>
>> -----Original Message-----
>> From: Jack Krupansky [mailto:jack.krupan...@gmail.com]
>> Sent: Thursday, October 29, 2015 17:04
>> To: solr-user@lucene.apache.org
>> Subject: Re: language plugin
>>
>> Are you trying to do an atomic update without the content field? If so,
>> it sounds like Solr needs an enhancement (bug fix?) so that language
>> detection would be skipped if the input field is not present. Or maybe
>> that could be an option.
>>
>>
>> -- Jack Krupansky
>>
>> On Thu, Oct 29, 2015 at 3:25 AM, Chaushu, Shani <shani.chau...@intel.com>
>> wrote:
>>
>> > Hi,
>> > I'm using the Solr language detection plugin on a field named "content"
>> > (Solr 4.10, plugin LangDetectLanguageIdentifierUpdateProcessorFactory).
>> > When I index a document for the first time it works fine, but if I
>> > set one field again (regardless of whether it's the content field or
>> > not) the language goes to its default. If I'm setting another field, I
>> > would like the language to stay the way it was before, and I don't want
>> > to insert all the content again. Is there an option to set the plugin
>> > so that it won't calculate the language again? (Setting
>> > langid.overwrite to false didn't work.)
>> >
>> > Thanks,
>> > Shani
>> >
>> >
>> > ---------------------------------------------------------------------
>> > Intel Electronics Ltd.
>> >
>> > This e-mail and any attachments may contain confidential material for
>> > the sole use of the intended recipient(s). Any review or distribution
>> > by others is strictly prohibited. If you are not the intended
>> > recipient, please contact the sender and delete all copies.
>> >
