What's your actual business use case?

On 30 Mar 2017 1:53 AM, "Derek Poh" <d...@globalsources.com> wrote:
> Hi Erick
>
> So I could also not use the query analyzer stage to append the code to the
> search keyword?
> Have the front-end application append the code for every query it issues
> instead?
>
>
> On 3/30/2017 12:20 PM, Erick Erickson wrote:
>
>> I generally prefer index-time work to query-time work on the theory
>> that the index-time work is done once and the query time work is done
>> for each query.
>>
>> That said, for a corpus this size (and presumably without a large
>> query rate) I doubt you'd be able to measure any difference.
>>
>> So basically choose the easiest to implement IMO.
>>
>> Best,
>> Erick
>>
>> On Wed, Mar 29, 2017 at 8:43 PM, Alexandre Rafalovitch
>> <arafa...@gmail.com> wrote:
>>
>>> I am not sure I can tell how to decide on one or another. However, I
>>> wanted to mention that you also have an option of doing it in the
>>> UpdateRequestProcessor chain. That's still within Solr (and therefore
>>> is consistent with multiple clients feeding into Solr) but is before
>>> individual field processing (so will survive - for example - a
>>> copyField).
>>>
>>> Regards,
>>> Alex.
>>> ----
>>> http://www.solr-start.com/ - Resources for Solr users, new and
>>> experienced
>>>
>>>
>>> On 29 March 2017 at 23:38, Derek Poh <d...@globalsources.com> wrote:
>>>
>>>> Hi
>>>>
>>>> I need to create a field that will be prefixed and suffixed with the
>>>> code 'z01x'. This field needs to have the code in the index and
>>>> during query.
>>>> I can either
>>>> 1.
>>>> have the source data of the field formatted with the code before
>>>> indexing (outside Solr).
>>>> use a charFilter in the query stage of the field type to add the
>>>> code during query.
>>>>
>>>> <charFilter class="solr.PatternReplaceCharFilterFactory"
>>>> pattern="^(.*)$"
>>>> replacement="z01x $1 z01x" />
>>>>
>>>> OR
>>>>
>>>> 2.
>>>> use the charFilter before the tokenizer class during the index and
>>>> query analyzer stages of the field type.
>>>>
>>>> The collection has between 100k - 200k documents currently but it may
>>>> increase in the future.
>>>> The indexing time with option 2 and the current indexing time are
>>>> almost the same, between 2-3 minutes.
>>>>
>>>> Which option would you advise?
>>>>
>>>> Derek
>>>>
>>>> ----------------------
>>>> CONFIDENTIALITY NOTICE
>>>> This e-mail (including any attachments) may contain confidential and/or
>>>> privileged information. If you are not the intended recipient or have
>>>> received this e-mail in error, please inform the sender immediately and
>>>> delete this e-mail (including any attachments) from your computer, and
>>>> you must not use, disclose to anyone else or copy this e-mail (including
>>>> any attachments), whether in whole or in part.
>>>> This e-mail and any reply to it may be monitored for security, legal,
>>>> regulatory compliance and/or other appropriate reasons.
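
For anyone following the thread, option 2 from Derek's original message might look roughly like the field type below. This is only a sketch under assumptions: the field type name "text_coded", the choice of StandardTokenizerFactory and the lowercase filter are hypothetical and not taken from Derek's schema. Only the charFilter line comes from his message.

```xml
<!-- Hypothetical field type illustrating option 2; the name and the
     tokenizer/filter choices are assumptions, not Derek's schema. -->
<fieldType name="text_coded" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- charFilters run before the tokenizer, so the raw field value
         is wrapped with the code before it is split into tokens -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="^(.*)$"
                replacement="z01x $1 z01x"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- same charFilter at query time, so the query text carries the code too -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="^(.*)$"
                replacement="z01x $1 z01x"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Option 1 would presumably be the same definition with the charFilter removed from the index analyzer, since the source data already carries the code. Either way, it is probably worth running a sample value and a multi-word query through the Analysis screen in the Solr admin UI to confirm that both sides produce the tokens you expect.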