I’ve become a big fan of adding a date field with an index timestamp
whenever you reindex.
That lets you check whether everything has been reindexed.

   <field name="indexed_datetime" type="date" stored="true" indexed="true"
           multiValued="false" default="NOW" docValues="true" />
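
A rough sketch of how that check might look, assuming the indexed_datetime
field above and a known reindex start time. The function names and the
collection URL are made up for illustration. The query counts documents
whose timestamp is not at or after the start time, which also catches
documents that have no timestamp at all. If numFound comes back as 0,
everything has been reindexed.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def stale_docs_params(reindex_start: datetime) -> dict:
    """Build Solr query params matching docs NOT indexed since reindex_start.

    reindex_start must be timezone-aware; Solr date fields are in UTC.
    """
    stamp = reindex_start.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        # All docs, minus those stamped at or after the reindex start.
        "q": f"*:* -indexed_datetime:[{stamp} TO *]",
        # rows=0: only numFound in the response is of interest.
        "rows": "0",
    }


def stale_docs_url(collection_url: str, reindex_start: datetime) -> str:
    """Full /select URL; collection_url is a hypothetical base URL."""
    return f"{collection_url}/select?{urlencode(stale_docs_params(reindex_start))}"
```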

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jul 28, 2020, at 2:11 PM, Jörn Franke <jornfra...@gmail.com> wrote:
> 
> A regex search at query time would leave room for attacks (e.g., a regex can 
> easily be designed to block the Solr server forever).
> 
> If the field is stored, you can also use a cursor to go through all 
> entries and reindex each doc based on the field:
> 
> https://lucene.apache.org/solr/guide/8_4/pagination-of-results.html
> 
> This only works if the other fields are stored as well. Otherwise, 
> reindex.
> You can do this in parallel to the existing index and, once finished, simply 
> change the alias for the collection (no downtime for the users, but you 
> of course need the corresponding disk space).
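
A minimal sketch of the cursor loop described above, under a few
assumptions: the uniqueKey field is called "id", and fetch is a
hypothetical callable that sends the given params to the collection's
/select handler and returns the parsed JSON response. The loop starts with
cursorMark=*, sorts on the uniqueKey, and stops when Solr returns the same
nextCursorMark that was sent.

```python
from typing import Callable, Dict, Iterator


def iter_all_docs(fetch: Callable[[Dict[str, str]], dict],
                  query: str = "*:*",
                  unique_key: str = "id",
                  rows: int = 500) -> Iterator[dict]:
    """Yield every doc matching query, paging with Solr's cursorMark."""
    cursor = "*"  # "*" asks Solr to start a new cursor
    while True:
        page = fetch({
            "q": query,
            "rows": str(rows),
            # Cursor paging requires a sort that includes the uniqueKey.
            "sort": f"{unique_key} asc",
            "cursorMark": cursor,
            "wt": "json",
        })
        yield from page["response"]["docs"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:
            return  # unchanged cursor: no more results
        cursor = next_cursor
```

Each yielded doc can then be checked and sent to the new collection.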
> 
>> On 28.07.2020 at 21:06, lstusr 5u93n4 <lstusr...@gmail.com> wrote:
>> 
>> Possible... yes. Agreed that this is the right approach. But if we already
>> have a big index that we're searching through? Any way to "hack it"?
>> 
>>> On Tue, 28 Jul 2020 at 14:55, Walter Underwood <wun...@wunderwood.org>
>>> wrote:
>>> 
>>> I’d do that at index time. Add an update request processor script that
>>> does the regex and adds a field has_credit_card_number:true.
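
To illustrate the kind of check such a processor could run, here is a
sketch in Python. It is not the Solr script itself: the pattern and the
"text" field name are assumptions, and only the has_credit_card_number
field comes from the suggestion above. The pattern looks for 16 digits in
four groups of four, optionally separated by a space or hyphen, and not
embedded in a longer run of digits.

```python
import re

# Four groups of four ASCII digits, optional space/hyphen between groups.
# The lookarounds reject matches inside longer digit runs.
CARD_PATTERN = re.compile(
    r"(?<![0-9])[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}(?![0-9])"
)


def has_card_like_number(text: str) -> bool:
    """True if text contains something shaped like a 16-digit card number."""
    return CARD_PATTERN.search(text) is not None


def tag_document(doc: dict, text_field: str = "text") -> dict:
    """Return a copy of doc with has_credit_card_number set before indexing."""
    tagged = dict(doc)
    tagged["has_credit_card_number"] = has_card_like_number(doc.get(text_field, ""))
    return tagged
```

Searching then becomes a plain query on has_credit_card_number:true.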
>>> 
>>> wunder
>>> Walter Underwood
>>> wun...@wunderwood.org
>>> http://observer.wunderwood.org/  (my blog)
>>> 
>>>>> On Jul 28, 2020, at 11:50 AM, lstusr 5u93n4 <lstusr...@gmail.com> wrote:
>>>> 
>>>> Let's say I have a text field that's been indexed with the standard
>>>> tokenizer, and I want to match the docs that have credit card numbers in
>>>> them (this is for altruistic purposes, not nefarious ones!). What's the
>>>> best way to build a search that will do this?
>>>> 
>>>> Searching for "???? ???? ???? ????" seems to return inconsistent results.
>>>> 
>>>> Maybe a regex search? "[0-9]{4}?[0-9]{4}?[0-9]{4}?[0-9]{4}" seems like it
>>>> should work, but that's not matching the docs I think it should either...
>>>> 
>>>> Any suggestions?
>>>> 
>>>> Thanks In Advance!
>>> 
>>> 
