e
multiterm regexp search query?
Regards,
Valentin Popov
>
> -- Original message --
> From: Valentin Popov
> Date: 12/15/2014 3:46 AM
> To: java-user@lucene.apache.org;
> Subject: multiterm numbers regexp search
>
> I have a need to find mastercard numbers with regular
't want to use StandardAnalyzer: maybe try
> WhitespaceAnalyzer, but you'll need to enhance your regex a little to deal
> with punctuation since WA may give you tokens like:
>
> 5106-7922-9469-8422.
>
> "5106-7922-9469-8422"
>
> etc
>
> -Mike
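
A minimal sketch of the "enhance your regex to deal with punctuation" idea above. It uses plain java.util.regex rather than Lucene's RegExp syntax, so it only illustrates the token shapes and is not a drop-in RegexpQuery pattern. The class and method names are hypothetical, and the 16-digit shape (5, then 1-5, then 14 digits) is an assumption taken from the pattern quoted elsewhere in this thread.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: checks whether a whitespace-delimited
// token looks like a MasterCard number, tolerating hyphens between the groups
// and stray punctuation (quotes, trailing dot) around the token.
class CardTokenSketch {
    private static final Pattern MASTERCARD_TOKEN = Pattern.compile(
            "\\p{Punct}*"            // leading punctuation such as an opening quote
            + "5[1-5][0-9]{2}"       // first group: 5, then 1-5, then two digits
            + "(?:-?[0-9]{4}){3}"    // three more groups of four digits, hyphen optional
            + "\\p{Punct}*");        // trailing punctuation such as a dot or closing quote

    static boolean looksLikeMasterCard(String token) {
        // matches() requires the whole token to fit the pattern
        return MASTERCARD_TOKEN.matcher(token).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeMasterCard("5106-7922-9469-8422."));
        System.out.println(looksLikeMasterCard("\"5106-7922-9469-8422\""));
        System.out.println(looksLikeMasterCard("4106-7922-9469-8422"));
    }
}
```

An equivalent Lucene pattern would need to spell out the punctuation characters explicitly, since the \p{Punct} class used here belongs to java.util.regex.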
>
>
with the regex to capture the "phrase" without
> spaces/hyphens: "5{1}<1-5>{1}<0-9>{14}"
>
> I can't vouch for performance with the above options...
>
> Whichever path you take, make sure that the MultiTermQuery.RewriteMethod
> and/or maxB
es.
Any clue?
Thanks
PS. Operations.subsetOf(Automata.makeString("123-456-789"), full); => true;
Operations.subsetOf(Automata.makeString("111-456-789"), full); => false.
TestCase for Automata works fine.
Regards,
Valentin Popov
Hello everyone.
We have ~10 indexes holding 500M documents. Each document has an «archive date»
and a «to» address, and one of our tasks is to calculate statistics of «to» for
the last year. Right now we search archive_date:(current_date - 1 year) and
paginate the results at 50k records per page. Bottlenec
We are using 4.10.4, and it is not possible to move to a 5.x version right now.
Thanks!
> On Nov 12, 2015, at 19:47, Anton Zenkov
> wrote:
>
> Which version of Lucene are you using?
>
>
> On Thu, Nov 12, 2015 at 11:39 AM, Valentin Popov
> wrote:
>
>> H
Toke, thanks!
We will look at this solution; it looks like this is what we need.
> On Nov 12, 2015, at 20:42, Toke Eskildsen
> wrote:
>
> Valentin Popov wrote:
>
>> We have ~10 indexes for 500M documents, each document
>> has «archive date», and «to»
es the results.
The main bottleneck, as I see it, comes from the next-page search, which took ~2-4 seconds.
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>> -Original Message-
>> From: Val
= null;
if (searchResult.getCancelled()) {
return searchResult;
}
}
> On Nov 12, 2015, at 20:42, Toke Eskildsen
> wrote:
>
> Val
> larger, until it finally collects all results.
>
> So getting the results as a stream by implementing the Collector API is the
> right way to do this.
>
return true;
> }
> });
>
> Otherwise you get wrong document ids reported!!!
>
> Uwe
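
A stdlib-only sketch of why a streaming collector can report wrong document ids. In Lucene 4.x, Collector.collect(int) receives ids relative to the current segment, and setNextReader(AtomicReaderContext) supplies that segment's docBase. The snippet above is truncated, so it is an assumption that the warning refers to adding this offset. Segment and StreamingCollector below are hypothetical stand-ins that only model the bookkeeping; they are not Lucene classes.

```java
import java.util.ArrayList;
import java.util.List;

// Models per-segment doc ids: each segment numbers its documents from 0,
// and docBase is the segment's offset within the whole index.
class DocBaseSketch {
    static final class Segment {
        final int docBase;
        final int[] matchingLocalIds;

        Segment(int docBase, int... matchingLocalIds) {
            this.docBase = docBase;
            this.matchingLocalIds = matchingLocalIds;
        }
    }

    // Same shape as a streaming collector: it is told when the segment
    // changes, then receives segment-local ids one at a time.
    static final class StreamingCollector {
        private int docBase;
        final List<Integer> globalIds = new ArrayList<>();

        void setNextSegment(Segment segment) {
            docBase = segment.docBase; // remember the offset of the new segment
        }

        void collect(int localDoc) {
            globalIds.add(docBase + localDoc); // without docBase the id would be wrong
        }
    }

    static List<Integer> collectAll(List<Segment> segments) {
        StreamingCollector collector = new StreamingCollector();
        for (Segment segment : segments) {
            collector.setNextSegment(segment);
            for (int local : segment.matchingLocalIds) {
                collector.collect(local);
            }
        }
        return collector.globalIds;
    }
}
```

With two segments whose bases are 0 and 5, local id 0 in the second segment is reported as global id 5. Dropping the offset would report 0 twice.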
>> -Original Message-
>> From: Valentin Popov [mailto:valentin...@gmail.com]
>> Sent: Saturday, November 14, 2015 1:51 PM
>> To: java-user@lucene.apache.org
>> Subject: Re: 500 millions document for loop.
>>
>>
------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
Regards,
Valentin Popov
Hi all,
I have been racking my brain over a search problem. I have a field «To» that
stores domains, for example To:local.one, other.one, third.one. I have a set of
domains that are local (in this example, «local.one»), while the set of
non-local domains is unbounded. I need to search all doc
Regards,
Valentin Popov
Hello.
I need to implement a feature that answers one question: does a Document match
a Query?
Right now, I have implemented it this way:
1. Use RAMDirectory
2. Index the Document
3. Search with the Query
4. If any doc matches, the Document matches the Query.
The problem with this approach is that it is too slow
Anton, thanks.
This is exactly what I was searching for.
On Mon, Dec 17, 2018 at 19:30, Anton Zenkov wrote:
>
> https://lucene.apache.org/core/7_6_0/memory/org/apache/lucene/index/memory/MemoryIndex.html
>
> Anton
>
> On Mon, Dec 17, 2018 at 8:06 AM Valentin Popov
> wrote:
>
> > Hel
ID?
>
> If you just need to know if any of N queries match the doc, you could
> check several at once with a big OR clause.
>
> Best,
> Erick
> On Mon, Dec 17, 2018 at 5:06 AM Valentin Popov
> wrote:
> >
> > Hello.
> >
> > I need implement a feature
Hi,
I am trying to find a way to search for all docs that have the same term in
different fields. For example:
doc1 {"foo":"master", "bar":"master"}
doc2 {"foo":"test", "bar":"master"}
The result should be doc1 only.
Right now, I get all terms for "foo" and "bar", intersect them, and get all
terms that could be in both "foo" and "bar"
static approaches?
> I would index an auxiliary field which has binary values (0/1 or
> "T"/"F") representing "has equal terms in different fields"
> so that you can filter out the docs (maybe with a constant score query).
>
> Tomoko
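
A minimal sketch of how the auxiliary flag suggested above could be computed at index time, before the document is handed to the indexer. It uses only the Java standard library. The class name and the idea of storing the result in a field such as "foo_eq_bar" are hypothetical, and a single string value per field is assumed.

```java
import java.util.Map;

// Hypothetical index-time helper: derives the "T"/"F" value for an auxiliary
// field that records whether two fields of a document hold the same term.
class EqualFieldsFlagSketch {
    static String equalFlag(Map<String, String> doc, String fieldA, String fieldB) {
        String a = doc.get(fieldA);
        // A missing field never counts as equal
        return (a != null && a.equals(doc.get(fieldB))) ? "T" : "F";
    }
}
```

For the two documents in the question, doc1 would get "T" and doc2 would get "F", so a search on the auxiliary field for "T" would return doc1 only.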
>
> 2019年4月20