multiterm numbers regexp search

2014-12-15 Thread Valentin Popov
e multiterm regexp search query? Regards, Valentin Popov

Re: multiterm numbers regexp search

2014-12-15 Thread Valentin Popov
from my LG Optimus G™, an AT&T 4G LTE smartphone > > -- Original message -- > From: Valentin Popov > Date: 12/15/2014 3:46 AM > To: java-user@lucene.apache.org; > Subject:multiterm numbers regexp search > > I have a need to find mastercard numbers with regular

Re: multiterm numbers regexp search

2014-12-15 Thread Valentin Popov
't want to use StandardAnalyzer: maybe try > WhitespaceAnalyzer, but you'll need to enhance your regex a little to deal > with punctuation since WA may give you tokens like: > > 5106-7922-9469-8422. > > "5106-7922-9469-8422" > > etc > > -Mike > >
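
A minimal sketch of the regex adjustment Mike describes, assuming tokens are produced by splitting on whitespace and may carry leading or trailing punctuation. It uses `java.util.regex`, not Lucene's own RegExp syntax, and the class name `CardTokenSketch` and method `findCandidates` are hypothetical. The prefix rule (a 5 followed by 1-5) follows the pattern quoted in this thread; it is not a full card-number validation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: not part of Lucene, only an illustration.
final class CardTokenSketch {
    // Optional punctuation, then 16 digits in groups of four with optional
    // hyphens, starting with 51-55, then optional punctuation again.
    private static final Pattern CARD = Pattern.compile(
        "\\p{Punct}*(5[1-5]\\d{2}(?:-?\\d{4}){3})\\p{Punct}*");

    // Splits the text on whitespace and returns the bare number from every
    // token that looks like a candidate.
    static List<String> findCandidates(String text) {
        List<String> hits = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            Matcher m = CARD.matcher(token);
            if (m.matches()) {
                hits.add(m.group(1)); // group 1 excludes the surrounding punctuation
            }
        }
        return hits;
    }
}
```

Both sample tokens from the message (`5106-7922-9469-8422.` and `"5106-7922-9469-8422"`) match this pattern, and the captured group drops the trailing period and the quotes.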

Re: multiterm numbers regexp search

2014-12-15 Thread Valentin Popov
with the regex to capture the "phrase" without > spaces/hyphens: "5{1}<1-5>{1}<0-9>{14}" > > I can't vouch for performance with the above options... > > Whichever path you take, make sure that the MultiTermQuery.RewriteMethod > and/or maxB

Automaton -> SpanMultiTermQueryWrapper with lucene 4.10.2

2014-12-24 Thread Valentin Popov
es. Any clue? Thanks PS. Operations.subsetOf(Automata.makeString("123-456-789"), full); => true; Operations.subsetOf(Automata.makeString("111-456-789"), full); => false. TestCase for Automata works fine. Regards, Valentin Popov

500 millions document for loop.

2015-11-12 Thread Valentin Popov
Hello everyone. We have ~10 indexes holding 500M documents; each document has an «archive date» and a «to» address. One of our tasks is to calculate statistics on «to» for the last year. Right now we search with archive_date:(current_date - 1 year) and paginate the results at 50k records per page. Bottlenec

Re: 500 millions document for loop.

2015-11-12 Thread Valentin Popov
We are using 4.10.4 and it is not possible to move to 5.x right now. Thanks! > On 12 Nov 2015, at 19:47, Anton Zenkov > wrote: > > Which version of Lucene are you using? > > > On Thu, Nov 12, 2015 at 11:39 AM, Valentin Popov > wrote: > >> H

Re: 500 millions document for loop.

2015-11-12 Thread Valentin Popov
Toke, thanks! We will look at this solution; it looks like this is what we need. > On 12 Nov 2015, at 20:42, Toke Eskildsen > wrote: > > Valentin Popov wrote: > >> We have ~10 indexes for 500M documents, each document >> has «archive date», and «to»

Re: 500 millions document for loop.

2015-11-12 Thread Valentin Popov
es the results. The main bottleneck, as I see it, comes from the next-page search, which takes ~2-4 seconds. > > Uwe > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > >> -Original Message- >> From: Val

Re: 500 millions document for loop.

2015-11-12 Thread Valentin Popov
= null; if (searchResult.getCancelled()) { return searchResult; } } > On 12 Nov 2015, at 20:42, Toke Eskildsen > wrote: > > Val

Re: 500 millions document for loop.

2015-11-12 Thread Valentin Popov
>> Uwe >>> >>> - >>> Uwe Schindler >>> H.-H.-Meier-Allee 63, D-28213 Bremen >>> http://www.thetaphi.de >>> eMail: u...@thetaphi.de >>> >>>> -Original Message- >>>> From: Valentin Popov [mailto:va

Re: 500 millions document for loop.

2015-11-12 Thread Valentin Popov
> larger, until it finally collects all results. > > So just getting the results as a stream by implementing the Collector API is the > right way to do this. > >>> >>> Uwe >>> >>> - >>> Uwe Schindler >>> H.-H.-Meier-Allee 63, D-2
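
The difference between paging and streaming can be sketched without Lucene. What follows is a plain-Java stand-in, not the Lucene Collector API: the `HitConsumer` interface, the `ToStatsSketch` class, and the in-memory array of «to» values are all hypothetical. They stand in for a collector that is handed each matching document exactly once, so no search has to be re-run to fetch the next page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a streaming collector; no Lucene types are used.
final class ToStatsSketch {
    // Callback invoked once per matching document id.
    interface HitConsumer {
        void accept(int docId);
    }

    // Simulates a single pass over all hits; a real collector would be
    // driven by the searcher instead of this loop.
    static void forEachHit(int docCount, HitConsumer consumer) {
        for (int docId = 0; docId < docCount; docId++) {
            consumer.accept(docId);
        }
    }

    // Aggregates «to» counts while streaming, holding only the counters
    // in memory rather than a page of results.
    static Map<String, Integer> countTo(String[] toByDoc) {
        Map<String, Integer> counts = new HashMap<>();
        forEachHit(toByDoc.length,
            docId -> counts.merge(toByDoc[docId], 1, Integer::sum));
        return counts;
    }
}
```

In this sketch the work per hit is constant and the total cost is one pass, whereas deep paging repeats part of the collection work for every page.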

Re: 500 millions document for loop.

2015-11-14 Thread Valentin Popov
> So just getting the results as a stream by implementing the Collector API is the > right way to do this. > >>> >>> Uwe >>> >>> - >>> Uwe Schindler >>> H.-H.-Meier-Allee 63, D-28213 Bremen >>> http://www.thetaphi.de >>

Re: 500 millions document for loop.

2015-11-14 Thread Valentin Popov
return true; > } > }); > > Otherwise you get wrong document ids reported!!! > > Uwe > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > >> --

Re: 500 millions document for loop.

2015-11-14 Thread Valentin Popov
thetaphi.de > eMail: u...@thetaphi.de > >> -Original Message- >> From: Valentin Popov [mailto:valentin...@gmail.com] >> Sent: Saturday, November 14, 2015 1:51 PM >> To: java-user@lucene.apache.org >> Subject: Re: 500 millions document for loop. >> >>

Re: 500 millions document for loop.

2015-11-14 Thread Valentin Popov
> > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > >> -Original Message- >> From: Valentin Popov [mailto:valentin...@gmail.com] >> Sent: Saturday, November 14, 2015 1:51 PM >> To: java-u

Re: 500 millions document for loop.

2016-04-21 Thread Valentin Popov
------ > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org Regards, Valentin Popov - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org

Re: 500 millions document for loop.

2016-04-26 Thread Valentin Popov
H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > >> -Original Message- >> From: Valentin Popov [mailto:valentin...@gmail.com] >> Sent: Saturday, November 14, 2015 1:51 PM >> To: java-user@lucene.apache.org >> Su

complex disjoint search query

2016-10-12 Thread Valentin Popov
Hi all, I have been racking my brain over one search problem. I have a field «To» that stores domains, for example To:local.one, other.one, third.one. I have a set of domains that are local (in this example, «local.one»), and the set of non-local domains is infinite. I need to search all doc

Re: complex disjoint search query

2016-10-13 Thread Valentin Popov
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > Regards, Valentin Popov - To unsu

is Document match Query

2018-12-17 Thread Valentin Popov
Hello. I need to implement a feature that answers the question: does a Document match a Query? Right now I have implemented it this way: 1. Use RAMDirectory 2. Index the Document 3. Search using the Query 4. If any doc matches, the Document matches the Query. The problem with this approach is that it is too slow

Re: is Document match Query

2018-12-17 Thread Valentin Popov
Anton, thanks. This is exactly what I was looking for. Mon, 17 Dec 2018 at 19:30, Anton Zenkov : > > https://lucene.apache.org/core/7_6_0/memory/org/apache/lucene/index/memory/MemoryIndex.html > > Anton > > On Mon, Dec 17, 2018 at 8:06 AM Valentin Popov > wrote: > > > Hel

Re: is Document match Query

2018-12-17 Thread Valentin Popov
ID? > > If you just need to know if any of N queries match the doc, you could > check several at once with a big OR clause. > > Best, > Erick > On Mon, Dec 17, 2018 at 5:06 AM Valentin Popov > wrote: > > > > Hello. > > > > I need implement a feature

fields contains equals term docs search

2019-04-19 Thread Valentin Popov
Hi, I am trying to find a way to search for all docs that have the same term in different fields. For example: doc1 {"foo":"master", "bar":"master"} doc2 {"foo":"test", "bar":"master"} The result should be doc1 only. Right now, I get all terms for "foo" and "bar", intersect them, and get all terms that could be in both "foo", "bar"

Re: fields contains equals term docs search

2019-04-21 Thread Valentin Popov
static approaches? > I would index an auxiliary field which has binary values (0/1 or > "T"/"F") representing "has equals term on different fields" > so that you can filter out the docs (maybe by constant score query). > > Tomoko > > 2019-04-20
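
Tomoko's auxiliary-field idea could be computed at index time along these lines. This is a sketch under assumptions, not Lucene code: the class `EqualTermFlagSketch` and method `sharedTermFlag` are hypothetical, and terms are assumed to be lowercased, whitespace-separated words, which may differ from what the real analyzer produces.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical index-time helper; the returned value would be stored in an
// extra field and filtered on at query time.
final class EqualTermFlagSketch {
    // Returns "T" when the two field values share at least one term, else "F".
    static String sharedTermFlag(String fooValue, String barValue) {
        Set<String> fooTerms = new HashSet<>(
            Arrays.asList(fooValue.toLowerCase(Locale.ROOT).trim().split("\\s+")));
        for (String term : barValue.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            // Skip the empty string that split() yields for blank input.
            if (!term.isEmpty() && fooTerms.contains(term)) {
                return "T";
            }
        }
        return "F";
    }
}
```

With the two documents from the original question, doc1 ("master" / "master") would get "T" and doc2 ("test" / "master") would get "F", so a filter on the flag field would return doc1 only. The trade-off is that the flag is fixed at index time and must be recomputed when either field changes.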