Hi Uwe,

I am trying to implement regex search within a file, the same way editors such as Notepad++ do it.
Thanks,
Ira

-----Original Message-----
From: Uwe Schindler <[email protected]>
Sent: Tuesday, July 31, 2018 6:12 PM
To: [email protected]
Subject: RE: Search in lines, so need to index lines?

Hi,

you need to create your own tokenizer that splits tokens on \n or \r. Instead of using WhitespaceTokenizer, you can use:

Tokenizer tok = CharTokenizer.fromSeparatorCharPredicate(ch -> ch=='\r' || ch=='\n');

But I would first think about how to implement the whole thing correctly. Using a regular expression as the "default" query is slow and does not look correct. What are you trying to do?

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: [email protected]

> -----Original Message-----
> From: Gordin, Ira <[email protected]>
> Sent: Tuesday, July 31, 2018 4:08 PM
> To: [email protected]
> Subject: Search in lines, so need to index lines?
>
> Hi all,
>
> I understand that Lucene finds query matches in tokens. For example, if I
> use WhitespaceTokenizer and search with the regular expression
> /.*nice day.*/, I will never find anything. Am I correct?
> In my project I need to find matches inside lines and not inside words, so I
> am considering tokenizing by lines. How should I implement this idea?
> I would really appreciate any other ideas or implementations.
>
> Thanks in advance,
> Ira
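
To illustrate the point discussed in the thread, here is a minimal plain-Java sketch (no Lucene involved) of why a regular expression containing a space cannot match when the text is split on whitespace, but can match when it is split only on \r or \n. It assumes the regex has to match a whole token, which is how Lucene's RegexpQuery treats terms. The class and method names (LineTokenDemo, whitespaceTokens, lineTokens, matchingTokens) are made up for this example, and java.util.regex syntax is not identical to Lucene's regexp syntax, so treat it as an approximation of the behaviour and not as the actual analysis chain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class LineTokenDemo {

    // Hypothetical helper: split on runs of whitespace, roughly what a
    // whitespace tokenizer would produce.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Hypothetical helper: split only on \r or \n, so each line is one token.
    static List<String> lineTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[\\r\\n]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Keep the tokens that the regex matches in full (whole-token match).
    static List<String> matchingTokens(List<String> tokens, String regex) {
        Pattern p = Pattern.compile(regex);
        List<String> hits = new ArrayList<>();
        for (String t : tokens) {
            if (p.matcher(t).matches()) hits.add(t);
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "Have a nice day\r\nSee you tomorrow\n";
        String regex = ".*nice day.*";
        // No whitespace token contains a space, so nothing matches.
        System.out.println(matchingTokens(whitespaceTokens(text), regex));
        // With one token per line, the first line matches.
        System.out.println(matchingTokens(lineTokens(text), regex));
    }
}
```

With whitespace tokens, "nice" and "day" end up in separate tokens, so the pattern has nothing to match against; with one token per line, the whole first line is a candidate. Uwe's performance caveat still applies: a pattern with a leading .* has to be checked against many terms.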
