The default regex package is java.util.regex and I can't see anywhere that you tell it to use the Jakarta regexp package. So I don't think that ".in" will match. Also, you are storing your contents field as NOT_ANALYZED so you will need to be wary of case sensitivity. Maybe this is what you want, but maybe not.
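A minimal sketch of the case-sensitivity point, using only java.util.regex rather than Lucene itself. It assumes the default RegexQuery implementation behaves like Matcher.lookingAt() (as described in Steve's quoted message), and that a NOT_ANALYZED field stores the whole phrase as a single term with its case preserved. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class CaseSensitivityDemo {

    // Assumption: stands in for the default RegexQuery matching,
    // which is described in this thread as using Matcher.lookingAt().
    static boolean matchesTerm(String regex, String term) {
        return Pattern.compile(regex).matcher(term).lookingAt();
    }

    public static void main(String[] args) {
        // With NOT_ANALYZED the term is the whole phrase, case untouched.
        String term = "Old clinic";

        System.out.println(matchesTerm(".*IN", term));     // false: "IN" is upper case
        System.out.println(matchesTerm(".*in", term));     // true
        System.out.println(matchesTerm("(?i).*IN", term)); // true: case-insensitive flag
    }
}
```

If the index is meant to be searched regardless of case, lower-casing the value at index time and the pattern at query time is one alternative to the (?i) flag.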
-- Ian.

On Mon, May 11, 2009 at 9:00 AM, Huntsman84 <tpgarci...@gmail.com> wrote:
>
> This is the code for searching:
>
>     String index = "index";
>     String field = "contents";
>     IndexReader reader = IndexReader.open(index);
>     Searcher searcher = new IndexSearcher(reader);
>
>     System.out.println("Enter query: ");
>     String line = ".IN."; // in jakarta regexp this is like * IN *
>     RegexQuery rxquery = new RegexQuery(new Term(field, line));
>     Hits hits = searcher.search(rxquery);
>
>     if (hits != null) {
>         for (int k = 0; k < 100 && k < hits.length(); k++) {
>             if (hits.doc(k) != null)
>                 System.out.println(hits.doc(k).getField("contents").stringValue());
>         }
>     }
>
> And this is the part that creates the index:
>
>     File directory = new File("index");
>     IndexWriter writer = new IndexWriter(directory, new StandardAnalyzer(),
>             true, IndexWriter.MaxFieldLength.LIMITED);
>     // returns a list of record values from the database, all of them phrases
>     List<String> records = getRecords();
>     Iterator<String> i = records.iterator();
>     while (i.hasNext()) {
>         Document doc = new Document();
>         doc.add(new Field(field, i.next(), Field.Store.YES,
>                 Field.Index.NOT_ANALYZED));
>         writer.addDocument(doc);
>     }
>     writer.optimize();
>     writer.close();
>
> This code works as I want, but it only matches the first word of the
> phrase. I think the problem is in how the index is built, but I don't
> know how to fix it...
>
> Any ideas?
>
> Thank you so much!!
>
> Steven A Rowe wrote:
>>
>> On 5/8/2009 at 9:13 AM, Ian Lee wrote:
>>> I'm surprised that it matches either - don't you need ".*in" where .*
>>> means match any character zero or more times? See the javadoc for
>>> java.util.regex.Pattern, or for Jakarta Regexp if you are using that
>>> package.
>>>
>>> Unless you're an expert in regexps it is probably worth playing with
>>> them outside your lucene code to start with e.g. with simple
>>> String.matches(regexp) calls. They can take some getting used to.
>>> And try to avoid anything with backslashes if you can!
>>
>> The java.util.regex.Pattern implementation (the default RegexQuery
>> implementation) actually uses Matcher.lookingAt(), which is equivalent to
>> prepending a "^" anchor to the beginning of the pattern, so if Huntsman84
>> is using the default implementation, then I agree with Ian: I'm surprised
>> it matches either.
>>
>> However, the Jakarta Regexp implementation uses RE.match(), which does
>> *not* require a beginning-of-string match.
>>
>> Huntsman84, are you using the Jakarta Regexp implementation? If so, then
>> like you, I'm surprised it's not matching both :).
>>
>> It would be useful to see some real code, including how you index your
>> records.
>>
>> Steve
>>
>>> On Fri, May 8, 2009 at 1:42 PM, Huntsman84 <tpgarci...@gmail.com>
>>> wrote:
>>> >
>>> > Hi,
>>> >
>>> > I am using RegexQuery for searching in a set of records which are
>>> > phrases of several words each. My aim is to find any phrase that
>>> > contains the given group of letters (e.g. "in"). For that case,
>>> > I am building the query with the regular expression ".in.", so it
>>> > should return all phrases which contain "in", but the search only
>>> > matches the first word of the phrase.
>>> >
>>> > For example, if my records are "Knowing yourself" and "Old
>>> > clinic", the correct search would return 2 matches, but it only
>>> > matches "Knowing yourself".
>>> >
>>> > How could I fix this?
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>
> --
> View this message in context:
> http://www.nabble.com/RegexQuery-Incomplete-Results-tp23445235p23478720.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
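The anchoring difference Steve describes can be tried outside Lucene, in the spirit of Ian's suggestion to experiment with plain regex calls first. This is a sketch under stated assumptions: Matcher.lookingAt() stands in for the default RegexQuery implementation, and Matcher.find() is used only as an approximation of the unanchored behaviour attributed to Jakarta Regexp's RE.match(); the Jakarta library itself is not used here. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class RegexAnchorDemo {

    // Match must begin at the start of the text (like a leading "^").
    static boolean matchesAtStart(String regex, String text) {
        return Pattern.compile(regex).matcher(text).lookingAt();
    }

    // Match may begin anywhere in the text.
    // Assumption: approximates the unanchored Jakarta RE.match() behaviour.
    static boolean matchesAnywhere(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String[] records = { "Knowing yourself", "Old clinic" };
        for (String record : records) {
            // ".in." needs "in" at positions 2-3 when anchored at the start.
            System.out.println(record
                    + " | anchored .in.: " + matchesAtStart(".in.", record)
                    + " | anywhere .in.: " + matchesAnywhere(".in.", record)
                    + " | anchored .*in: " + matchesAtStart(".*in", record));
        }
    }
}
```

With these two records, the anchored ".in." matches neither phrase, while the unanchored search and the anchored ".*in" match both, which is consistent with the surprise expressed by both Ian and Steve.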