Hi Steve,
I'm sorry. That's also CustomAnalyzer.
public class CustomAnalyzer extends Analyzer {
    @Override
    protected Analyzer.TokenStreamComponents createComponents(final String fieldName, final Reader reader) {
        final WhitespaceTokenizer src = new WhitespaceTokenizer(getVersion(), reader);
        TokenStream tok = new ShingleFilter(src, 2, 3);
        tok = new ClassicFilter(tok);
        tok = new LowerCaseFilter(tok);
        // tok = new SynonymFilter(tok, SynonymDictionary.getSynonymMap(), true);
        return new Analyzer.TokenStreamComponents(src, tok);
    }
}

public class Test {
    public static void main(String[] args) throws Exception {
        CustomAnalyzer analyzer = new CustomAnalyzer();
        String queryStr = "cup board";
        TokenStream ts = analyzer.tokenStream("n", new StringReader(queryStr));
        ts.reset();
        System.out.println("Tokens are :");
        while (ts.incrementToken()) {
            System.out.print(ts.getAttribute(CharTermAttribute.class) + ", ");
        }
        // end and close the stream so the same analyzer can be reused by QueryParser
        ts.end();
        ts.close();
        QueryParser parser = new QueryParser("n", analyzer);
        Query query = parser.parse(queryStr);
        System.out.println("\nQuery is");
        System.out.print(query.toString());
    }
}
Output:

Tokens are :
cup, cup board, board
Query is
n:cup n:board

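For what it's worth, I suspect the cause is that the classic QueryParser splits the query text on unescaped whitespace *before* handing each piece to the analyzer, so "cup" and "board" are analyzed separately and the ShingleFilter never sees them together. Here is a toy sketch of that behavior in plain Java (no Lucene; the class and method names `ShingleDemo`/`analyze` are hypothetical stand-ins, and `analyze` only mimics a WhitespaceTokenizer plus a 2-word ShingleFilter):

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration (not Lucene code): the classic QueryParser splits the
// query text on unescaped whitespace BEFORE each piece reaches the analyzer,
// so a ShingleFilter never sees "cup" and "board" in the same stream.
public class ShingleDemo {

    // Stand-in for WhitespaceTokenizer + a 2-word shingle filter:
    // emits each word plus every adjacent two-word shingle.
    static List<String> analyze(String text) {
        String[] words = text.trim().split("\\s+");
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            tokens.add(words[i]);
            if (i + 1 < words.length) {
                tokens.add(words[i] + " " + words[i + 1]);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String queryStr = "cup board";

        // Analyzing the whole string, as tokenStream() does in the Test class:
        System.out.println(analyze(queryStr));   // [cup, cup board, board]

        // What QueryParser effectively does: split on whitespace first,
        // then analyze each fragment on its own -- no cross-word shingle.
        List<String> perFragment = new ArrayList<>();
        for (String fragment : queryStr.split("\\s+")) {
            perFragment.addAll(analyze(fragment));
        }
        System.out.println(perFragment);         // [cup, board]
    }
}
```

If that is indeed the cause, one workaround is to make the parser treat the whole text as a single chunk (for example by escaping the space, `cup\ board`, or quoting the phrase) so the analyzer receives the full string.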
On Mon, Jul 24, 2017 at 11:08 PM, Steve Rowe <[email protected]> wrote:
> Hi hariram,
>
> There may be other problems, but at a minimum you have two different
> analysis classes here. You’re printing the output stream from one
> (CustomSynonymAnalyzer, the source of which is not shown in your email),
> but constructing a query from a different one (CustomAnalyzer).
>
> --
> Steve
> www.lucidworks.com
>
> > On Jul 24, 2017, at 10:53 AM, hariram ravichandran <[email protected]> wrote:
> >
> > I'm using Lucene 4.10.4 and trying to construct (shingle) combinations of
> > tokens.
> >
> >
> > Code:
> >
> > public class CustomAnalyzer extends Analyzer {
> >     @Override
> >     protected Analyzer.TokenStreamComponents createComponents(final String fieldName, final Reader reader) {
> >         final WhitespaceTokenizer src = new WhitespaceTokenizer(getVersion(), reader);
> >         TokenStream tok = new ShingleFilter(src, 2, 3);
> >         tok = new ClassicFilter(tok);
> >         tok = new LowerCaseFilter(tok);
> >         // tok = new SynonymFilter(tok, SynonymDictionary.getSynonymMap(), true);
> >         return new Analyzer.TokenStreamComponents(src, tok);
> >     }
> > }
> >
> > public class Test {
> >     public static void main(String[] args) throws Exception {
> >         CustomSynonymAnalyzer analyzer = new CustomSynonymAnalyzer();
> >         String queryStr = "cup board";
> >         TokenStream ts = new CustomAnalyzer().tokenStream("n", new StringReader(queryStr));
> >         ts.reset();
> >         System.out.println("Tokens are :");
> >         while (ts.incrementToken()) {
> >             System.out.print(ts.getAttribute(CharTermAttribute.class) + ", ");
> >         }
> >         QueryParser parser = new QueryParser("n", analyzer);
> >         Query query = null;
> >         query = parser.parse(queryStr);
> >         System.out.println("\nQuery is");
> >         System.out.print(query.toString());
> >     }
> > }
> >
> >
> >
> >> Output:
> >> Tokens are :
> >> cup, cup board, board
> >> Query is
> >> n:cup n:board
> >>
> >
> > Tokens are printed as expected, and I expected the resulting query to be
> > *n:cup n:board n:cup board*. But the tokens formed by the shingle filter
> > are not appended to the query; I get only *n:cup n:board*. Where is my
> > mistake?
> >
> > Thanks.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>