Hi Steve,
I'm sorry. That's also CustomAnalyzer.
public class CustomAnalyzer extends Analyzer {
    @Override
    protected Analyzer.TokenStreamComponents createComponents(final String fieldName, final Reader reader) {
        final WhitespaceTokenizer src = new WhitespaceTokenizer(getVersion(), reader);
        TokenStream tok = new ShingleFilter(src, 2, 3);
        tok = new ClassicFilter(tok);
        tok = new LowerCaseFilter(tok);
        // tok = new SynonymFilter(tok, SynonymDictionary.getSynonymMap(), true);
        return new Analyzer.TokenStreamComponents(src, tok);
    }
}

public class Test {
    public static void main(String[] args) throws Exception {
        CustomAnalyzer analyzer = new CustomAnalyzer();
        String queryStr = "cup board";
        TokenStream ts = analyzer.tokenStream("n", new StringReader(queryStr));
        ts.reset();
        System.out.println("Tokens are :");
        while (ts.incrementToken()) {
            System.out.print(ts.getAttribute(CharTermAttribute.class) + ", ");
        }
        // end and close the stream so the same analyzer can be reused by QueryParser
        ts.end();
        ts.close();
        QueryParser parser = new QueryParser("n", analyzer);
        Query query = parser.parse(queryStr);
        System.out.println("\nQuery is");
        System.out.print(query.toString());
    }
}
Output:

Tokens are :
cup, cup board, board
Query is
n:cup n:board

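For what it's worth, I suspect the cause is that the classic QueryParser splits the query text on unescaped whitespace *before* handing each piece to the analyzer, so "cup" and "board" are analyzed separately and the ShingleFilter never sees them together. Here is a toy sketch of that behavior in plain Java (no Lucene; the class and method names `ShingleDemo`/`analyze` are hypothetical stand-ins, and `analyze` only mimics a WhitespaceTokenizer plus a 2-word ShingleFilter):

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration (not Lucene code): the classic QueryParser splits the
// query text on unescaped whitespace BEFORE each piece reaches the analyzer,
// so a ShingleFilter never sees "cup" and "board" in the same stream.
public class ShingleDemo {

    // Stand-in for WhitespaceTokenizer + a 2-word shingle filter:
    // emits each word plus every adjacent two-word shingle.
    static List<String> analyze(String text) {
        String[] words = text.trim().split("\\s+");
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            tokens.add(words[i]);
            if (i + 1 < words.length) {
                tokens.add(words[i] + " " + words[i + 1]);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String queryStr = "cup board";

        // Analyzing the whole string, as tokenStream() does in the Test class:
        System.out.println(analyze(queryStr));   // [cup, cup board, board]

        // What QueryParser effectively does: split on whitespace first,
        // then analyze each fragment on its own -- no cross-word shingle.
        List<String> perFragment = new ArrayList<>();
        for (String fragment : queryStr.split("\\s+")) {
            perFragment.addAll(analyze(fragment));
        }
        System.out.println(perFragment);         // [cup, board]
    }
}
```

If that is indeed the cause, one workaround is to make the parser treat the whole text as a single chunk (for example by escaping the space, `cup\ board`, or quoting the phrase) so the analyzer receives the full string.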
On Mon, Jul 24, 2017 at 11:08 PM, Steve Rowe <[email protected]> wrote:
> Hi hariram,
>
> There may be other problems, but at a minimum you have two different
> analysis classes here. You’re printing the output stream from one
> (CustomSynonymAnalyzer, the source of which is not shown in your email),
> but constructing a query from a different one (CustomAnalyzer).
>
> --
> Steve
> www.lucidworks.com
>
> > On Jul 24, 2017, at 10:53 AM, hariram ravichandran <[email protected]> wrote:
> >
> > I'm using Lucene 4.10.4 and trying to construct (shingle) combinations of
> > tokens.
> >
> >
> > Code:
> >
> > public class CustomAnalyzer extends Analyzer {
> >     @Override
> >     protected Analyzer.TokenStreamComponents createComponents(final String fieldName, final Reader reader) {
> >         final WhitespaceTokenizer src = new WhitespaceTokenizer(getVersion(), reader);
> >         TokenStream tok = new ShingleFilter(src, 2, 3);
> >         tok = new ClassicFilter(tok);
> >         tok = new LowerCaseFilter(tok);
> >         // tok = new SynonymFilter(tok, SynonymDictionary.getSynonymMap(), true);
> >         return new Analyzer.TokenStreamComponents(src, tok);
> >     }
> > }
> >
> > public class Test {
> >     public static void main(String[] args) throws Exception {
> >         CustomSynonymAnalyzer analyzer = new CustomSynonymAnalyzer();
> >         String queryStr = "cup board";
> >         TokenStream ts = new CustomAnalyzer().tokenStream("n", new StringReader(queryStr));
> >         ts.reset();
> >         System.out.println("Tokens are :");
> >         while (ts.incrementToken()) {
> >             System.out.print(ts.getAttribute(CharTermAttribute.class) + ", ");
> >         }
> >         QueryParser parser = new QueryParser("n", analyzer);
> >         Query query = null;
> >         query = parser.parse(queryStr);
> >         System.out.println("\nQuery is");
> >         System.out.print(query.toString());
> >     }
> > }
> >
> >
> >
> >> Output:
> >> Tokens are :
> >> cup, cup board, board
> >> Query is
> >> n:cup n:board
> >>
> >
> > Tokens are printed as expected, and I expected the resulting query to be
> > *n:cup n:board n:cup board*. But the tokens formed by the shingle filter
> > are not appended to the query; I get only *n:cup n:board*. Where is my
> > mistake?
> >
> > Thanks.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>