
[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-04-09 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488015#comment-14488015
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1672458 from [~steve_rowe] in branch 'dev/trunk'
[ https://svn.apache.org/r1672458 ]

LUCENE-6339: Maven config: add resource dir src/resources/ to the POM.

 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: Trunk, 5.1

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
 LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch


 The idea is to index documents with one or more *SuggestField*(s) and be able 
 to suggest documents with a *SuggestField* value that matches a given key.
 A SuggestField can be assigned a numeric weight to be used to score the 
 suggestion at query time.
 Document suggestion can be done on an indexed *SuggestField*. The document 
 suggester can filter out deleted documents in near real-time. The suggester 
 can filter out documents based on a Filter (note: may change to a non-scoring 
 query?) at query time.
 A custom postings format (CompletionPostingsFormat) is used to index 
 SuggestField(s) and perform document suggestions.
 h4. Usage
 {code:java}
   // hook up custom postings format
   // indexAnalyzer for SuggestField
   Analyzer analyzer = ...
   IndexWriterConfig config = new IndexWriterConfig(analyzer);
   Codec codec = new Lucene50Codec() {
     PostingsFormat completionPostingsFormat = new Completion50PostingsFormat();

     @Override
     public PostingsFormat getPostingsFormatForField(String field) {
       if (isSuggestField(field)) {
         return completionPostingsFormat;
       }
       return super.getPostingsFormatForField(field);
     }
   };
   config.setCodec(codec);
   IndexWriter writer = new IndexWriter(dir, config);
   // index some documents with suggestions
   Document doc = new Document();
   doc.add(new SuggestField("suggest_title", "title1", 2));
   doc.add(new SuggestField("suggest_name", "name1", 3));
   writer.addDocument(doc);
   ...
   // open an nrt reader for the directory
   DirectoryReader reader = DirectoryReader.open(writer, false);
   // SuggestIndexSearcher is a thin wrapper over IndexSearcher
   // queryAnalyzer will be used to analyze the query string
   SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader, queryAnalyzer);

   // suggest 10 documents for "titl" on suggest_title field
   TopSuggestDocs suggest = indexSearcher.suggest("suggest_title", "titl", 10);
 {code}
 h4. Indexing
 The index analyzer is set through *IndexWriterConfig*.
 {code:java}
 SuggestField(String name, String value, long weight) 
 {code}
 h4. Query
 The query analyzer is set through *SuggestIndexSearcher*.
 Hits are collected in descending order of the suggestion's weight.
 {code:java}
 // full options for TopSuggestDocs (TopDocs)
 TopSuggestDocs suggest(String field, CharSequence key, int num, Filter filter)
 // full options for Collector
 // note: only collects, does not score
 void suggest(String field, CharSequence key, int num, Filter filter, TopSuggestDocsCollector collector)
 {code}
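
To make the collection order concrete, here is a minimal in-memory sketch of the behaviour described above: prefix matching on a suggest value, a query-time filter, and hits returned in descending weight order. It is not the Lucene implementation (which uses a custom postings format); the class `ToySuggester`, its `Entry` type, and the use of an `IntPredicate` in place of a Filter are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.IntPredicate;

/** Toy in-memory model of weighted prefix suggestion; hypothetical, not Lucene code. */
public class ToySuggester {

    /** One indexed suggestion: document id, suggest value, and weight. */
    static final class Entry {
        final int docId;
        final String value;
        final long weight;

        Entry(int docId, String value, long weight) {
            this.docId = docId;
            this.value = value;
            this.weight = weight;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Registers a suggestion value with its weight for the given document. */
    public void add(int docId, String value, long weight) {
        entries.add(new Entry(docId, value, weight));
    }

    /**
     * Returns up to num document ids whose value starts with key, highest weight first.
     * The accept predicate stands in for the query-time filter and for skipping deleted documents.
     */
    public List<Integer> suggest(String key, int num, IntPredicate accept) {
        // Min-heap on weight, so the weakest retained hit is evicted first.
        PriorityQueue<Entry> heap =
            new PriorityQueue<>(Comparator.comparingLong((Entry e) -> e.weight));
        for (Entry e : entries) {
            if (e.value.startsWith(key) && accept.test(e.docId)) {
                heap.offer(e);
                if (heap.size() > num) {
                    heap.poll();
                }
            }
        }
        // The heap drains in ascending weight; prepend to produce descending order.
        List<Integer> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            result.add(0, heap.poll().docId);
        }
        return result;
    }
}
```

The bounded heap keeps at most num candidates in memory at a time, which is one common way to collect a top-N by weight; the order among hits with equal weight is left unspecified in this sketch.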
 h4. Analyzer
 *CompletionAnalyzer* can be used instead to wrap another analyzer and tune 
 suggest-field-only parameters.
 {code:java}
 CompletionAnalyzer(Analyzer analyzer, boolean preserveSep, boolean preservePositionIncrements, int maxGraphExpansions)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-04-09 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488027#comment-14488027
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1672461 from [~steve_rowe] in branch 'dev/branches/lucene_solr_5_1'
[ https://svn.apache.org/r1672461 ]

LUCENE-6339: Maven config: add resource dir src/resources/ to the POM. (merged 
trunk r1672458)


[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-04-09 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488021#comment-14488021
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1672459 from [~steve_rowe] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1672459 ]

LUCENE-6339: Maven config: add resource dir src/resources/ to the POM. (merged 
trunk r1672458)


[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-04-07 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483734#comment-14483734
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1671914 from [~areek] in branch 'dev/trunk'
[ https://svn.apache.org/r1671914 ]

LUCENE-6339: fix test (take into account inadmissible filtered search for 
multiple segments)


[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-04-07 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483736#comment-14483736
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1671916 from [~areek] in branch 'dev/branches/lucene_solr_5_1'
[ https://svn.apache.org/r1671916 ]

LUCENE-6339: fix test (take into account inadmissible filtered search for 
multiple segments)


[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-04-07 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483735#comment-14483735
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1671915 from [~areek] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1671915 ]

LUCENE-6339: fix test (take into account inadmissible filtered search for 
multiple segments)


[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-04-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395194#comment-14395194
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1671196 from [~areek] in branch 'dev/trunk'
[ https://svn.apache.org/r1671196 ]

LUCENE-6339: fix test (ensure the maximum requested size is bounded to 1000)


[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-04-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395174#comment-14395174
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1671187 from [~areek] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1671187 ]

LUCENE-6339: fix test (ensure the maximum requested size is bounded to 1000)


[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-04-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395176#comment-14395176
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1671189 from [~areek] in branch 'dev/branches/lucene_solr_5_1'
[ https://svn.apache.org/r1671189 ]

LUCENE-6339: fix test (ensure the maximum requested size is bounded to 1000)

 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: Trunk, 5.1

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
 LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch



[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-04-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393425#comment-14393425
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1670969 from [~areek] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1670969 ]

LUCENE-6339: fix test bug (ensure opening nrt reader with applyAllDeletes)

 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: Trunk, 5.x

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
 LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch



[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-04-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393474#comment-14393474
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1670978 from [~areek] in branch 'dev/branches/lucene_solr_5_1'
[ https://svn.apache.org/r1670978 ]

LUCENE-6339: fix test bug (ensure opening nrt reader with applyAllDeletes)

 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: Trunk, 5.x

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
 LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch



[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-04-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393438#comment-14393438
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1670972 from [~areek] in branch 'dev/trunk'
[ https://svn.apache.org/r1670972 ]

LUCENE-6339: fix test bug (ensure opening nrt reader with applyAllDeletes)

 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: Trunk, 5.x

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
 LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch



[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-03-27 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384405#comment-14384405
 ] 

Michael McCandless commented on LUCENE-6339:


I think the tie break should be a.doc < b.doc, for consistency with Lucene?

I.e., on a score tie, the smaller doc ID should sort earlier than the bigger 
doc ID?

Otherwise +1 to commit!  Thanks [~areek]!
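
The tie-break proposed above can be sketched as a comparator. This is only an illustration under stated assumptions: `Hit`, `ORDER`, and `sortedDocs` are hypothetical names, and `Hit` is a stand-in for a scored hit, not the class used in the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Comparator;

public class SuggestTieBreak {
    // Hypothetical stand-in for a scored suggestion hit (not a Lucene class).
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Higher score sorts first; on a score tie, the smaller doc ID sorts first.
    static final Comparator<Hit> ORDER = (a, b) -> {
        int byScore = Float.compare(b.score, a.score); // descending score
        return byScore != 0 ? byScore : Integer.compare(a.doc, b.doc); // ascending doc ID
    };

    // Returns the doc IDs of the given hits in the order defined by ORDER.
    static List<Integer> sortedDocs(List<Hit> hits) {
        List<Hit> copy = new ArrayList<>(hits);
        copy.sort(ORDER);
        List<Integer> docs = new ArrayList<>();
        for (Hit h : copy) {
            docs.add(h.doc);
        }
        return docs;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(new Hit(7, 3f), new Hit(2, 3f), new Hit(5, 9f));
        System.out.println(sortedDocs(hits)); // prints [5, 2, 7]
    }
}
```

Docs 2 and 7 share a score, so doc 2 comes first; doc 5 leads because its score is higher.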

 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: 5.0

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
 LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch



[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-03-27 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384583#comment-14384583
 ] 

Uwe Schindler commented on LUCENE-6339:
---

I just reviewed the patch, too. I like the API, but have not yet looked into it 
closely like Mike - I just skimmed it.

Just one question: What happens if two documents have the same SuggestField 
value, so that the same suggestion is presented to the user? This would now 
produce duplicates, right? I was just thinking about how to prevent that 
(coming from the Elasticsearch world).

 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: 5.0

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
 LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch



[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-03-27 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384604#comment-14384604
 ] 

Uwe Schindler commented on LUCENE-6339:
---

+1

 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: 5.0

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
 LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch



[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-03-27 Thread Areek Zillur (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384631#comment-14384631
 ] 

Areek Zillur commented on LUCENE-6339:
--

Hi [~thetaphi],
Thanks for the review!
If two documents do have the same suggestion for the same SuggestField, it will 
produce duplicates in terms of the suggestion, but because they are from two 
documents (different doc ids) they are not considered as duplicates.
Maybe we can add a boolean flag in the NRTSuggester to only collect unique 
suggestions, but then we will have to decide on which suggestion to throw out, 
as they are now tied to doc ids?
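
One possible policy for such a flag, sketched here purely as an assumption and not as anything in the patch: keep the first hit seen for each suggestion text. If hits arrive in collection order (descending weight), that retains the highest-weighted document per suggestion. `Hit` and `unique` are hypothetical names, not part of NRTSuggester.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UniqueSuggestions {
    // Hypothetical stand-in for a collected suggestion (not a Lucene class).
    static final class Hit {
        final String key;   // suggestion text shown to the user
        final int doc;      // document the suggestion came from
        final long weight;  // weight assigned at index time

        Hit(String key, int doc, long weight) {
            this.key = key;
            this.doc = doc;
            this.weight = weight;
        }
    }

    // Assumes hits are already in collection order (descending weight).
    // Keeps the first hit per suggestion text and drops later ones.
    static List<Hit> unique(List<Hit> orderedHits) {
        Map<String, Hit> firstPerKey = new LinkedHashMap<>();
        for (Hit h : orderedHits) {
            firstPerKey.putIfAbsent(h.key, h);
        }
        return new ArrayList<>(firstPerKey.values());
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
            new Hit("title one", 4, 5),
            new Hit("title one", 9, 5),
            new Hit("title two", 1, 3));
        for (Hit h : unique(hits)) {
            System.out.println(h.key + " -> doc " + h.doc);
        }
    }
}
```

The open question from the comment remains: this drops doc 9 for "title one" only because doc 4 was collected first, which may or may not be the right document to keep.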


 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: 5.0

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
 LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch



[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-03-27 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384823#comment-14384823
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1669698 from [~areek] in branch 'dev/trunk'
[ https://svn.apache.org/r1669698 ]

LUCENE-6339: Added Near-real time Document Suggester via custom postings format

 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: 5.0

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
 LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch



[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-03-27 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384807#comment-14384807
 ] 

Uwe Schindler commented on LUCENE-6339:
---

bq. If two documents do have the same suggestion for the same SuggestField, it 
will produce duplicates in terms of the suggestion, but because they are from 
two documents (different doc ids) they are not considered as duplicates.

Yeah, that's what I mean by duplicate. The suggester only returns doc ids. For 
display to the user, you would read a stored field (the actual suggestion), and 
this produces the duplicate. I am not sure how to solve that. It was just an idea. 
If this is really an issue, one could filter the duplicates afterwards.
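
Filtering duplicates after the fact could look roughly like the sketch below. It is plain Java with no Lucene dependency and is not part of the patch: the `Hit` class and `dedupeByText` are hypothetical names, and the sketch assumes the display text for each doc id has already been loaded (for example from a stored field).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: not part of the LUCENE-6339 patch.
final class SuggestionDedupe {

  /** A suggestion hit: the doc id plus the display text read for that doc. */
  static final class Hit {
    final int docId;
    final String text;

    Hit(int docId, String text) {
      this.docId = docId;
      this.text = text;
    }
  }

  /**
   * Keeps the first hit for each distinct text, preserving input order.
   * If the input is sorted by descending weight, the best-weighted hit survives.
   */
  static List<Hit> dedupeByText(List<Hit> hits) {
    Set<String> seen = new HashSet<>();
    List<Hit> unique = new ArrayList<>();
    for (Hit hit : hits) {
      if (seen.add(hit.text)) { // add() returns false if the text was already seen
        unique.add(hit);
      }
    }
    return unique;
  }
}
```

Because filtering happens after collection, the result can contain fewer entries than were requested; a caller who needs exactly N distinct suggestions would have to request more than N hits up front.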

 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: 5.0

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
 LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch



[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-03-27 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384857#comment-14384857
 ] 

Uwe Schindler commented on LUCENE-6339:
---

Indeed, the suggestion does not need to come from a stored field of the result 
document, nice! But one could use a stored field to attach additional suggestion 
information instead of the payload, right?

 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: 5.x

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
 LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch



[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-03-27 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384859#comment-14384859
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1669703 from [~areek] in branch 'dev/trunk'
[ https://svn.apache.org/r1669703 ]

LUCENE-6339: move changes entry from 6.0.0 to 5.1.0

 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: 5.x

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
 LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch



[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-03-27 Thread Areek Zillur (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384865#comment-14384865
 ] 

Areek Zillur commented on LUCENE-6339:
--

Yes [~thetaphi], that is the idea :). The payload option has been removed 
entirely. Instead of using payloads, one can now grab any associated values 
from the document with each suggestion.

 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: 5.x

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
 LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch



[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-03-27 Thread Areek Zillur (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384976#comment-14384976
 ] 

Areek Zillur commented on LUCENE-6339:
--

Committed to branch_5x with revision r1669715 (missed out on prepending the 
commit message with jira #)

 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: Trunk, 5.x

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
 LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch



[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-03-26 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382992#comment-14382992
 ] 

Michael McCandless commented on LUCENE-6339:


Patch looks great!

Can we pull out SuggestScoreDocPQ into its own .java source?  Should its 
lessThan method tie break by docID?
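
One way to read the tie-break question is sketched below in plain Java. It assumes the convention used for ordinary top-docs collection, where the lower docID wins when scores are equal; `ScoredDoc` and this `lessThan` are illustrative stand-ins, not the patch's SuggestScoreDocPQ.

```java
// Hypothetical sketch: not the SuggestScoreDocPQ from the patch.
final class SuggestTieBreak {

  /** Minimal stand-in for a scored document. */
  static final class ScoredDoc {
    final int doc;
    final float score;

    ScoredDoc(int doc, float score) {
      this.doc = doc;
      this.score = score;
    }
  }

  /**
   * Returns true if a is less competitive than b.
   * Lower score loses; on equal scores the higher docID loses,
   * so the ordering is deterministic across runs.
   */
  static boolean lessThan(ScoredDoc a, ScoredDoc b) {
    if (a.score != b.score) {
      return a.score < b.score;
    }
    return a.doc > b.doc;
  }
}
```

Without the second comparison, two suggestions with the same weight could come back in either order depending on insertion order into the queue.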

I think the logic to compute maxQueueSize in getMaxTopNSearcherQueueSize could 
possibly overflow int?  Maybe use long, and then cast back to int after the 
Math.min?
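
The overflow concern and the suggested fix can be illustrated with a small standalone sketch. The formula here (topN times a multiplier, capped at a maximum) is an assumption made for illustration; the actual computation in getMaxTopNSearcherQueueSize may differ, and the method names below are hypothetical.

```java
// Hypothetical sketch: the real queue-size formula in the patch may differ.
final class QueueSizeOverflow {

  /** Buggy variant: the int multiplication can wrap around before the cap is applied. */
  static int cappedUnsafe(int topN, int multiplier, int cap) {
    return Math.min(cap, topN * multiplier);
  }

  /**
   * Widens to long before multiplying, applies the cap, then narrows.
   * Assuming non-negative inputs, the cast is safe because the result is at most cap.
   */
  static int cappedSafe(int topN, int multiplier, int cap) {
    long size = (long) topN * multiplier;
    return (int) Math.min((long) cap, size);
  }
}
```

With topN = 50000 and multiplier = 50000 the int product wraps to a negative number, so the unsafe variant returns a negative queue size, while the long-based variant returns the cap.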


 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: 5.0

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
 LUCENE-6339.patch, LUCENE-6339.patch



[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-03-24 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377513#comment-14377513
 ] 

Michael McCandless commented on LUCENE-6339:


New patch looks great, thanks [~areek]!

In TopSuggestDocsCollector:

  - In collect, we seem to assume the suggest searcher will never call
collect more than num times?  How is that?  If so, can you add that to
the javadocs, and maybe add an assert upto  num in collect?

  - Can we just allocate scoreDocs up front instead of lazily?

  - In the javadocs, instead of "one hit can be..." maybe "one doc can
be..."?  "Hit" is a tricky word in this context since it could be a doc
or a suggestion...

In SuggestIndexSearcher, does it really ever make sense to take a
generic Collector/LeafCollector?  Can we instead just strongly type
the params to all the methods to be TopSuggestDocsCollector?

"In case a filter has to be applied, the queue size is doubled" is not
quite correct?  Maybe change the logic there so the int queueSize is
first computed, and then if filter is enabled, it's doubled?

Can we remove the separate WeightProcessor class and just make
encode/decode static methods on NRTSuggester?  We can add back
abstractions later if users somehow need control over weight
encoding...

Can we add a test that tests the extreme case of nearly all docs
filtered out and another test with nearly all docs deleted?


 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: 5.0

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch


 The idea is to index documents with one or more *SuggestField*(s) and be able 
 to suggest documents with a *SuggestField* value that matches a given key.
 A SuggestField can be assigned a numeric weight to be used to score the 
 suggestion at query time.
 Document suggestion can be done on an indexed *SuggestField*. The document 
 suggester can filter out deleted documents in near real-time. The suggester 
 can filter out documents based on a Filter (note: may change to a non-scoring 
 query?) at query time.
 A custom postings format (CompletionPostingsFormat) is used to index 
 SuggestField(s) and perform document suggestions.
 h4. Usage
 {code:java}
   // hook up custom postings format
   // indexAnalyzer for SuggestField
   Analyzer analyzer = ...
   IndexWriterConfig config = new IndexWriterConfig(analyzer);
   Codec codec = new Lucene50Codec() {
     PostingsFormat completionPostingsFormat = new Completion50PostingsFormat();
     @Override
     public PostingsFormat getPostingsFormatForField(String field) {
       if (isSuggestField(field)) {
         return completionPostingsFormat;
       }
       return super.getPostingsFormatForField(field);
     }
   };
   config.setCodec(codec);
   IndexWriter writer = new IndexWriter(dir, config);
   // index some documents with suggestions
   Document doc = new Document();
   doc.add(new SuggestField("suggest_title", "title1", 2));
   doc.add(new SuggestField("suggest_name", "name1", 3));
   writer.addDocument(doc);
   ...
   // open an nrt reader for the directory
   DirectoryReader reader = DirectoryReader.open(writer, false);
   // SuggestIndexSearcher is a thin wrapper over IndexSearcher
   // queryAnalyzer will be used to analyze the query string
   SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader, queryAnalyzer);
   
   // suggest 10 documents for "titl" on the suggest_title field
   TopSuggestDocs suggest = indexSearcher.suggest("suggest_title", "titl", 10);
 {code}
 h4. Indexing
 Index analyzer set through *IndexWriterConfig*
 {code:java}
 SuggestField(String name, String value, long weight) 
 {code}
 h4. Query
 Query analyzer set through *SuggestIndexSearcher*.
 Hits are collected in descending order of the suggestion's weight 
 {code:java}
 // full options for TopSuggestDocs (TopDocs)
 TopSuggestDocs suggest(String field, CharSequence key, int num, Filter filter)
 // full options for Collector
 // note: only collects, does not score
 void suggest(String field, CharSequence key, int maxNumPerLeaf, Filter filter, Collector collector)
 {code}
 h4. Analyzer
 *CompletionAnalyzer* can be used instead to wrap another analyzer and tune 
 parameters that apply only to suggest fields. 
 {code:java}
 CompletionAnalyzer(Analyzer analyzer, boolean preserveSep, boolean preservePositionIncrements, int maxGraphExpansions)
 {code}




[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-03-06 Thread Areek Zillur (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14351191#comment-14351191
 ] 

Areek Zillur commented on LUCENE-6339:
--

{quote}
you fetch the checksum for the dict file in {{CompletionFieldsProducer#ctor}} 
via {{CodecUtil.retrieveChecksum(dictIn);}} but you ignore its return value, 
was this intended? I think you don't wanna do that here? Did you intend to 
check the entire file?
I wonder if we should just write one file for both, the index and the FSTs? 
What's the benefit from having two?
{quote}
This was intentional; I used the same convention as 
{{BlockTreeTermsReader#termsIn}} here. The thought was that checksumming the 
entire file would be very costly, since in most cases the {{dict}} file would 
be large.
If we write one file instead of two, wouldn't the checksum check for the index 
become more expensive than it is now?
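
To make the cost argument concrete, here is a minimal sketch of the difference between reading a stored checksum and verifying a whole file. It uses a simplified, hypothetical layout (payload followed by an 8-byte CRC32 footer), not Lucene's actual footer format, and the class and method names are invented for illustration. Reading the footer touches a fixed number of bytes, while full verification has to scan every payload byte.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Simplified illustration only; this is not Lucene's real footer layout. */
final class ChecksumFooterSketch {

  /** Appends an 8-byte CRC32 footer computed over the payload. */
  static byte[] writeWithFooter(byte[] payload) {
    CRC32 crc = new CRC32();
    crc.update(payload, 0, payload.length);
    return ByteBuffer.allocate(payload.length + Long.BYTES)
        .put(payload)
        .putLong(crc.getValue())
        .array();
  }

  /** Cheap: reads only the stored footer value, 8 bytes regardless of file size. */
  static long retrieveStoredChecksum(byte[] file) {
    return ByteBuffer.wrap(file, file.length - Long.BYTES, Long.BYTES).getLong();
  }

  /** Expensive: recomputes the CRC over every payload byte and compares it to the footer. */
  static boolean verifyEntireFile(byte[] file) {
    CRC32 crc = new CRC32();
    crc.update(file, 0, file.length - Long.BYTES);
    return crc.getValue() == retrieveStoredChecksum(file);
  }
}
```

In this sketch, only the full scan can notice a flipped payload byte; reading the footer alone returns the stored value whether or not the payload is intact.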

 [suggest] Near real time Document Suggester
 ---

 Key: LUCENE-6339
 URL: https://issues.apache.org/jira/browse/LUCENE-6339
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
 Fix For: 5.0

 Attachments: LUCENE-6339.patch, LUCENE-6339.patch


 The idea is to index documents with one or more *SuggestField*(s) and be able 
 to suggest documents with a *SuggestField* value that matches a given key.
 A SuggestField can be assigned a numeric weight to be used to score the 
 suggestion at query time.
 Document suggestion can be done on an indexed *SuggestField*. The document 
 suggester can filter out deleted documents in near real-time. The suggester 
 can filter out documents based on a Filter (note: may change to a non-scoring 
 query?) at query time.
 A custom postings format (CompletionPostingsFormat) is used to index 
 SuggestField(s) and perform document suggestions.
 h4. Usage
 {code:java}
   // hook up custom postings format
   // indexAnalyzer for SuggestField
   Analyzer analyzer = ...
   IndexWriterConfig config = new IndexWriterConfig(analyzer);
   Codec codec = new Lucene50Codec() {
     @Override
     public PostingsFormat getPostingsFormatForField(String field) {
       if (isSuggestField(field)) {
         return new CompletionPostingsFormat(super.getPostingsFormatForField(field));
       }
       return super.getPostingsFormatForField(field);
     }
   };
   config.setCodec(codec);
   IndexWriter writer = new IndexWriter(dir, config);
   // index some documents with suggestions
   Document doc = new Document();
   doc.add(new SuggestField("suggest_title", "title1", 2));
   doc.add(new SuggestField("suggest_name", "name1", 3));
   writer.addDocument(doc);
   ...
   // open an nrt reader for the directory
   DirectoryReader reader = DirectoryReader.open(writer, false);
   // SuggestIndexSearcher is a thin wrapper over IndexSearcher
   // queryAnalyzer will be used to analyze the query string
   SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader, queryAnalyzer);

   // suggest 10 documents for "titl" on the suggest_title field
   TopSuggestDocs suggest = indexSearcher.suggest("suggest_title", "titl", 10);
 {code}
 h4. Indexing
 The index analyzer is set through *IndexWriterConfig*.
 {code:java}
 SuggestField(String name, String value, long weight)
 {code}
 h4. Query
 The query analyzer is set through *SuggestIndexSearcher*.
 Hits are collected in descending order of the suggestion's weight.
 {code:java}
 // full options for TopSuggestDocs (TopDocs)
 TopSuggestDocs suggest(String field, CharSequence key, int num, Filter filter)
 // full options for Collector
 // note: only collects, does not score
 void suggest(String field, CharSequence key, int maxNumPerLeaf, Filter filter, Collector collector)
 {code}
 h4. Analyzer
 *CompletionAnalyzer* can be used instead to wrap another analyzer and tune 
 suggest-field-only parameters.
 {code:java}
 CompletionAnalyzer completionAnalyzer = new CompletionAnalyzer(analyzer);
 completionAnalyzer.setPreserveSep(..);
 completionAnalyzer.setPreservePositionsIncrements(..);
 completionAnalyzer.setMaxGraphExpansions(..);
 {code}




[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-03-06 Thread Simon Willnauer (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14350351#comment-14350351
 ] 

Simon Willnauer commented on LUCENE-6339:
-

Hey Areek, I agree with Mike, this looks awesome... lemme give you some comments

 * can we make {{CompletionAnalyzer}} immutable by any chance? I'd really like 
to not have setters if possible? For that I guess its constants need to be 
public as well?
 * is {{private boolean isReservedInputCharacter(char c)}} needed since we 
then afterwards check it again in the {{checkKey}} method, maybe you just wanna 
use a switch here?
 * In {{CompletionFieldsConsumer#close()}} I think we need to make sure 
{{IOUtils.close(dictOut);}} is also called if an exception is hit?
 * do we need the extra {{InputStreamDataInput}} in 
{{CompletionTermWriter#parse}}, I mean it's a byte input stream so we should be 
able to read all of the bytes?
 * {{SuggestPayload}} doesn't need a default ctor
 * can we use {{if (success == false)}} instead of {{if (!success)}} as a 
pattern in general?
 * use try / finally in {{CompletionFieldsProducer#close()}} to ensure all 
resources are closed or pass both the dict and {{delegateFieldsProducer}} to 
{{IOUtils#close()}}?
 * you fetch the checksum for the dict file in {{CompletionFieldsProducer#ctor}} 
via {{CodecUtil.retrieveChecksum(dictIn);}} but you ignore its return 
value, was this intended? I think you don't wanna do that here? Did you intend 
to check the entire file?
 * I wonder if we should just write one file for both, the index and the FSTs? 
What's the benefit from having two?
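
The two {{close()}} points in the list above come down to one pattern: attempt to close every resource even when an earlier {{close()}} throws, then rethrow the first failure. Below is a minimal stdlib-only sketch of that pattern. The class and method names are hypothetical, and it is not a copy of Lucene's {{IOUtils}}; it only illustrates the behaviour being asked for.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical helper illustrating "close everything, then report the first failure". */
final class CloseAllSketch {

  /** Closes every non-null resource in order; later failures are attached as suppressed. */
  static void closeAll(Closeable... resources) throws IOException {
    IOException first = null;
    for (Closeable resource : resources) {
      try {
        if (resource != null) {
          resource.close();
        }
      } catch (IOException e) {
        if (first == null) {
          first = e;               // remember the first failure
        } else {
          first.addSuppressed(e);  // keep later failures for diagnostics
        }
      }
    }
    if (first != null) {
      throw first;
    }
  }
}
```

With a helper like this, a producer's {{close()}} could hand over both the dictionary input and the delegate in one call, so a failure closing the first does not leak the second.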

For loading the dict you put a comment in there saying {{// is there a better 
way of doing this?}}

I think what you need to do is this:

{code}
public synchronized SegmentLookup lookup() throws IOException {
  if (lookup == null) {
    // let multiple fields load concurrently
    try (IndexInput dictClone = dictIn.clone()) {
      dictClone.seek(offset); // this is your field private clone
      lookup = NRTSuggester.load(dictClone);
    }
  }
  return lookup;
}
{code}

I'd appreciate a test showing that this works just fine, i.e. loading multiple 
FSTs concurrently.
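
As a rough idea of what such a test could check, here is a stdlib-only sketch. It does not use Lucene classes: a shared byte array stands in for the dict file, a fresh {{ByteBuffer}} view stands in for {{dictIn.clone()}}, and an {{int}} stands in for the loaded FST. All names are hypothetical. Each field has its own synchronized lazy loader, so two threads hitting the same field load it once, while different fields can load in parallel.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a per-field reader; not the patch's actual class. */
final class FieldLookupSketch {
  private final byte[] dict;   // shared dictionary bytes, only ever read
  private final int offset;    // where this field's entry starts
  private final AtomicInteger loads = new AtomicInteger();
  private Integer lookup;      // stands in for the loaded FST

  FieldLookupSketch(byte[] dict, int offset) {
    this.dict = dict;
    this.offset = offset;
  }

  /** Loads lazily; the lock is per field, so different fields do not block each other. */
  synchronized int lookup() {
    if (lookup == null) {
      // A fresh view with its own position plays the role of dictIn.clone().
      ByteBuffer dictClone = ByteBuffer.wrap(dict);
      dictClone.position(offset);
      lookup = dictClone.getInt();
      loads.incrementAndGet();
    }
    return lookup;
  }

  int loadCount() {
    return loads.get();
  }

  /** Hits every field from two threads; true if each field loaded once with its own value. */
  static boolean loadsEachFieldOnce(int fields) {
    ByteBuffer buffer = ByteBuffer.allocate(fields * Integer.BYTES);
    for (int i = 0; i < fields; i++) {
      buffer.putInt(100 + i); // field i "contains" the value 100 + i
    }
    byte[] dict = buffer.array();
    FieldLookupSketch[] readers = new FieldLookupSketch[fields];
    for (int i = 0; i < fields; i++) {
      readers[i] = new FieldLookupSketch(dict, i * Integer.BYTES);
    }
    Thread[] workers = new Thread[fields * 2];
    for (int t = 0; t < workers.length; t++) {
      workers[t] = new Thread(readers[t % fields]::lookup);
      workers[t].start();
    }
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    boolean ok = true;
    for (int i = 0; i < fields; i++) {
      ok &= readers[i].lookup() == 100 + i && readers[i].loadCount() == 1;
    }
    return ok;
  }
}
```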

I didn't get further than this due to the lack of time but I will come back to 
this either today or tomorrow. Good stuff Areek


[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-03-05 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14349653#comment-14349653
 ] 

Michael McCandless commented on LUCENE-6339:


This looks really nice!

I think AutomatonUtil is (nearly?) the same thing as
TokenStreamToAutomaton?  Can we somehow consolidate the two?

When I try to ant test with the patch on current 5.x some things are
angry:

{noformat}
[mkdir] Created dir: /l/areek/lucene/build/suggest/classes/java
[javac] Compiling 65 source files to /l/areek/lucene/build/suggest/classes/java
[javac] /l/areek/lucene/suggest/src/java/org/apache/lucene/search/suggest/analyzing/AnalyzingInfixSuggester.java:597: warning: [cast] redundant cast to TopFieldDocs
[javac]   TopFieldDocs hits = (TopFieldDocs) c.topDocs();
[javac]   ^
[javac] /l/areek/lucene/suggest/src/java/org/apache/lucene/search/suggest/document/NRTSuggester.java:208: error: local variable collector is accessed from within inner class; needs to be declared final
[javac]   collector.collect(docID);
[javac]   ^
[javac] /l/areek/lucene/suggest/src/java/org/apache/lucene/search/suggest/document/CompletionFieldsProducer.java:164: error: CompletionFieldsProducer.CompletionsTermsReader is not abstract and does not override abstract method getChildResources() in Accountable
[javac]   private class CompletionsTermsReader implements Accountable {
[javac]   ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] 2 errors
[javac] 1 warning
{noformat}
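
The first error is the kind a Java 7 source level produces when an anonymous inner class reads a local variable that is not declared {{final}}; Java 8 relaxed this to "effectively final", which is presumably why the patch compiles elsewhere. A minimal sketch of the fix follows. The interface and class names are hypothetical stand-ins, not the patch's code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of capturing locals from an anonymous class under Java 7 rules. */
final class FinalCaptureSketch {

  /** Stand-in for the collector type used in the patch. */
  interface DocCollector {
    void collect(int docID);
  }

  static List<Integer> collectAll(final int[] docIDs) {
    final List<Integer> seen = new ArrayList<>();
    // Declared final so the anonymous Runnable below may capture it at a Java 7 source level.
    final DocCollector collector = new DocCollector() {
      @Override
      public void collect(int docID) {
        seen.add(docID);
      }
    };
    Runnable visitor = new Runnable() {
      @Override
      public void run() {
        for (int docID : docIDs) {
          collector.collect(docID); // legal: collector and docIDs are final
        }
      }
    };
    visitor.run();
    return seen;
  }
}
```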

Not sure why we need an FSTBuilder inside the NRTSuggesterBuilder;
can't the first be absorbed into the latter?  Can NRTSuggesterBuilder
be package private?  Ie the public API here is the postings format and
SuggestIndexSearcher / SuggestTopDocs?  I think other things can be
private, e.g. CompletionTokenStream.

Can you use CodecUtil.writeIndexHeader when storing the FST?  It also
stores the segment ID and file extension in the header.  And then
CodecUtil.checkIndexHeader at read-time.

CompletionTermsReader.lookup() should be sync'd?   Else two threads
could try to use the IndexInput (dictIn) at once?

Maybe we should move the code in SuggestIndexSearcher.suggest into
a new TopSuggestDocs.merge method?
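
For illustration, a merge of that kind could look roughly like the sketch below. It assumes each segment returns its hits already sorted by descending weight and keeps the global top N with a priority queue of per-segment cursors. The {{Hit}} class and the method signature are hypothetical and are not the patch's API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of merging per-segment suggestion lists into a global top N. */
final class TopSuggestMergeSketch {

  static final class Hit {
    final String key;
    final long weight;

    Hit(String key, long weight) {
      this.key = key;
      this.weight = weight;
    }
  }

  /** Each per-leaf list must already be sorted by descending weight. */
  static List<Hit> merge(int topN, final List<List<Hit>> perLeaf) {
    // A cursor is {leafIndex, positionInLeaf}; the queue orders cursors by the weight they point at.
    Comparator<int[]> byWeightDesc = (a, b) ->
        Long.compare(perLeaf.get(b[0]).get(b[1]).weight, perLeaf.get(a[0]).get(a[1]).weight);
    PriorityQueue<int[]> queue = new PriorityQueue<>(byWeightDesc);
    for (int leaf = 0; leaf < perLeaf.size(); leaf++) {
      if (!perLeaf.get(leaf).isEmpty()) {
        queue.add(new int[] {leaf, 0});
      }
    }
    List<Hit> merged = new ArrayList<>();
    while (merged.size() < topN && !queue.isEmpty()) {
      int[] cursor = queue.poll();
      List<Hit> hits = perLeaf.get(cursor[0]);
      merged.add(hits.get(cursor[1]));
      if (cursor[1] + 1 < hits.size()) {
        queue.add(new int[] {cursor[0], cursor[1] + 1}); // advance within the same leaf
      }
    }
    return merged;
  }
}
```

Keeping this logic in one static method would let the searcher simply gather per-segment results and delegate the ordering.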

Do we really need the separate SegmentLookup interface?  Seems like we
can just invoke lookup method directly on CompletionTerms?

Why do we allow -1 weight?  And why do we restrict to int not long
(other suggesters are long I think, though it does seem like
overkill!).


