[jira] Updated: (NUTCH-573) Multiple Domains - Query Search
[ https://issues.apache.org/jira/browse/NUTCH-573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated NUTCH-573:
---

- pushing this out per http://bit.ly/c7tBv9

Multiple Domains - Query Search
---

Key: NUTCH-573
URL: https://issues.apache.org/jira/browse/NUTCH-573
Project: Nutch
Issue Type: Improvement
Components: searcher
Affects Versions: 0.9.0
Environment: All
Reporter: Rajasekar Karthik
Assignee: Enis Soztutar
Attachments: multiTermQuery_v1.patch

Searching multiple domains can be done on Lucene, but not that efficiently on Nutch.

Query:
+content:abc +(site:www.aaa.com site:www.bbb.com)
works on Lucene, but the same concept does not work on Nutch.

In Lucene, it works with
org.apache.lucene.analysis.KeywordAnalyzer
org.apache.lucene.analysis.standard.StandardAnalyzer
but NOT with
org.apache.lucene.analysis.SimpleAnalyzer

Is the Nutch analyzer based on SimpleAnalyzer? In this case, is there a workaround to make this work? Is there an option to change which analyzer Nutch is using?

Just FYI, another solution (inefficient, I believe) which seems to be working on Nutch:
query -site:ccc.com -site:ddd.com

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
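One way to see why the analyzer choice in the issue description could matter is to compare how a host name is tokenized. The sketch below only mimics, in plain Java, the documented behavior of Lucene's SimpleAnalyzer (split on non-letters, lowercase) and KeywordAnalyzer (whole input as one token). It does not call Lucene, and it makes no claim about which analyzer Nutch actually uses. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class AnalyzerSketch {

    // Mimics SimpleAnalyzer-style tokenization: every run of letters
    // becomes one lowercased token; any non-letter ends the token.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Mimics KeywordAnalyzer-style tokenization: the whole input is one token.
    static List<String> keywordTokens(String text) {
        return Collections.singletonList(text);
    }

    public static void main(String[] args) {
        System.out.println(letterTokens("www.aaa.com"));  // [www, aaa, com]
        System.out.println(keywordTokens("www.aaa.com")); // [www.aaa.com]
    }
}
```

Under this assumption, a letter-based analyzer turns the single value www.aaa.com into three separate tokens, so a site: clause would no longer carry one exact host name, whereas a keyword-style analyzer preserves it.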
[jira] Updated: (NUTCH-573) Multiple Domains - Query Search
[ https://issues.apache.org/jira/browse/NUTCH-573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sami Siren updated NUTCH-573:
---

Patch Info: [Patch Available]

Multiple Domains - Query Search
---

Key: NUTCH-573
URL: https://issues.apache.org/jira/browse/NUTCH-573
Project: Nutch
Issue Type: Improvement
Components: searcher
Affects Versions: 0.9.0
Environment: All
Reporter: Rajasekar Karthik
Assignee: Enis Soztutar
Fix For: 1.0.0
Attachments: multiTermQuery_v1.patch
[jira] Updated: (NUTCH-573) Multiple Domains - Query Search
[ https://issues.apache.org/jira/browse/NUTCH-573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enis Soztutar updated NUTCH-573:
---

Attachment: multiTermQuery_v1.patch

Here is a patch that enables querying multiple values for the same field.
# The query syntax is changed to enable [field:]term1(,term2)* type queries, where multiple terms are converted to a boolean OR query.
# Query.Clause, Query.Term, and Query.Phrase are changed significantly.

This is an initial version of the patch for review; I will test it a little more today.

Multiple Domains - Query Search
---

Key: NUTCH-573
URL: https://issues.apache.org/jira/browse/NUTCH-573
Project: Nutch
Issue Type: Improvement
Components: searcher
Affects Versions: 0.6, 0.7, 0.7.1, 0.7.2, 0.8, 0.8.1, 0.9.0
Environment: All
Reporter: Rajasekar Karthik
Assignee: Enis Soztutar
Priority: Minor
Attachments: multiTermQuery_v1.patch
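The [field:]term1(,term2)* syntax from the patch comment above, with multiple terms combined into a boolean OR, can be illustrated roughly as follows. This is not the code from multiTermQuery_v1.patch and does not use Nutch's Query classes; it is a minimal, self-contained sketch that rewrites one clause into the Lucene-style form quoted in the issue. The class name, the method name, and the default field "content" are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class MultiTermQuerySketch {

    // Assumption: clauses without an explicit field search "content".
    static final String DEFAULT_FIELD = "content";

    // Rewrites a clause of the form [field:]term1(,term2)* into a required
    // Lucene-style clause; several terms become an OR group in parentheses.
    static String toLuceneStyle(String clause) {
        int colon = clause.indexOf(':');
        String field = colon >= 0 ? clause.substring(0, colon) : DEFAULT_FIELD;
        String rest = colon >= 0 ? clause.substring(colon + 1) : clause;

        List<String> parts = new ArrayList<>();
        for (String term : rest.split(",")) {
            if (!term.isEmpty()) {
                parts.add(field + ":" + term);
            }
        }
        if (parts.isEmpty()) {
            throw new IllegalArgumentException("clause has no terms: " + clause);
        }
        if (parts.size() == 1) {
            return "+" + parts.get(0);
        }
        // Optional clauses inside a required group behave as a boolean OR.
        return "+(" + String.join(" ", parts) + ")";
    }

    public static void main(String[] args) {
        // prints: +content:abc +(site:www.aaa.com site:www.bbb.com)
        System.out.println(toLuceneStyle("abc") + " "
                + toLuceneStyle("site:www.aaa.com,www.bbb.com"));
    }
}
```

With this reading, the query from the issue description would be entered as abc site:www.aaa.com,www.bbb.com, one comma-separated list per field.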