RE: org.apache.lucene.search.highlight.Highlighter
Hey Lucene-Developers,

Finally found the problem with the Highlighter SRC.

The search SRC using search.highlight.Highlighter depends on how the HTML content (FIELD_NAME) is stored while indexing. If the content is stored as

    FileInputStream is = new FileInputStream(File);
    reader = new BufferedReader(new InputStreamReader(is));
    doc.add(Field.Text(contents, reader));

then search.highlight.Highlighter raises a NullPointerException on the FIELD_NAME content:

    java.lang.NullPointerException
        at search.highlight.Highlighter.getBestDocFragments(Highlighter.java:141)
        at search.highlight.Highlighter.getBestFragments(Highlighter.java:80)
        at search.highlight.Highlighter.getBestFragments(Highlighter.java:328)
        at org.apache.lucene.demo.Search.searchIndex1(Search.java:84)
        at org.apache.lucene.demo.Search.main(Search.java:107)

But if you use

    Field ff = new Field(contents, proceStr, true, true, true);

(where proceStr = contents of the HTML), then search.highlight.Highlighter returns a correct search + highlighter (bold) implementation of the indexed segment.

Now please, somebody who is mature enough to improve this code, please do.

Peace at last. :)

Karthik

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]]
Sent: Monday, May 24, 2004 10:40 PM
To: Lucene Users List
Subject: Re: org.apache.lucene.search.highlight.Highlighter

On May 24, 2004, at 5:11 AM, Karthik N S wrote:
> I was browsing through CVS and found the SRC for IndexWriter2.java,
> written by Ivaylo Zlatev in Feb 2002,

Where do you see this? It is not in the current CVS that I can tell.

> The technique of using RAMDirectory has really made my query access
> faster, so I plan to use it during the indexing process also.

I'm confused by what you're after. You can index into a RAMDirectory, no problem, and then persist it to a FSDirectory when you are done with the current codebase.

Erik

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
RE: org.apache.lucene.search.highlight.Highlighter
> If the Content is Stored as...
> doc.add(Field.Text(contents, reader));

That's just it. It's not stored: see the javadocs for Field.Text(String, Reader):

    "Constructs a Reader-valued Field that is tokenized and indexed, but is not stored in the index."

As opposed to Field.Text(String name, String value), which says:

    "Constructs a String-valued Field that is tokenized and indexed, and is stored in the index, for return with hits."

So you're getting nulls because you're not storing the field for subsequent retrieval.

> Now Please some body who is mature more enough to improve this code please do.

Are you deliberately trying to be obnoxious or is it just a natural gift? You'll find people here more helpful if you don't resort to insulting them. :-)
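A minimal sketch of one way to act on this, assuming the HTML file is small enough to hold in memory: drain the Reader into a String first, then index it with the String-valued Field.Text(String, String) so the text is stored and is available to the highlighter later. The class and method names (ReaderToString, readAll) and the "contents" field name are made up for illustration; only the Field.Text call shown in the comment comes from the javadocs quoted above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReaderToString {

    // Hypothetical helper: read everything from a Reader into one String.
    public static String readAll(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        try (BufferedReader reader = new BufferedReader(in)) {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the FileInputStream-backed reader.
        String html = readAll(new StringReader("<p>John Kennedy</p>"));

        // With Lucene on the classpath, the String (not the Reader) would be
        // indexed so that the field is stored ("contents" is an assumed name):
        //   doc.add(Field.Text("contents", html));
        System.out.println(html);
    }
}
```

The trade-off is index size: a stored field keeps a full copy of the text in the index, which is what lets hits.doc(i).get(FIELD_NAME) return something other than null.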
RE: org.apache.lucene.search.highlight.Highlighter
Hey Lucene-Developers,

I was browsing through CVS and found the SRC for IndexWriter2.java, written by Ivaylo Zlatev in Feb 2002.

My concern is: does this piece of code really work? If so, please give an example [present Lucene-final 1.3 version]. Or has it been discarded from the [present Lucene-final 1.3 version]?

The technique of using RAMDirectory has really made my query access faster, so I plan to use it during the indexing process also.

karthik
RE: org.apache.lucene.search.highlight.Highlighter
That version of IndexWriter was never included in Lucene. Use various IndexWriter parameters (instance variables) to tune indexing. One of my articles describes how to use them, if Javadocs are too terse.

Otis

--- Karthik N S [EMAIL PROTECTED] wrote:
> Hey Lucene-Developers,
>
> I was browsing through CVS and found the SRC for IndexWriter2.java,
> written by Ivaylo Zlatev in Feb 2002.
>
> My concern is: does this piece of code really work? If so, please give
> an example [present Lucene-final 1.3 version]. Or has it been discarded
> from the [present Lucene-final 1.3 version]?
>
> The technique of using RAMDirectory has really made my query access
> faster, so I plan to use it during the indexing process also.
>
> karthik
Re: org.apache.lucene.search.highlight.Highlighter
On May 24, 2004, at 5:11 AM, Karthik N S wrote:
> I was browsing through CVS and found the SRC for IndexWriter2.java,
> written by Ivaylo Zlatev in Feb 2002,

Where do you see this? It is not in the current CVS that I can tell.

> The technique of using RAMDirectory has really made my query access
> faster, so I plan to use it during the indexing process also.

I'm confused by what you're after. You can index into a RAMDirectory, no problem, and then persist it to a FSDirectory when you are done with the current codebase.

Erik
RE: org.apache.lucene.search.highlight.Highlighter
Hi,

Please can somebody give me a simple example of org.apache.lucene.search.highlight.Highlighter? I am trying to use it, but so far unsuccessfully.

Karthik

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Thursday, May 20, 2004 2:08 AM
To: [EMAIL PROTECTED]
Subject: Re: org.apache.lucene.search.highlight.Highlighter

> Was investigating, found some compile time error..

I see the code you have is taken from the example in the javadocs. Unfortunately that example wasn't complete, because the class didn't include the method defined in the Formatter interface. I have updated the Javadocs to correct this oversight.

To correct your problem, either make your class implement the Formatter interface to perform your choice of custom formatting, or remove the "this" parameter from your call to create a new Highlighter with the default Formatter implementation.

Thanks for highlighting the problem with the Javadocs...

Cheers
Mark
Re: org.apache.lucene.search.highlight.Highlighter
Hi,

Here is the documentation Mark Harwood included in the original package. I followed his directions and it worked for me. Let me know if this doesn't do it for you.

Claude

On May 21, 2004, at 4:29 AM, Karthik N S wrote:
> Hi,
>
> Please can somebody give me a simple example of
> org.apache.lucene.search.highlight.Highlighter? I am trying to use it,
> but so far unsuccessfully.
>
> Karthik

<image.tiff>

> WITH WARM REGARDS
> HAVE A NICE DAY
> [ N.S.KARTHIK]
Re: org.apache.lucene.search.highlight.Highlighter
Arrgh, the attachment didn't make it. Here it goes, sorry:

    //perform a standard lucene query
    searcher = new IndexSearcher(ramDir);
    Analyzer analyzer = new StandardAnalyzer();
    Query query = QueryParser.parse("Kenne*", FIELD_NAME, analyzer);
    query = query.rewrite(reader); //necessary to expand search terms
    Hits hits = searcher.search(query);

    //create an instance of the highlighter with the tags used to surround highlighted text
    QueryHighlightExtractor highlighter =
        new QueryHighlightExtractor(query, new StandardAnalyzer(), "<b>", "</b>");
    for (int i = 0; i < hits.length(); i++) {
        String text = hits.doc(i).get(FIELD_NAME);
        //call to highlight text with chosen tags
        String highlightedText = highlighter.highlightText(text);
        System.out.println(highlightedText);
    }

If your documents are large you can select only the best fragments from each document like this:

    //...as above example
    int highlightFragmentSizeInBytes = 80;
    int maxNumFragmentsRequired = 4;
    String fragmentSeparator = "...";
    for (int i = 0; i < hits.length(); i++) {
        String text = hits.doc(i).get(FIELD_NAME);
        String highlightedText = highlighter.getBestFragments(text,
            highlightFragmentSizeInBytes, maxNumFragmentsRequired, fragmentSeparator);
        System.out.println(highlightedText);
    }

On May 21, 2004, at 9:22 AM, Claude Devarenne wrote:
> Hi,
>
> Here is the documentation Mark Harwood included in the original
> package. I followed his directions and it worked for me. Let me know
> if this doesn't do it for you.
>
> Claude
>
> On May 21, 2004, at 4:29 AM, Karthik N S wrote:
>> Hi,
>>
>> Please can somebody give me a simple example of
>> org.apache.lucene.search.highlight.Highlighter? I am trying to use it,
>> but so far unsuccessfully.
>>
>> Karthik
>
> <image.tiff>
>
>> WITH WARM REGARDS
>> HAVE A NICE DAY
>> [ N.S.KARTHIK]
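For readers without the highlighter package at hand, here is a toy, JDK-only illustration of what the first loop above produces for a prefix query such as Kenne*. It is not Lucene's algorithm and uses none of its classes: it simply wraps every word starting with a given prefix in the chosen tags using a regular expression. ToyHighlighter and its highlight method are hypothetical names. The null check mirrors the situation discussed in this thread, where get(FIELD_NAME) yields null for a field that was indexed but not stored.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ToyHighlighter {

    // Wrap each word that starts with `prefix` in preTag/postTag.
    // A stand-in for illustration only; Lucene's highlighter works on
    // analyzed tokens and query terms, not on a regular expression.
    public static String highlight(String text, String prefix,
                                   String preTag, String postTag) {
        if (text == null) {
            // An unstored field gives no text back, so there is nothing to highlight.
            return "";
        }
        Pattern word = Pattern.compile(
            "\\b" + Pattern.quote(prefix) + "\\w*", Pattern.CASE_INSENSITIVE);
        Matcher m = word.matcher(text);
        // $0 is the whole matched word; the tags are escaped for the replacement syntax.
        return m.replaceAll(
            Matcher.quoteReplacement(preTag) + "$0" + Matcher.quoteReplacement(postTag));
    }

    public static void main(String[] args) {
        System.out.println(
            highlight("John Kennedy has been shot", "Kenne", "<b>", "</b>"));
    }
}
```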
Re: org.apache.lucene.search.highlight.Highlighter
Hi Claude,

that example code you provided is out of date.

For all concerned - the highlighter code was refactored about a month ago and then moved into the Sandbox.

Want the latest version? - get the latest code from the sandbox CVS.
Want the latest docs? - run javadoc on the above.

There is a basic example of highlighter use in the package-level javadocs and more extensive examples in the JUnit test that accompanies the source code.

Hope this helps clarify things.

Mark

PS Bruce, I know you were interested in providing an alternative Fragmenter implementation for the highlighter that detects sentence boundaries. You may want to look at LingPipe, which has a heuristic sentence boundary detector ( http://threattracker.com:8080/lingpipe-demo/demo.html ). I took a quick look at it, but it has its own tokenizer that would be difficult to make work with the tokenstream used to identify query terms. At least the code gives some examples of the heuristics involved in detecting sentence boundaries. For my own apps I find the standard Fragmenter implementation suffices.
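As a rough idea of what sentence-boundary detection looks like in code, here is a small sketch using the JDK's own java.text.BreakIterator. This is a swapped-in technique: it is neither LingPipe nor an implementation of the highlighter's Fragmenter interface, and SentenceSplitter is a hypothetical name. Plugging something like it into the highlighter would still mean lining the sentence offsets up with the token stream, which is the difficulty Mark describes.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class SentenceSplitter {

    // Split text into sentences using the JDK's heuristic sentence iterator.
    public static List<String> sentences(String text) {
        List<String> result = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.US);
        it.setText(text);
        int start = it.first();
        // Each call to next() returns the offset where the following sentence begins.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                result.add(sentence);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String s : sentences("The index was rebuilt. Highlighting works now.")) {
            System.out.println(s);
        }
    }
}
```

The start and end offsets computed in the loop are the part a fragmenter would care about, since fragments are cut at character positions in the original text.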
RE: org.apache.lucene.search.highlight.Highlighter
> Thanks for highlighting the problem with the Javadocs...

Groan. :)

Regards,
Bruce Ritchie
Re: org.apache.lucene.search.highlight.Highlighter
> Was investigating, found some compile time error..

I see the code you have is taken from the example in the javadocs. Unfortunately that example wasn't complete, because the class didn't include the method defined in the Formatter interface. I have updated the Javadocs to correct this oversight.

To correct your problem, either make your class implement the Formatter interface to perform your choice of custom formatting, or remove the "this" parameter from your call to create a new Highlighter with the default Formatter implementation.

Thanks for highlighting the problem with the Javadocs...

Cheers
Mark