RE: org.apache.lucene.search.highlight.Highlighter

2004-05-25 Thread Karthik N S
Hey Lucene developers,

I finally found the problem with the Highlighter source.

Search code that uses search.highlight.Highlighter depends on how the
HTML content (FIELD_NAME) was stored during indexing.

If the content is stored as

 FileInputStream is = new FileInputStream(File);
 reader = new BufferedReader(new InputStreamReader(is));
 doc.add(Field.Text(contents, reader));

then search.highlight.Highlighter throws a NullPointerException on the
FIELD_NAME content:


java.lang.NullPointerException
at search.highlight.Highlighter.getBestDocFragments(Highlighter.java:141)
at search.highlight.Highlighter.getBestFragments(Highlighter.java:80)
at search.highlight.Highlighter.getBestFragments(Highlighter.java:328)
at org.apache.lucene.demo.Search.searchIndex1(Search.java:84)
at org.apache.lucene.demo.Search.main(Search.java:107)


But if you use

   Field ff = new Field(contents, proceStr, true, true, true);

  (where proceStr = the contents of the HTML file)

then search.highlight.Highlighter returns a correct search + highlighter
(bold) result for the indexed segment.



Now, would somebody who is
mature enough to improve this code please do so.


Peace at last. :)
Karthik





-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





RE: org.apache.lucene.search.highlight.Highlighter

2004-05-25 Thread markharw00d
> If the Content is Stored as...
> doc.add(Field.Text(contents, reader));

That's just it: it's not stored. See the javadocs for Field.Text(String, Reader):

  Constructs a Reader-valued Field that is tokenized and indexed, but is not
  stored in the index.

As opposed to:

  Field.Text(String name, String value)

which says:

  Constructs a String-valued Field that is tokenized and indexed, and is
  stored in the index, for return with hits.

So, you're getting nulls because you're not storing the field for subsequent retrieval.
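
One way around this, as a rough sketch: read the file into a String first and
pass that to the String overload, so the text is stored and the highlighter has
something to work on. The helper below uses only the JDK; the class name
StoredContent and the method readFully are made up for illustration, and the
Lucene call appears only as a comment because it assumes Lucene is on the
classpath.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper class, not part of Lucene.
class StoredContent {

    /** Drains a Reader into a String so the text can be given to a stored field. */
    public static String readFully(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        // try-with-resources closes the reader even if read() throws
        try (BufferedReader reader = new BufferedReader(in)) {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the FileInputStream-backed reader here.
        String html = readFully(new StringReader("<html><body>Kennedy</body></html>"));

        // With Lucene available, the String overload stores the value:
        //   doc.add(Field.Text("contents", html));
        // whereas Field.Text("contents", reader) indexes it without storing it.
        System.out.println(html);
    }
}
```

The trade-off is that the whole document is held in memory and stored in the
index, which makes the index larger; that is the price of being able to get
the text back with the hits.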

> Now, would somebody who is
> mature enough to improve this code please do so.

Are you deliberately trying to be obnoxious or is it just a natural gift?
You'll find people here more helpful if you don't resort to insulting them.
:-)





RE: org.apache.lucene.search.highlight.Highlighter

2004-05-24 Thread Karthik N S
Hey Lucene developers,

I was browsing through CVS and found the source for IndexWriter2.java, written
by Ivaylo Zlatev in Feb 2002.

My question is: does this piece of code really work?

If so, please give an example [ present Lucene-final 1.3 version ].
   Or
has it been discarded from the [ present Lucene-final 1.3 version ]?


Using the RAMDirectory technique, my queries have become much faster,
so I plan to use it during the indexing process as well.



Karthik








RE: org.apache.lucene.search.highlight.Highlighter

2004-05-24 Thread Otis Gospodnetic
That version of IndexWriter was never included in Lucene.
Use the various IndexWriter parameters (instance variables) to tune
indexing.  One of my articles describes how to use them, if the Javadocs
are too terse.

Otis

 





Re: org.apache.lucene.search.highlight.Highlighter

2004-05-24 Thread Erik Hatcher
On May 24, 2004, at 5:11 AM, Karthik N S wrote:
> I was browsing through CVS and found the source for IndexWriter2.java,
> written by Ivaylo Zlatev in Feb 2002.

Where do you see this?  It is not in the current CVS that I can tell.

> Using the RAMDirectory technique, my queries have become much faster,
> so I plan to use it during the indexing process as well.

I'm confused by what you're after.  With the current codebase you can
index into a RAMDirectory, no problem, and then persist it to an
FSDirectory when you are done.

Erik


RE: org.apache.lucene.search.highlight.Highlighter

2004-05-21 Thread Karthik N S
Hi,

  Could somebody please give me a simple example of using
  org.apache.lucene.search.highlight.Highlighter?

  I am trying to use it, but without success.


Karthik







Re: org.apache.lucene.search.highlight.Highlighter

2004-05-21 Thread Claude Devarenne
Hi,

Here is the documentation Mark Harwood included in the original package.  I followed his directions and it worked for me.  Let me know if this doesn't do it for you.

Claude





Re: org.apache.lucene.search.highlight.Highlighter

2004-05-21 Thread Claude Devarenne
Arrgh, the attachment didn't make it. Here it goes, sorry:

    //perform a standard lucene query
    searcher = new IndexSearcher(ramDir);
    Analyzer analyzer = new StandardAnalyzer();
    Query query = QueryParser.parse("Kenne*", FIELD_NAME, analyzer);
    query = query.rewrite(reader); //necessary to expand search terms
    Hits hits = searcher.search(query);
    //create an instance of the highlighter with the tags used to
    //surround highlighted text
    QueryHighlightExtractor highlighter =
        new QueryHighlightExtractor(query, new StandardAnalyzer(), "<b>", "</b>");

    for (int i = 0; i < hits.length(); i++)
    {
        String text = hits.doc(i).get(FIELD_NAME);
        //call to highlight text with chosen tags
        String highlightedText = highlighter.highlightText(text);
        System.out.println(highlightedText);
    }

If your documents are large you can select only the best fragments from  
each document like this:

    //...as above example

    int highlightFragmentSizeInBytes = 80;
    int maxNumFragmentsRequired = 4;
    String fragmentSeparator = "...";
    for (int i = 0; i < hits.length(); i++)
    {
        String text = hits.doc(i).get(FIELD_NAME);
        String highlightedText = highlighter.getBestFragments(text,
            highlightFragmentSizeInBytes, maxNumFragmentsRequired, fragmentSeparator);
        System.out.println(highlightedText);
    }


Re: org.apache.lucene.search.highlight.Highlighter

2004-05-21 Thread markharw00d
Hi Claude, that example code you provided is out of date.

For all concerned - the highlighter code was refactored about a month ago
and then moved into the Sandbox.

Want the latest version? - get the latest code from the sandbox CVS.
Want the latest docs? - Run javadoc on the above.

There is a basic example of highlighter use in the package-level javadocs
and more extensive examples in the JUnit test that accompanies the source
code.

Hope this helps clarify things.

Mark

PS Bruce, I know you were interested in providing an alternative Fragmenter
implementation for the highlighter that detects sentence boundaries.
You may want to look at LingPipe, which has a heuristic sentence boundary
detector ( http://threattracker.com:8080/lingpipe-demo/demo.html ).
I took a quick look at it, but it has its own tokenizer that would be
difficult to make work with the tokenstream used to identify query terms.
At least the code gives some examples of the heuristics involved in
detecting sentence boundaries. For my own apps I find the standard
Fragmenter implementation suffices.





RE: org.apache.lucene.search.highlight.Highlighter

2004-05-19 Thread Bruce Ritchie

> Thanks for highlighting the problem with the Javadocs...

Groan. :)


Regards,

Bruce Ritchie




Re: org.apache.lucene.search.highlight.Highlighter

2004-05-19 Thread markharw00d
> Was Investigating,found some Compile time error..

I see the code you have is taken from the example in the javadocs.
Unfortunately that example wasn't complete because the class didn't
include the method defined in the Formatter interface. I have updated the
Javadocs to correct this oversight.

To correct your problem, either make your class implement the Formatter
interface to perform your choice of custom formatting, or remove the "this"
parameter from the call that creates the new Highlighter, so that it uses
the default Formatter implementation.

Thanks for highlighting the problem with the Javadocs...

Cheers
Mark

