RE: Highlighting query results, my method is too crude, but how to improve it?

2023-02-21 Thread Trevor Nicholls
Thank you David, very useful cheers T -Original Message- From: Dawid Weiss Sent: Tuesday, February 21, 2023 7:17 PM To: java-user@lucene.apache.org Subject: Re: Highlighting query results, my method is too crude, but how to improve it? You can use two different queries - the query

Re: Highlighting query results, my method is too crude, but how to improve it?

2023-02-20 Thread Dawid Weiss
g. > > cheers > T > > -Original Message- > From: Mikhail Khludnev > Sent: Tuesday, February 21, 2023 3:22 AM > To: java-user@lucene.apache.org > Subject: Re: Highlighting query results, my method is too crude, but how > to improve it? > > Hello,

RE: Highlighting query results, my method is too crude, but how to improve it?

2023-02-20 Thread Trevor Nicholls
ut I might be missing something. cheers T -Original Message- From: Mikhail Khludnev Sent: Tuesday, February 21, 2023 3:22 AM To: java-user@lucene.apache.org Subject: Re: Highlighting query results, my method is too crude, but how to improve it? Hello, Maybe I'm missing some point. Bu

Re: Highlighting query results, my method is too crude, but how to improve it?

2023-02-20 Thread Mikhail Khludnev
Hello, Maybe I'm missing some point. But, can you highlight another query than one you search for? On Mon, Feb 20, 2023 at 5:07 PM Trevor Nicholls wrote: > Sorry I apologize for this being a bit long and for explaining the problem > at the very bottom after all the background, rather than

Highlighting query results, my method is too crude, but how to improve it?

2023-02-20 Thread Trevor Nicholls
Sorry I apologize for this being a bit long and for explaining the problem at the very bottom after all the background, rather than starting with it at the top. I thought it was easier to explain like this, please bear with me! So I've indexed a library of technical documentation, and the

Re: [External] Streaming documents into the index breaks highlighting

2022-11-17 Thread Shifflett, David [USA]
Just to clarify, Is there a highlighting option that doesn't require the text from the matched document? David Shifflett On 11/17/22, 1:57 PM, "Shifflett, David [USA]" wrote: Hi, I am converting my application from reading documents into memory, then indexing the

Streaming documents into the index breaks highlighting

2022-11-17 Thread Shifflett, David [USA]
Hi, I am converting my application from reading documents into memory, then indexing the documents to streaming the documents to be indexed. I quickly found out this required that the field NOT be stored. I then quickly found out that my highlighting code requires the field to be stored. I’ve

Lucene Highlighting mergeContiguous

2020-07-21 Thread Patricia Reddy
Hello All, Trying to highlight a phrase "John Doe" using Lucene highlighter but the content highlights each separate term. Contiguous terms are not merged together. For eg: JohnDoe is returned instead of John Doe I have set the mergeContiguous parameter on the getBestTextFragments method to
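
One way to picture the result being asked for is a post-processing step that joins neighbouring highlight spans when only whitespace separates them. The sketch below is plain Java and is not Lucene's implementation; the class name, the `int[]{start, end}` span representation and the `<B>` markup are assumptions made for illustration. As far as I understand it, the boolean on `getBestTextFragments` (named `mergeContiguousFragments` in the Highlighter API) controls merging of adjacent text fragments rather than adjacent highlighted terms, which may explain the behaviour described.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ContiguousMergeSketch {

    /** Merges neighbouring [start, end) spans when only whitespace lies between them. */
    static List<int[]> mergeContiguous(String text, List<int[]> spans) {
        List<int[]> merged = new ArrayList<>();
        for (int[] s : spans) {              // spans are assumed sorted and non-overlapping
            if (!merged.isEmpty()) {
                int[] last = merged.get(merged.size() - 1);
                if (text.substring(last[1], s[0]).trim().isEmpty()) {
                    last[1] = s[1];          // extend the previous span over the gap
                    continue;
                }
            }
            merged.add(new int[] {s[0], s[1]});  // copy so the caller's spans are not modified
        }
        return merged;
    }

    /** Wraps every span in <B>...</B> (hypothetical markup chosen for the example). */
    static String mark(String text, List<int[]> spans) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        for (int[] s : spans) {
            out.append(text, pos, s[0])
               .append("<B>").append(text, s[0], s[1]).append("</B>");
            pos = s[1];
        }
        return out.append(text.substring(pos)).toString();
    }

    public static void main(String[] args) {
        String text = "Call John Doe today";
        List<int[]> hits = Arrays.asList(new int[] {5, 9}, new int[] {10, 13});
        System.out.println(mark(text, hits));                        // one tag pair per term
        System.out.println(mark(text, mergeContiguous(text, hits))); // one tag pair per phrase
    }
}
```

With the sample input, the first line printed marks "John" and "Doe" separately and the second marks "John Doe" as a single region.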

Re: Highlighting and delineating Passages (fragmenting)

2017-05-30 Thread Dawid Weiss
rn to this later on, depending on how the project progresses -- if so, I'd love to somehow help make the "default" highlighting better (or easier to use). Dawid

Re: Highlighting and delineating Passages (fragmenting)

2017-05-30 Thread David Smiley
aries. You could file a patch or just a feature request in JIRA. > > With the original Highlighter, you can easily do this by providing the > > stored text. When you create the QueryScorer, use "null" for field name > to > > highlight all query fields. The UH can do thi

Re: Highlighting and delineating Passages (fragmenting)

2017-05-30 Thread Dawid Weiss
ox in Solr because I didn't want to add custom code to the distribution/core. Sigh. > With the original Highlighter, you can easily do this by providing the > stored text. When you create the QueryScorer, use "null" for field name to > highlight all query fields. The UH can do

Re: Highlighting and delineating Passages (fragmenting)

2017-05-30 Thread David Smiley
r, you can easily do this by providing the stored text. When you create the QueryScorer, use "null" for field name to highlight all query fields. The UH can do this as well by highlighting the fields that are stored, and call setFieldMatcher to provide a Predicate that always return

Re: Highlighting and delineating Passages (fragmenting)

2017-05-27 Thread Evert Wagenaar
I always assumed this was the default behaviour of the Lucene TermHighlighter but I could be mistaken with an older version. I found out that there are major differences between Lucene and Solr though, with which I have similar problems. Best regards, Evert Wagenaar

Re: Highlighting and delineating Passages (fragmenting)

2017-05-27 Thread Dawid Weiss
Thanks for your explanation, David. I actually found working with all Lucene highlighters pretty difficult. I have a few requirements which seemed deceptively simple: 1) highlight query hit regions (phrase, fuzzy, terms); 2) try to organise the resulting snippets to visually "center" the hit

Highlighting and delineating Passages (fragmenting)

2017-05-26 Thread David Smiley
I was recently asked if/how the UnifiedHighlighter can return a Passage centered around the highlighted words. I'm responding to a wider audience (java-user list, ...). Each highlighter implementation fragments the content into passages (with highlights) using a different algorithm. The

Re: highlighting with best text fragment from multi-value field

2016-12-14 Thread Tech Behemoth
Hi all Any idea of best practice for getting fragmented highlighted string ( Lucene 5.3.2) of multi-value field? Thanks On Mon, Dec 12, 2016 at 12:11 AM, Tech Behemoth <tech.behem...@gmail.com> wrote: > Hi all > > How to provide highlighting for fragmented string which

highlighting with best text fragment from multi-value field

2016-12-12 Thread Tech Behemoth
Hi all How to provide highlighting for fragmented string which is created from multi-value field using Lucene 5.3.2 ? Is any known solution for it? 1. Or first approach - merge all multi-value into one single value and apply highlighter.getBestTextFragments(tokenStream, text, false

Prefix-Queries and Syntax Highlighting

2016-01-08 Thread Nils Knappmeier
Good morning, we currently use Lucene 4.3 in our project. We automatically generate PrefixQueries and we are passing the rewritten query to the Highlighter to highlight search terms in the search result. Up until a few days ago, we were using a

Re: range query highlighting

2015-12-23 Thread will martin
Todd: "This trick just converts the multi term queries like PrefixQuery or RangeQuery to boolean query by expanding the terms using index reader." http://stackoverflow.com/questions/7662829/lucene-net-range-queries-highlighting beware cost. (my comment) g’luck will > On Dec 2

range query highlighting

2015-12-23 Thread Fielder, Todd Patrick
I have a NumericRangeQuery and a TermQuery that I am combining into a Boolean query. I would then like to pass the Boolean query to the highlighter to highlight both the range and term hits. Currently, only the terms are being highlighted. Any help on how to get the range values to highlight

RE: Highlighting deprecation?

2015-12-02 Thread Allison, Timothy B.
/%3CBY2PR09MB112F5004CD269FA936812DFC72B0%40BY2PR09MB112.namprd09.prod.outlook.com%3E -Original Message- From: scott cote [mailto:scottcc...@gmail.com] Sent: Tuesday, December 01, 2015 5:26 PM To: java-user@lucene.apache.org Subject: Re: Highlighting deprecation? checkout the highlight

Highlighting deprecation?

2015-12-01 Thread Kunzman, Douglas *
Hi - I would like to thank everyone for their help in understanding Terms in Lucene over the weekend. After checking the documentation for a day or two I have another quick question. I'm trying to highlight all of the found terms in the document before displaying it and have something working

Re: Highlighting deprecation?

2015-12-01 Thread scott cote
checkout the highlight package … https://lucene.apache.org/core/5_3_0/highlighter/org/apache/lucene/search/highlight/package-summary.html SCott > On Dec 1, 2015, at 4:16 PM, Kunzman,

phrase query, stop words, and highlighting?

2014-09-22 Thread Rob Nikander
Hi, I just noticed that a search like rooms to go is failing to highlight. (I use FastVectorHighlighter). I know it's caused by the stop word (to). Is there a recommended way to fix this? I may just re-index without stop words, and see if that causes any problems. thanks, Rob

2.9.2 Memory issue 8.0GB or more / OOM with Term / Highlighting

2014-07-30 Thread Baldwin, David
I am looking to track down an issue in 2.9.2 where during highlighting, certain data may cause rapid memory usage and OOM exception in java: --- java.lang.OutOfMemoryError: Java heap space at org.apache.lucene.analysis.Token.growTermBuffer(Token.java:470

Re: Using Sentence Information for Lucene Highlighting

2014-04-12 Thread Furkan KAMACI
Hi; I could find a way to achieve it when I debugged the source code. I've shared the same information at Solr mail list too. Defining a delimiter and indexing it as an individual token is the first step. Writing a regex that matches for given delimiter is the next step. Last step is defining the

Using Sentence Information for Lucene Highlighting

2014-04-08 Thread Furkan KAMACI
Hi; I could not get an answer to my question on the Solr list, so I wanted to ask it here because I think it is a more Lucene-specific question. I have indexed my documents and there is a special character sequence that shows the end of a string. It is: *|* For example: The quick brown fox

Re: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-05 Thread Trejkaz
On Wed, Feb 5, 2014 at 4:16 AM, Earl Hood e...@earlhood.com wrote: Our current solution is to do highlighting on the client-side. When search happens, the search results from the server includes the parsed query terms so the client has an idea of which terms to highlight vs trying

Re: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-05 Thread Olivier Binda
cite, doing a quick look at it, it looks like I may have to modify highlighting code behavior to support how the project I am transforms the XML data. Example: we deal with attribute data that gets transformed to render content in the HTML served to the client, and the highlighting code cited does

Re: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-05 Thread Earl Hood
On Tue, Feb 4, 2014 at 6:05 PM, Michael Sokolov wrote: Thanks for the feedback. I think it's difficult to know what to do about attribute value highlighting in the general case - do you have any suggestions? That is a challenging one since one has to know how attribute data

RE: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-04 Thread Allison, Timothy B.
meet your users' expectations (breaking books into chapters etc.). In the highlighting/concordance realm, I've found that Lucene is still totally fast enough for my needs on large texts, but that it is far faster on lots of small docs vs fewer large docs. Best Tim -Original Message

Re: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-04 Thread Earl Hood
On Tue, Feb 4, 2014 at 12:20 AM, Trejkaz wrote: I'm trying to find a precise and reasonably efficient way to highlight all occurrences of terms in the query, only highlighting fields which match the corresponding fields used in the query. This seems like it would be a fairly common

Re: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-04 Thread Michael Sokolov
On 2/4/14 12:16 PM, Earl Hood wrote: On Tue, Feb 4, 2014 at 12:20 AM, Trejkaz wrote: I'm trying to find a precise and reasonably efficient way to highlight all occurrences of terms in the query, only highlighting fields which ... [snip] I am in a similar situation with a web-based

Re: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-04 Thread Earl Hood
/XmlHighlighter.java I am aware of Lux, but moving to use it would be a major redesign effort for the project I am on, something that likely would not get management approval. BTW, just within the scope of the class you cite, doing a quick look at it, it looks like I may have to modify highlighting code

Re: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-04 Thread Michael Sokolov
have to modify highlighting code behavior to support how the project I am transforms the XML data. Example: we deal with attribute data that gets transformed to render content in the HTML served to the client, and the highlighting code cited does not appear to handle XML attributes. There are other

Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-03 Thread Trejkaz
Hi all. I'm trying to find a precise and reasonably efficient way to highlight all occurrences of terms in the query, only highlighting fields which match the corresponding fields used in the query. This seems like it would be a fairly common requirement in applications. We have an existing

Highlighting phrases

2013-11-27 Thread Scott Smith
I'm doing some highlighting with the following code fragment: formatter = new SimpleHTMLFormatter("<b>", "</b>"); Scorer score = new QueryScorer(myQuery); ht = new Highlighter(formatter, score); ht.setTextFragmenter(new NullFragmenter

RE: Highlighting phrases

2013-11-27 Thread Scott Smith
Never mind. I figured it out. Thanks anyway. -Original Message- From: Scott Smith [mailto:ssm...@mainstreamdata.com] Sent: Wednesday, November 27, 2013 9:27 AM To: java-user@lucene.apache.org Subject: Highlighting phrases I'm doing some highlighting with the following code fragment

Re: Regarding Lucene Highlighting feature.

2013-07-10 Thread VIGNESH S
Hi Robert, Thanks for the reply. My actual use case is to highlight the first occurrence of the search word in the sentence in which it occurred. In my case, I do not have access to the original documents. I am looking for the best way to reduce the index disk space. I tried SimpleHighlighter

Re: Regarding Lucene Highlighting feature.

2013-07-05 Thread VIGNESH S
: Hi, Is it mandatory to use Store.YES when using the highlighting feature? Is it possible to use highlighting without Store.YES while indexing, because it almost doubles the index size? Please kindly help. -- Thanks and Regards Vignesh Srinivasan 9739135640 -- Thanks

Re: Regarding Lucene Highlighting feature.

2013-07-05 Thread Roberto Ragusa
On 07/05/2013 01:27 PM, VIGNESH S wrote: Hi, I think using CompressingStoredFieldsFormat Feature introduced in Lucene 4.1 may help reduce index size. Any other comments and suggestions are welcome in this topic.. Do you have access to the original documents, outside Lucene? If so, you

Regarding Lucene Highlighting feature.

2013-07-04 Thread VIGNESH S
Hi, Is it mandatory to use Store.YES when using the highlighting feature? Is it possible to use highlighting without Store.YES while indexing, because it almost doubles the index size? Please kindly help. -- Thanks and Regards Vignesh Srinivasan 9739135640

highlighting component to searchComponent

2013-07-01 Thread Adrien RUFFIE
Hello all I had the following configuration in my solrconfig.xml : <!-- highlighting --> <!--<fragmenter name="gap" class="org.apache.solr.highlight.GapFragmenter" default="true"> --> <!--<lst name="defaults"> --> <!-- <int name="hl.fragsize">100</int> --> <!--</lst> --> <!--</fragmenter>

Re: highlighting component to searchComponent

2013-07-01 Thread Jack Krupansky
Try asking your question on the “Solr user” email list – this is the Lucene user list! -- Jack Krupansky From: Adrien RUFFIE Sent: Monday, July 01, 2013 4:36 AM To: java-user@lucene.apache.org Subject: highlighting component to searchComponent Hello all I had the following configuration

Re: Highlighting search words in full document

2013-04-08 Thread Darren Hoffman
Thanks, Erick. I'll try that. Darren On 2013-04-07 3:25 PM, Erick Erickson erickerick...@gmail.com wrote: Well, at that point you have a doc ID presumably. When you format your responses to the initial query, the link you provide for each verse is something like

Re: Highlighting search words in full document

2013-04-07 Thread Erick Erickson
Sounds like what you want to do is 1 with each verse, store the chapter ID. This could be the ID of another document. There's no requirement that all docs in an index have the same structure. In this case, you could have a type field in each doc with values like verse and chapter. For your verse

Re: Highlighting search words in full document

2013-04-07 Thread Darren Hoffman
Thanks for the response, Erick. I have implemented just what you have described. The question I have is how to highlight the searched words in the entire chapter that were highlighted in the selected verse. Thanks! Sent from my iPhone On Apr 7, 2013, at 5:38 AM, Erick Erickson

Re: Highlighting search words in full document

2013-04-07 Thread Erick Erickson
Well, at that point you have a doc ID presumably. When you format your responses to the initial query, the link you provide for each verse is something like yourserver/solr/collection1/select?q=id:chapter_id&hl=true&hl.fl=fullchaptertext&hl.q=original query. So when the user clicks on it, you get a
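
The link described above could be assembled along these lines. This is only a sketch: the parameter names (`q`, `hl`, `hl.fl`, `hl.q`), the `fullchaptertext` field and the `yourserver/solr/collection1` base come from the message, while the class name, the method name and the sample chapter ID are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class ChapterHighlightUrlSketch {

    /** Builds a select URL that fetches one chapter and highlights the user's original query in it. */
    static String chapterUrl(String base, String chapterId, String originalQuery) {
        return base + "/select"
                + "?q=" + enc("id:" + chapterId)   // fetch the chapter document by its ID
                + "&hl=true"                       // switch highlighting on
                + "&hl.fl=" + enc("fullchaptertext")
                + "&hl.q=" + enc(originalQuery);   // highlight the original search, not the ID lookup
    }

    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);    // UTF-8 is always available on the JVM
        }
    }

    public static void main(String[] args) {
        System.out.println(chapterUrl("http://yourserver/solr/collection1", "gen_1", "in the beginning"));
    }
}
```

Encoding each value separately keeps characters such as `:` or spaces in the user's query from breaking the parameter boundaries.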

Highlighting search words in full document

2013-04-06 Thread Darren Hoffman
I am creating a Bible search app that indexes each verse of the bible as a separate document. When a user selects a verse from search results, I am wanting to show an entire chapter of the Bible with the search words highlighted. I'm using the FastVectorHighlighter and would like to know the best

Re: Highlighting and InvalidTokenOffsetsException in Lucene 4.0

2012-11-28 Thread nbuso
Scott Smith ssmith at mainstreamdata.com writes: I'm migrating code from Lucene 3.5 to 4.0. I have the following code which is supposed to highlight text. I get the exception InvalidTokenOffsetsException. I have no idea what that means. I am using a custom analyzer which seems to work

Re: Highlighting html pages

2012-11-06 Thread Steve Rowe
Hi Scott, HTMLStripCharFilter doesn't require that its input be valid HTML - there is no assumption of balanced tags. Also, highlighted sections could span tags, e.g. if you highlight this phrase, and the original HTML looks like: … this<span>phrase</span> … the highlighting code would

Re: Highlighting html pages

2012-11-06 Thread Michael Sokolov
: … this<span>phrase</span> … the highlighting code would have to know to put multiple tags to avoid non-wellformedness, maybe something like: … <b>this</b><span><b>phrase</b></span> … If you do develop a solution here, it would be great if you could share it with the community. Also, I think it would be useful
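
A minimal sketch of the "multiple tags" idea follows, assuming the highlight is given as a [start, end) character range in the original HTML (the kind of offsets a char filter with offset correction would report). The class and method names are hypothetical, and the tag detection is deliberately naive: it treats everything between `<` and `>` as markup and does not handle comments, scripts, or `>` inside attribute values.

```java
final class TagAwareHighlightSketch {

    /** Wraps text inside [start, end) in <b> tags, closing before and reopening after any markup. */
    static String highlight(String html, int start, int end) {
        StringBuilder out = new StringBuilder();
        boolean inTag = false;   // currently between '<' and '>'
        boolean open = false;    // a <b> emitted by this method is currently open
        for (int i = 0; i < html.length(); i++) {
            char c = html.charAt(i);
            if (c == '<') {
                if (open) { out.append("</b>"); open = false; }  // never let <b> straddle a tag
                inTag = true;
            }
            boolean inRange = i >= start && i < end;
            if (!inTag && inRange && !open) { out.append("<b>"); open = true; }
            if (open && !inRange) { out.append("</b>"); open = false; }
            out.append(c);
            if (c == '>') inTag = false;
        }
        if (open) out.append("</b>");
        return out.toString();
    }

    public static void main(String[] args) {
        // Range 0..16 covers "this", the opening <span> tag and "phrase".
        System.out.println(highlight("this<span>phrase</span>", 0, 16));
    }
}
```

For the input in the message, this prints `<b>this</b><span><b>phrase</b></span>`, so the result stays well-formed even though the highlighted phrase crosses a tag boundary.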

Re: Highlighting html pages

2012-11-05 Thread Michael Sokolov
after I've stripped the HTML. Suggestions? Scott -Original Message- From: Michael Sokolov [mailto:soko...@ifactory.com] Sent: Tuesday, October 23, 2012 9:04 PM To: java-user@lucene.apache.org Cc: Scott Smith Subject: Re: Highlighting html pages If you use HTMLStripCharFilter, it extracts

RE: Highlighting html pages

2012-11-05 Thread Scott Smith
to be aware of when using the HTMLStripCharFilter and then highlighting search terms. Assume you strip the html characters with the HTMLStripCharFilter and then use the standard tokenizer. Now you run it through the highlighter. If there were other html tags (besides whatever you are using

RE: Highlighting html pages

2012-11-01 Thread Scott Smith
] Sent: Tuesday, October 23, 2012 9:04 PM To: java-user@lucene.apache.org Cc: Scott Smith Subject: Re: Highlighting html pages If you use HTMLStripCharFilter, it extracts the text only, leaving tags out, and remembering the word positions so that highlighting works properly. Should do exactly what

RE: WFST/Analyzing Suggesters: foreign keys, user-supplied filter, highlighting

2012-10-31 Thread Oliver Christ
Hi, I've added LUCENE-4516 - Suggesters: allow to associate a user-specified key (int) with a string LUCENE-4517 - Suggesters: allow to pass a user-defined predicate/filter to the completion searcher LUCENE-4518 - Suggesters: highlighting (explicit markup of user-typed portions vs. generated

Re: WFST/Analyzing Suggesters: foreign keys, user-supplied filter, highlighting

2012-10-31 Thread Michael McCandless
predicate/filter to the completion searcher LUCENE-4518 - Suggesters: highlighting (explicit markup of user-typed portions vs. generated portions in a suggestion) Cheers, Oli -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Tuesday, October 30

WFST/Analyzing Suggesters: foreign keys, user-supplied filter, highlighting

2012-10-30 Thread Oliver Christ
books, but can't maintain separate WFSTs for each subject area. Given some completion candidate (represented by its key), the filter would be called (with the key as a parameter) to determine whether or not the completion candidate should be added to the result queue. * Highlighting

Re: WFST/Analyzing Suggesters: foreign keys, user-supplied filter, highlighting

2012-10-30 Thread Michael McCandless
until it finds topN that your filter accepts? Or alternatively you could pull a big topN and then filter yourself later, but that's less efficient... * Highlighting of the completed portions (i.e. explicit markup of user-provided vs. auto-completed portions of a completion). Hmm could you

Re: WFST/Analyzing Suggesters: foreign keys, user-supplied filter, highlighting

2012-10-30 Thread Dawid Weiss
https://issues.apache.org/jira/browse/LUCENE-4491 ? Could you simply stuff your ISBN onto the end of the suggestion (ie enroll Lucene in Action|1933988177)? Just remember that if your suffixes are unique then you'll be expanding the automaton quite a bit (unique suffix paths). D.
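
The suffix trick mentioned here can be sketched in plain Java as below. The `|` separator and the sample title and ISBN come from the message; the class and method names are hypothetical, and nothing here touches the suggester API itself. It only shows how a key would be packed into the suggestion text before building and unpacked after lookup.

```java
final class SuggestionKeySketch {

    static final char SEP = '|';   // must not occur in the key itself

    /** Packs the key onto the end of the surface form before it is added to the suggester. */
    static String enroll(String title, String isbn) {
        return title + SEP + isbn;
    }

    /** Text to show the user: everything before the last separator. */
    static String title(String suggestion) {
        int i = suggestion.lastIndexOf(SEP);
        return i < 0 ? suggestion : suggestion.substring(0, i);
    }

    /** The key that was packed on, or an empty string if none is present. */
    static String key(String suggestion) {
        int i = suggestion.lastIndexOf(SEP);
        return i < 0 ? "" : suggestion.substring(i + 1);
    }

    public static void main(String[] args) {
        String s = enroll("Lucene in Action", "1933988177");
        System.out.println(title(s) + " -> " + key(s));
    }
}
```

Splitting on the last separator means a title that itself contains `|` still round-trips. The caveat from the message still applies: each unique suffix adds its own path to the automaton.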

Re: WFST/Analyzing Suggesters: foreign keys, user-supplied filter, highlighting

2012-10-30 Thread Michael McCandless
On Tue, Oct 30, 2012 at 3:52 PM, Dawid Weiss dawid.we...@gmail.com wrote: https://issues.apache.org/jira/browse/LUCENE-4491 ? Could you simply stuff your ISBN onto the end of the suggestion (ie enroll Lucene in Action|1933988177)? Just remember that if your suffixes are unique then you'll

Highlighting html pages

2012-10-23 Thread Scott Smith
I need to take an html page that I retrieve from my lucene search and highlight all of the terms that are part of the search. I need to skip over any html tags since I don't want any words in tags which happen to match the search to be highlighted. Note that I don't want sections of the

Re: Highlighting html pages

2012-10-23 Thread Michael Sokolov
If you use HTMLStripCharFilter, it extracts the text only, leaving tags out, and remembering the word positions so that highlighting works properly. Should do exactly what you want out of the box... On 10/23/2012 8:00 PM, Scott Smith wrote: I need to take an html page that I retrieve from

HTML tags and Lucene highlighting

2012-04-05 Thread okayndc
Hello, I currently use Lucene version 3.0...probably need to upgrade to a more current version soon. The problem that I have is when I test search for an HTML tag (ex. strong), Lucene returns the highlighted HTML tag ~ which is what I DO NOT want. Is there a way to filter HTML tags? I have

RE: HTML tags and Lucene highlighting

2012-04-05 Thread Steven A Rowe
Hi okayndc, What *do* you want? Steve -Original Message- From: okayndc [mailto:bodymo...@gmail.com] Sent: Thursday, April 05, 2012 1:34 PM To: java-user@lucene.apache.org Subject: HTML tags and Lucene highlighting Hello, I currently use Lucene version 3.0...probably need to upgrade

Re: HTML tags and Lucene highlighting

2012-04-05 Thread okayndc
okayndc, What *do* you want? Steve -Original Message- From: okayndc [mailto:bodymo...@gmail.com] Sent: Thursday, April 05, 2012 1:34 PM To: java-user@lucene.apache.org Subject: HTML tags and Lucene highlighting Hello, I currently use Lucene version 3.0...probably need to upgrade

RE: HTML tags and Lucene highlighting

2012-04-05 Thread Steven A Rowe
(in the field configured to use HTMLStripCharFilter, anyway). So HTMLStripCharFilter should do what you want. Steve From: okayndc [mailto:bodymo...@gmail.com] Sent: Thursday, April 05, 2012 3:36 PM To: Steven A Rowe Cc: java-user@lucene.apache.org Subject: Re: HTML tags and Lucene highlighting

Re: HTML tags and Lucene highlighting

2012-04-05 Thread okayndc
you want. Steve From: okayndc [mailto:bodymo...@gmail.com] Sent: Thursday, April 05, 2012 3:36 PM To: Steven A Rowe Cc: java-user@lucene.apache.org Subject: Re: HTML tags and Lucene highlighting Hello, I want to ignore HTML tags within a search. ~ I should not be able to search

Re: HTML tags and Lucene highlighting

2012-04-05 Thread Koji Sekiguchi
(12/04/06 2:34), okayndc wrote: Hello, I currently use Lucene version 3.0...probably need to upgrade to a more current version soon. The problem that I have is when I test search for an HTML tag (ex. strong), Lucene returns the highlighted HTML tag ~ which is what I DO NOT want. Is there a

Hit Highlighting which highlighter to use?

2012-04-04 Thread Paul Hill
or span in the text if I added positions and offsets to text? When highlighting the small fields like title, path etc. should I add term vector with positions and offset and use FastVectorHighlighter or is it just not worth storing that extra information just for highlighting? -Paul

Highlighting in Luke?

2012-03-13 Thread Mike O'Leary
I sent this message to the Luke discussion forum, but there isn't a lot of activity there these days, so I thought I would ask my question here too. I was asked if Luke supports highlighting of matched terms in its search results display. I looked through the code, and it doesn't look to me

Highlighting does not work with PayloadTermQueries

2012-02-21 Thread Nitin Arora
Hi, I'm using SOLR and Lucene in my application for search. I'm facing an issue of highlighting using FastVectorHighlighter not working when I use PayloadTermQueries as clauses of a BooleanQuery. After debugging I found that in DefaultSolrHighlighter.java, fvh.getFieldQuery does not return any

Re: Lucene on Android: indexing, searching and highlighting

2011-11-28 Thread Ian Lea
As far as I'm aware recent versions of lucene, including the highlighter, should work out of the box. I'd guess that highlighting would be the most resource intensive and therefore troublesome bit. I'm not aware of any sample code showing lucene working on Android, but from my very limited

Lucene on Android: indexing, searching and highlighting

2011-11-23 Thread Ilya Zavorin
of pointers with each pair showing where a hit starts and ends, or something similar. I will be using Lucene 3.4.0. It also looks like for #4 I will need to use highlighting capabilities. My main question is whether I should expect any performance problems at any of the indexing/searching/highlighting

Need Help for Wild Card Query Highlighting

2011-10-18 Thread Vidya Kanigiluppai Sivasubramanian
Hi, I am new to lucene. I am using lucene 2.4.1 in my project to do a search in a text document. I need to perform a wild card query. I am using the code given in Hrycon - blog. It is working fine with complete words. When we do a wild card query, we can only see the search hits but the text

Re: Need Help for Wild Card Query Highlighting

2011-10-18 Thread Ian Lea
Why 2.4.1? That is ancient and there have been many improvements since then. Google finds hits for lucene highlight wild card some of which contain some solutions some of which may or may not be relevant for your problem. -- Ian. On Tue, Oct 18, 2011 at 8:17 AM, Vidya Kanigiluppai

RE: Need Help for Wild Card Query Highlighting

2011-10-18 Thread Vidya Kanigiluppai Sivasubramanian
Highlighting Why 2.4.1? That is ancient and there have been many improvements since then. Google finds hits for lucene highlight wild card some of which contain some solutions some of which may or may not be relevant for your problem. -- Ian. On Tue, Oct 18, 2011 at 8:17 AM, Vidya Kanigiluppai

Re: highlighting

2011-08-04 Thread govind bhardwaj
could try modifying the query to transport* like I did, but I got some error like this : * MemoryIndex class-not-found error (Exception in thread main java.lang.NoClassDefFoundError: org/apache/lucene/index/memory/MemoryIndex)* Also, regarding highlighting and regular expression, I found this bug (i'm

Re: highlighting

2011-07-18 Thread Sabeer Hussain
I am using Lucene 4.0 and trying to use its highlighting feature. I am not getting the desired result due to some mistake that I am not able to identify. My source code looks like String sourceText = liver disease kidney transplant; String termString =\transplant

Re: highlighting performance

2011-06-22 Thread Itamar Syn-Hershko
a large number of occurrences of a term that is in the query. In one example, I searched for 1, and this occurs just under 2000 times in one of my test documents (as the value of HTML attributes). Admittedly a weird case, but when this happens, the highlighting can take 300x longer than when

Re: highlighting performance

2011-06-21 Thread Michael Sokolov
I did that, and the benchmark indicates FVH is 10x faster than Highlighter now. I ran with a subset of the wikipedia data since I didn't want to deal with the whole thing. I'm trying to reconcile these weirdly varying results. One difference is that the benchmark doesn't use PhraseQueries -

Re: highlighting performance

2011-06-21 Thread Michael Sokolov
case, but when this happens, the highlighting can take 300x longer than when searching for a more distinctive term (like distinctive). I think there may be a problem here in that every term occurrence is compared against every other term occurrence (or every phrase within which the term may

highlighting performance

2011-06-20 Thread Mike Sokolov
Our apps use highlighting, and I expect that highlighting is an expensive operation since it requires processing the text of the documents, but I ran a test and was surprised just how expensive it is. I made a test index with three fields: path, modified, and contents. I made the index using

Re: highlighting performance

2011-06-20 Thread Koji Sekiguchi
://www.rondhuit.com/en/ (11/06/21 5:20), Mike Sokolov wrote: Our apps use highlighting, and I expect that highlighting is an expensive operation since it requires processing the text of the documents, but I ran a test and was surprised just how expensive it is. I made a test index with three fields: path

Re: highlighting performance

2011-06-20 Thread Michael Sokolov
Koji- I'm not familiar with the benchmarking system, but maybe I'll see if I can run that benchmark on my test data as a point of comparison - thanks for the pointer! -Mike On 6/20/2011 8:21 PM, Koji Sekiguchi wrote: Mike, FVH used to be faster for large docs. I wrote FVH section for Lucene

Re: PDF Highlighting using PDF Highlight File

2011-05-12 Thread Wulf Berschin
Well, AFAIS the Lucene Highlighters do not offer this functionality via their API, but could easily do. I think support for highlighting documents would be a very welcome feature. Highlighting HTML documents is already possible with the org.apache.solr.analysis.HTMLStripCharFilter

Re: PDF Highlighting using PDF Highlight File

2011-05-12 Thread Dawn Zoë Raison
On 12/05/2011 15:47, Wulf Berschin wrote: I think support for highlighting documents would be a very welcome feature. Highlighting HTML documents is already possible with the org.apache.solr.analysis.HTMLStripCharFilter and a NullFragmenter, but there seems to be nothing for highlighting PDF

PDF Highlighting using PDF Highlight File

2011-05-10 Thread Wulf Berschin
Hi all, in our Lucene 3.0.3-based web application when a user clicks on a hit link the targeted PDF should be opened in the browser with highlighted hits. For this purpose using the Acrobat Highlight File (Parameter xml, see http://www.pdfbox.org/userguide/highlighting.html and

Re: The MoreLikeThisHandler could include highlighting ?

2011-05-03 Thread Koji Sekiguchi
(11/03/01 21:16), Amel Fraisse wrote: Hello, Could the MoreLikeThisHandler include highlighting? Is it correct to define a MoreLikeThisHandler like this? <requestHandler name="/mlt" class="org.apache.solr.handler.MoreLikeThisHandler"> <lst name="defaults"> <bool

Re: Highlighting a phrase with Single SPAN

2011-04-06 Thread Koji Sekiguchi
(11/04/06 14:01), shrinath.m wrote: If there is a phrase in search, the highlighter highlights every word separately.. Like this : I love Lucene Instead what I want is like this : I love Lucene Not sure my mailer problem or not, I don't see the difference between above two. But reading

Re: Highlighting a phrase with Single SPAN

2011-04-06 Thread shrinath.m
://lucene.472066.n3.nabble.com/Highlighting-a-phrase-with-Single-SPAN-tp2783747p2784321.html To unsubscribe from Highlighting a phrase with Single SPAN, click herehttp://lucene.472066.n3.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_codenode=2783747code

Highlighting a phrase with Single SPAN

2011-04-05 Thread shrinath.m
, is there a way to ask the highlighter to do this ? -- View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-a-phrase-with-Single-SPAN-tp2783747p2783747.html Sent from the Lucene - Java Users mailing list archive at Nabble.com.

The MoreLikeThisHandler could include highlighting ?

2011-03-01 Thread Amel Fraisse
Hello, Could the MoreLikeThisHandler include highlighting? Is it correct to define a MoreLikeThisHandler like this? <requestHandler name="/mlt" class="org.apache.solr.handler.MoreLikeThisHandler"> <lst name="defaults"> <bool name="hl">true</bool> <str

Re: Preserving original HTML file offsets for highlighting

2011-01-26 Thread Karolina Bernat
, 2011 1:45 PM To: java-user@lucene.apache.org Subject: Re: Preserving original HTML file offsets for highlighting Hi Uwe, thanks for this hint. I'm not sure how much of the Solr functionality I need to implement in order to use the HTMLStripCharFilter. I'm using Apache Tika for HTML

Re: Preserving original HTML file offsets for highlighting

2011-01-25 Thread Karolina Bernat
need (highlighting of the hits within HTML files). Thank you so much for your help:-) Karo On Mon, Jan 24, 2011 at 2:03 PM, Karolina Bernat karolina.ber...@googlemail.com wrote: Hi all, I'm new to Lucene and have a question about indexing/highlighting of HTML files with Lucene. What I need

RE: Preserving original HTML file offsets for highlighting

2011-01-25 Thread Uwe Schindler
://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Karolina Bernat [mailto:karolina.ber...@googlemail.com] Sent: Tuesday, January 25, 2011 1:45 PM To: java-user@lucene.apache.org Subject: Re: Preserving original HTML file offsets for highlighting Hi Uwe, thanks

Preserving original HTML file offsets for highlighting

2011-01-24 Thread Karolina Bernat
Hi all, I'm new to Lucene and have a question about indexing/highlighting of HTML files with Lucene. What I need to do is highlight the hits (terms) in the original HTML file (or get the positions of the terms/tokens in the original file). This problem has already been described by Fred Toth

Re: Are there any tokenizers that ignore HTML tags but keep the offsets so they can be used for highlighting in the original document?

2010-06-08 Thread Hans Merkl
Hi Ahmet, I am using Lucene.NET with C# so I can't test this quickly. Will HTMLStripCharFilter maintain the character offsets or does it just extract the plain text? Hans You can use org.apache.solr.analysis.HTMLStripCharFilter. It is possible to add one or more

RE: Are there any tokenizers that ignore HTML tags but keep the offsets so they can be used for highlighting in the original document?

2010-06-08 Thread Uwe Schindler
Hi Ahmet, I am using Lucene.NET with C# so I can't test this quickly. Will HTMLStripCharFilter maintain the character offsets or does it just extract the plain text? Yes the CharFilter does this! Uwe - To unsubscribe,
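
The point confirmed in this reply, that stripping HTML need not lose the original character offsets, can be illustrated with a minimal sketch in plain Java. This is not the HTMLStripCharFilter implementation and does not use the Lucene or Solr API; the class OffsetPreservingStripper and its methods are hypothetical names made up for illustration, and it assumes only simple <...> tags (no entities, comments, or script blocks). It records, for every character kept in the stripped text, where that character sat in the original HTML, so a match found in the plain text can be mapped back to a span in the original document.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not part of Lucene or Solr.
final class OffsetPreservingStripper {
    private final StringBuilder text = new StringBuilder();
    // originalOffsets.get(i) is the index in the HTML of stripped character i.
    private final List<Integer> originalOffsets = new ArrayList<>();

    OffsetPreservingStripper(String html) {
        boolean inTag = false;
        for (int i = 0; i < html.length(); i++) {
            char c = html.charAt(i);
            if (c == '<') {
                inTag = true;               // start skipping a tag
            } else if (c == '>' && inTag) {
                inTag = false;              // tag closed, resume copying
            } else if (!inTag) {
                text.append(c);             // keep the character
                originalOffsets.add(i);     // remember where it came from
            }
        }
    }

    // The plain text with tags removed.
    String text() {
        return text.toString();
    }

    // Maps a [start, end) span in the stripped text to a [start, end) span
    // in the original HTML. The span must be non-empty and within bounds.
    int[] originalSpan(int start, int end) {
        if (start < 0 || end <= start || end > originalOffsets.size()) {
            throw new IllegalArgumentException("invalid span");
        }
        int origStart = originalOffsets.get(start);
        int origEnd = originalOffsets.get(end - 1) + 1; // exclusive end
        return new int[] { origStart, origEnd };
    }
}
```

A caller would search the stripped text for a term, pass the match's start and end to originalSpan, and wrap the returned range of the original HTML in highlight markup. A span that crosses a tag boundary maps to a range that includes the tag, which a real highlighter would need to handle separately.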

Are there any tokenizers that ignore HTML tags but keep the offsets so they can be used for highlighting in the original document?

2010-06-07 Thread Hans Merkl
to write a tokenizer that strips out the HTML tags but maintains the original offsets within the HTML document so they can be used for highlighting the original HTML document, not just the text representation. Does anybody know any tokenizers that can do this? It seems it's something other people
