date:20050414

Highlighter for CJK ??

2005-04-14 Thread Eric Chow

Hello, Is there any good Highlighter for Asian languages (Chinese, Japanese, Korean)? Eric

Re: Highlighter for CJK ??

2005-04-14 Thread Che Dong

Here is a demo: http://grassland.cnblog.org/ Che Dong Eric Chow wrote: Hello, Is there any good Highlighter for Asian languages (Chinese, Japanese, Korean)? Eric

Re: Highlighter for CJK ??

2005-04-14 Thread mark harwood

Hi Eric, I haven't tested it personally, but I have had reports that it works OK with CJKAnalyzer. This was reported after I added support for overlapping tokens in tokenstreams last July. Cheers, Mark --- Eric Chow <[EMAIL PROTECTED]> wrote: > Hello, > > Is there any good Highlighter for Asian

Re: Searching an NTFS File Server

2005-04-14 Thread John Haxby

Maher Martin wrote: * The user's access rights would be read from Active Directory (i.e. Windows group membership, etc.) * On submission of a query to Lucene, the user/group access rights would be appended as required search criteria and Lucene would filter out all results that the user should
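A minimal sketch of the "append access rights as required search criteria" idea quoted above, assuming the user's groups have already been read from Active Directory and that each document carries a hypothetical `acl` field naming the groups allowed to read it. It only builds a query string in Lucene's query syntax, where `+` marks a required clause; the field name, the helper, and the way group names are quoted are illustrative assumptions, not part of Lucene.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: wraps a user query so that at least one of the user's
// groups must match a hypothetical "acl" field on each document.
class AclQueryBuilder {
    // Hypothetical field name; use whatever field holds the allowed groups.
    static final String ACL_FIELD = "acl";

    static String restrict(String userQuery, List<String> groups) {
        if (groups.isEmpty()) {
            throw new IllegalArgumentException("user belongs to no groups");
        }
        StringBuilder acl = new StringBuilder();
        for (String group : groups) {
            if (acl.length() > 0) {
                acl.append(' ');
            }
            // Quote each group so that names containing spaces stay one clause.
            String escaped = group.replace("\\", "\\\\").replace("\"", "\\\"");
            acl.append(ACL_FIELD).append(":\"").append(escaped).append('"');
        }
        // Both parts are required: the user's query AND one of the groups.
        return "+(" + userQuery + ") +(" + acl + ")";
    }

    public static void main(String[] args) {
        System.out.println(restrict("budget report", Arrays.asList("staff", "Domain Users")));
    }
}
```

Whether the quoted group names match depends on how the `acl` field was indexed and analyzed, which this sketch does not address.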

Update performance/indexwriter.delete()?

2005-04-14 Thread Roy Klein

I've got an application that will be doing constant updates to an index. I've looked into batching those updates; however, based on the way the application works, the updates can't be batched. (Well, I figure with a lot of work, I might be able to batch ~10% of the transactions.) Another requiremen

Re: Strange sort error

2005-04-14 Thread Yonik Seeley

I haven't tried it, but I think the fix should be easy... never throw that exception. Either check for null before the loop, or in the loop. Original code for native int sorting: TermEnum termEnum = reader.terms (new Term (field, "")); try { if (termEnum.term() == null)

IOException: No such path or directory

2005-04-14 Thread Luis Medina

Hi Everyone, The company I work for uses Lucene to search two of its sites. Each site's configuration is (almost) a mirror image of the other. The only difference is the content. We use a servlet to start up a Lucene maintenance utility that keeps the indexes up to date. This servlet is set to

RE: Update performance/indexwriter.delete()?

2005-04-14 Thread Peter Veentjer - Anchor Men

-Original message- From: Roy Klein [mailto:[EMAIL PROTECTED] Sent: Thursday 14 April 2005 15:40 To: java-user@lucene.apache.org Subject: Update performance/indexwriter.delete()? >>I've got an application that will be doing >>constant updates to an index. >>I've looked i

getting the number of occurrences within a document

2005-04-14 Thread Pablo Gomes Ludermir

Hello all, I would like to get the following information from the index: 1. Given a term, how many times the term occurs in each document. Something like a triple: <Term, Doc1, Freq>, <Term, Doc2, Freq>, <Term, Doc3, Freq>, ... Is it possible to do that? Regards, Pablo
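To make the requested output concrete, here is a small self-contained sketch that computes such term/document/frequency triples over an in-memory list of documents. It is a plain-Java illustration of the data being asked for, not Lucene code; the whitespace tokenisation, the lower-casing, and the names `TermFreqDemo` and `Posting` are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TermFreqDemo {
    // One <term, doc, freq> triple.
    static final class Posting {
        final String term;
        final int doc;
        final int freq;

        Posting(String term, int doc, int freq) {
            this.term = term;
            this.doc = doc;
            this.freq = freq;
        }

        @Override
        public String toString() {
            return "<" + term + ", Doc" + doc + ", " + freq + ">";
        }
    }

    // Returns one triple per document that contains the term at least once.
    static List<Posting> postings(String term, List<String> docs) {
        String wanted = term.toLowerCase();
        List<Posting> out = new ArrayList<Posting>();
        for (int doc = 0; doc < docs.size(); doc++) {
            int freq = 0;
            // Naive analysis: lower-case and split on whitespace.
            for (String token : docs.get(doc).toLowerCase().split("\\s+")) {
                if (token.equals(wanted)) {
                    freq++;
                }
            }
            if (freq > 0) {
                out.add(new Posting(wanted, doc, freq));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "lucene is fast", "java and more java", "Lucene loves lucene");
        System.out.println(postings("lucene", docs));
    }
}
```

Documents are numbered from 0 here, in list order; an index would assign its own document numbers.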

Re: Strange sort error

2005-04-14 Thread Yonik Seeley

> if (termEnum==null || term.field() != field) break; // CHANGE here Errr, that should be term==null of course. > if (term==null || term.field() != field) break; // CHANGE here And it *may* be slightly speedier to check for null just before the do/while loop instead:
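The two placements of the null check discussed here can be sketched without Lucene on the classpath. `TermCursor` and `Term` below are hypothetical stand-ins for a term enumerator and its terms, where `term()` returns null once nothing is left; they are not Lucene's classes, and `String.equals` is used for the field comparison where the quoted code uses `!=`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SortFieldScan {
    // Hypothetical stand-in for a (field, text) term.
    static final class Term {
        final String field;
        final String text;

        Term(String field, String text) {
            this.field = field;
            this.text = text;
        }
    }

    // Hypothetical stand-in for a term enumerator already positioned on its first term.
    interface TermCursor {
        Term term();      // current term, or null when there is none
        boolean next();   // advance; false when exhausted
    }

    static TermCursor cursorOver(final List<Term> terms) {
        return new TermCursor() {
            private int pos = 0;

            public Term term() {
                return pos < terms.size() ? terms.get(pos) : null;
            }

            public boolean next() {
                pos++;
                return pos < terms.size();
            }
        };
    }

    // Collects the term texts of one field without throwing when the field has no terms.
    static List<String> textsForField(TermCursor cursor, String field) {
        List<String> texts = new ArrayList<String>();
        // Variant 1: check once, just before the do/while loop.
        if (cursor.term() == null) {
            return texts;
        }
        do {
            Term term = cursor.term();
            // Variant 2: check inside the loop; redundant here, shown for comparison.
            if (term == null || !term.field.equals(field)) {
                break;
            }
            texts.add(term.text);
        } while (cursor.next());
        return texts;
    }

    public static void main(String[] args) {
        List<Term> terms = Arrays.asList(
                new Term("price", "10"), new Term("price", "20"), new Term("title", "a"));
        System.out.println(textsForField(cursorOver(terms), "price"));
        System.out.println(textsForField(cursorOver(new ArrayList<Term>()), "price"));
    }
}
```

With either check in place, a field with no terms yields an empty result instead of an exception.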

RE: getting the number of occurrences within a document

2005-04-14 Thread Pasha Bizhan

Hi, > From: Pablo Gomes Ludermir [mailto:[EMAIL PROTECTED] > I would like to get the following information from the index: > > 1. Given a term, how many times the term occurs in each document. > Something like a triple: > <Term, Doc1, Freq>, <Term, Doc2, Freq>, <Term, Doc3, Freq>, ... > > Is it possible to do that? See IndexRead

Re: getting the number of occurrences within a document

2005-04-14 Thread Paul Libbrecht

On 14 Apr 05, at 17:15, Pablo Gomes Ludermir wrote: I would like to get the following information from the index: 1. Given a term, how many times the term occurs in each document. Something like a triple: <Term, Doc1, Freq>, <Term, Doc2, Freq>, <Term, Doc3, Freq>, ... Is it possible to do that? Luke did this to my index with good s

Reverting QueryParser ?

2005-04-14 Thread Paul Libbrecht

Hi, I am currently evaluating the need for an elaborate query data-structure (to be exchanged over XML-RPC) as opposed to working with plain strings. One thing that would heavily vote for strings would be to have query objects returned by Query-parser reconvertible to a string (and bac

Re: getting the number of occurrences within a document

2005-04-14 Thread Andy Roberts

On Thursday 14 Apr 2005 15:15, Pablo Gomes Ludermir wrote: > Hello all, > > I would like to get the following information from the index: > > 1. Given a term, how many times the term occurs in each document. > Something like a triple: > <Term, Doc1, Freq>, <Term, Doc2, Freq>, <Term, Doc3, Freq>, ... > > Is it possible to do that? > >

Re: Hungarian notation analyzer and phrase queries

2005-04-14 Thread Doug Cutting

Paul Smith wrote: So it sounds like there isn't a perfect solution, but I think the best tradeoff for me is to put them all in the same position unless anyone has more input on the subject? If they're all at the same position you can still use slop to match the phrase. So if 'power', 'query'

Re: Update performance/indexwriter.delete()?

2005-04-14 Thread Doug Cutting

Roy Klein wrote: So one thing I've been wondering: Why do you need to do deletes from an indexreader? Is this not in the FAQ? It should be... IndexWriter can only append documents to an index. An IndexReader is required to, given a term, find the document number to mark deleted. Also, in the cu

Re: Reverting QueryParser ?

2005-04-14 Thread Erik Hatcher

On Apr 14, 2005, at 11:32 AM, Paul Libbrecht wrote: Hi, I am currently evaluating the need for an elaborate query data-structure (to be exchanged over XML-RPC) as opposed to working with plain strings. One thing that would heavily vote for strings would be to have query objects returne

Re: Update performance/indexwriter.delete()?

2005-04-14 Thread Yonik Seeley

> An IndexReader is required to, given a term, find the document number to > mark deleted. Yeah, most of the time it makes sense to do deletions off the IndexReader. There are times, however, when it would be nice for deletes to be able to be concurrent with adds. Q: can docids change after an add(

Re: Reverting QueryParser ?

2005-04-14 Thread Doug Cutting

Paul Libbrecht wrote: I am currently evaluating the need for an elaborate query data-structure (to be exchanged over XML-RPC) as opposed to working with plain strings. I'd opt for both. For example: "java based" -coffee site apache.org d

Re: Update performance/indexwriter.delete()?

2005-04-14 Thread Doug Cutting

Yonik Seeley wrote: There are times, however, when it would be nice for deletes to be able to be concurrent with adds. It would also be nice if good coffee was free. Q: can docids change after an add() (with merging segments going on behind the scenes) or is optimize() the only call that ends up ch

Re: Reverting QueryParser ?

2005-04-14 Thread Pierrick Brihaye

Hi, Erik Hatcher wrote: No, this hasn't been done except for the basic Query.toString() output which for the most part is parsable again. The question is, what do you do about the analysis process? It's a one-way transformation - and parsing again may not yield the same query. We (the SDX de

RE: Update performance/indexwriter.delete()?

2005-04-14 Thread Roy Klein

Hi, I guess I didn't ask my question very well. I do understand that you can only do a delete via a reader based on the current sources; what I don't understand is why the delete function couldn't be incorporated into a writer, so that updates could all be done within the context of a writer. Fo

Re: Update performance/indexwriter.delete()?

2005-04-14 Thread Doug Cutting

Roy Klein wrote: I think this is a better way of asking my original questions: "Why was this designed this way?" In order to optimize updates. "Can it be changed to optimize updates?" Updates are fastest when additions and deletions are separately batched. That is the design. Doug

Re: IOException: No such path or directory

2005-04-14 Thread Daniel Naber

On Thursday 14 April 2005 16:44, Luis Medina wrote: > primarily reporting lock issues (except no lock files > were found in the directory). With "that directory", do you mean the index directory? The lock files are not there, but in /tmp (by default). It's only okay to remove the lock file manu

Boosting not working?

2005-04-14 Thread Martin May

I have a bunch of documents in my index, some of which have values for a certain field while others don't. I'd like the ones that do have a value to always show up before the ones that don't when sorting by relevance. I tried to accomplish this by checking whether there are values for the field, and

Re: Boosting not working?

2005-04-14 Thread Otis Gospodnetic

I'd look at the output of Explain to see how the ranking score is calculated. Look at this: http://lucenebook.com/search?query=explain (hit #1 is from a free chapter) Otis --- Martin May <[EMAIL PROTECTED]> wrote: > > I have a bunch of documents in my index, some of which have values > for a > certa

Re: Boosting not working?

2005-04-14 Thread Martin May

I've got the book (which is great, btw). I used Luke to get explanations of the results, but I don't see any boosts in the explanations. Martin On Thu, 2005-04-14 at 13:24 -0700, Otis Gospodnetic wrote: > I'd look at the output of Explain to see how the ranking score is calculated > > Look at this:

Re: Boosting not working?

2005-04-14 Thread Erik Hatcher

On Apr 14, 2005, at 4:32 PM, Martin May wrote: I've got the book (which is great, btw). I used Luke to get explanations of the results, but I don't see any boosts in the explanations. The index-time boosts are folded into the field normalization factor, so you won't see boost by itself. That fie

Re: Strange sort error

2005-04-14 Thread Daniel Naber

On Thursday 14 April 2005 16:28, Yonik Seeley wrote: > I haven't tried it, but I think the fix should be easy... never throw > that exception. As Lucene does not have the concept of a "warning" I think it should throw exceptions when someone tries to do something that doesn't make sense (even i

Re: Strange sort error

2005-04-14 Thread Yonik Seeley

Hmmm, that's a great lucene architecture question. Should one be allowed to sort on a field that doesn't exist? One *can* query on fields that don't exist (and that's correct in my view). The thing is, lucene field creation is lazy... just because the field doesn't exist now doesn't mean that it w

Re: Strange sort error

2005-04-14 Thread Yonik Seeley

Also, it's more flexible. You can easily implement stricter checking on top of a "lax" model (use a term enumerator to see if the field exists before you call search), but not vice versa. -Yonik On 4/14/05, Yonik Seeley <[EMAIL PROTECTED]> wrote: > Hmmm, that's a great lucene architecture questi

Zilverline Search Engine version 1.2.0 released

2005-04-14 Thread Zilverline info

All, I've just released Zilverline version 1.2.0. This version is fully web-based: all settings, collections, and preferences can be set via the web interface. You don't need to edit any config files anymore. Also I'm adding PowerPoint and Excel Extractors. The source will be made available as well ve

Re: Strange sort error

2005-04-14 Thread Chris Hostetter

: one is sorting on doesn't even have to exist in all the documents. I : think it would be even more confusing for an invalid query suddenly : becoming a valid query in the future just because someone added a doc Or worse, a query that does work today stops working tomorrow because one doc was r

RE: Update performance/indexwriter.delete()?

2005-04-14 Thread Chris Hostetter

You mentioned before that you can't "batch" your updates ... I can understand not being able to batch updates by number of updates -- but why can't you batch by time? It may sound bad to only process updates once an hour, or once every half hour, or once every 5 minutes, or even once every 30 sec
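One way to read the batch-by-time suggestion, combined with the earlier point in this thread that updates are fastest when deletions and additions are batched separately: queue updates as they arrive and apply them on a fixed interval, all deletes first, then all adds. The sketch below is plain Java with a `Map` standing in for the index; `TimedUpdateBatcher`, its method names, and the interval are illustrative assumptions, not Lucene API.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class TimedUpdateBatcher {
    private final Map<String, String> index;   // stand-in for the real index: id -> document
    private final Map<String, String> pending = new LinkedHashMap<String, String>();
    private final long intervalMillis;
    private long lastFlushMillis;

    TimedUpdateBatcher(Map<String, String> index, long intervalMillis, long nowMillis) {
        this.index = index;
        this.intervalMillis = intervalMillis;
        this.lastFlushMillis = nowMillis;
    }

    // Queue an update; a later update to the same id replaces the earlier one.
    // The batch is applied once the interval has elapsed since the last flush.
    void update(String id, String doc, long nowMillis) {
        pending.put(id, doc);
        if (nowMillis - lastFlushMillis >= intervalMillis) {
            flush(nowMillis);
        }
    }

    // Apply the whole batch: every delete first, then every add.
    int flush(long nowMillis) {
        for (String id : pending.keySet()) {
            index.remove(id);          // delete phase
        }
        index.putAll(pending);         // add phase
        int applied = pending.size();
        pending.clear();
        lastFlushMillis = nowMillis;
        return applied;
    }

    public static void main(String[] args) {
        Map<String, String> index = new HashMap<String, String>();
        index.put("1", "old");
        TimedUpdateBatcher batcher = new TimedUpdateBatcher(index, 30000L, 0L);
        batcher.update("1", "v1", 1000L);    // queued only
        batcher.update("1", "v2", 2000L);    // replaces the queued v1
        batcher.update("2", "x", 31000L);    // interval elapsed: batch is applied
        System.out.println(index);
    }
}
```

The current time is passed in explicitly so the behaviour is easy to test; a real application would more likely drive `flush` from a timer so that a quiet period does not leave updates queued indefinitely.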

Atomic updates on Lucene index document?

2005-04-14 Thread Terence Lai

Hi all, As far as I know, there isn't any Lucene API for updating an index document. What I have to do is delete the existing index document and insert a new one. However, this is going to be 2 separate operations (delete and insert). If the first operation succeeds while the second operatio

35 matches