from:"Erik Hatcher"

Re: Lucene Search Applet

2004-08-23 Thread Erik Hatcher

On Aug 23, 2004, at 11:36 AM, Stephane James Vaucher wrote: Should this property be changed in the next major release of lucene to org.apache...disableLuceneLocks? Yes, that makes sense to put an org.apache.lucene prefix. If that is the case, it should be changed to disableLocks - no point in

Re: lucene and ejb applications

2004-08-20 Thread Erik Hatcher

What would be the best way? Use Lucene outside of EJB. It's quite silly to make such a decision purely due to a policy decision when the technicalities of it show that it is an unwise decision. You're going to navigate Hits through a session bean? And as you said, the EJB spec says not to

Re: lucene and ejb applications

2004-08-20 Thread Erik Hatcher

On Aug 20, 2004, at 7:54 AM, Rupinder Singh Mazara wrote: hi erik thanks for the warning and the code. Let me re-phrase the question, i have a index generated by lucene, i need to have the search capabilty to have a high availabilty. What solutions would be the most optimal I'm guessing from

Re: Debian build problem with 1.4.1

2004-08-20 Thread Erik Hatcher

On Aug 20, 2004, at 11:12 AM, Jeff Breidenbach wrote: Hi Otis, I'm asking, because it looks like your compiler is not finding Reader and IOException classes, both of which are in java.io.* package, which I see imported in StandardTokenizer.java as 'import java.io.*;'. In my copy of

Re: Debian build problem with 1.4.1

2004-08-20 Thread Erik Hatcher

On Aug 20, 2004, at 12:36 PM, Jeff Breidenbach wrote: I don't understand this. StandardTokenizer.java hasn't changed since last year. I have packaged Lucene such that 'ant javacc' is called at package build time. I now see the problem - 'import java.io.*;' has been removed from

Re: Custom filter

2004-08-20 Thread Erik Hatcher

Have you considered using the built-in QueryFilter for this? Why isn't it sufficient for your needs? Erik On Aug 20, 2004, at 6:32 PM, [EMAIL PROTECTED] wrote: Hi guys! I was hoping someone here could help me out with a custom filter. We have an index of emails and do some searches on

Re: Custom filter

2004-08-20 Thread Erik Hatcher

On Aug 20, 2004, at 6:48 PM, [EMAIL PROTECTED] wrote: We're currently in lucene 1.2... haven't moved to 1.3 yet. Skip 1.3 and go straight to 1.4.1 :) Upgrade - why not? Erik - To unsubscribe, e-mail: [EMAIL PROTECTED] For

Re: AnalyZer HELP Please

2004-08-18 Thread Erik Hatcher

? I'll leave that as a rhetorical question for now :) Erik Thx Karthik -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 17, 2004 7:35 PM To: Lucene Users List Subject: Re: AnalyZer HELP Please On Aug 17, 2004, at 9:47 AM, Karthik N S wrote: I did

Re: Restoring a corrupt index

2004-08-18 Thread Erik Hatcher

The details of the segments file (and all the others) is freely available here: http://jakarta.apache.org/lucene/docs/fileformats.html Also, there is Java code in Lucene, of course, that manipulates the segments file which could be leveraged (although probably package scoped and not

Re: AnalyZer HELP Please

2004-08-18 Thread Erik Hatcher

in a phrase query Tate p.s. Um... did you say that was a rhetorical question? ;-) -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Wednesday, August 18, 2004 6:17 AM To: Lucene Users List Subject: Re: AnalyZer HELP Please On Aug 18, 2004, at 3:41 AM, Karthik N S wrote: Hi

Re: AnalyZer HELP Please

2004-08-18 Thread Erik Hatcher

across that way before. Erik T -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Wednesday, August 18, 2004 2:00 PM To: Lucene Users List Subject: Re: AnalyZer HELP Please Thanks for doing the legwork. My favorite example is to be or not to be with and without

Re: What's the return order when the scores for two doc are exactly the same

2004-08-18 Thread Erik Hatcher

The index order is the secondary sort order. You can change this by using the new sorting facility if desired. Erik On Aug 18, 2004, at 2:24 PM, Ching-Pei Hsing wrote: Hi, What is the order returned by Lucene when the scores for two result documents are exactly the same? I know this
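
The tie-breaking rule described in this reply (score first, index order second) can be sketched in plain Java. This is only an illustration: ScoredDoc and TieBreakDemo are hypothetical stand-ins, not Lucene classes, and the sketch assumes "index order" means ascending document number.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class TieBreakDemo {
    // Hypothetical stand-in for a search hit; not part of Lucene.
    static final class ScoredDoc {
        final int docId;   // assumed to reflect the document's position in the index
        final float score; // relevance score

        ScoredDoc(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    // Highest score first; equal scores fall back to ascending index order.
    static final Comparator<ScoredDoc> BY_SCORE_THEN_INDEX_ORDER = (a, b) -> {
        int byScore = Float.compare(b.score, a.score);
        return byScore != 0 ? byScore : Integer.compare(a.docId, b.docId);
    };

    // Returns the document ids in ranked order without modifying the input list.
    static List<Integer> rankedIds(List<ScoredDoc> hits) {
        List<ScoredDoc> sorted = new ArrayList<>(hits);
        sorted.sort(BY_SCORE_THEN_INDEX_ORDER);
        List<Integer> ids = new ArrayList<>();
        for (ScoredDoc d : sorted) {
            ids.add(d.docId);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<ScoredDoc> hits = Arrays.asList(
            new ScoredDoc(7, 0.5f), new ScoredDoc(2, 0.5f), new ScoredDoc(5, 0.9f));
        System.out.println(rankedIds(hits)); // prints [5, 2, 7]
    }
}
```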

Re: http AND halt

2004-08-17 Thread Erik Hatcher

What Analyzer is being used? If it is removing stop words, what is the stop word list? Erik On Aug 17, 2004, at 1:56 AM, Leos Literak wrote: One user reported, that if he searches http AND halt, the search fails. This can be found in logs: java.lang.ArrayIndexOutOfBoundsException: -1

Re: AnalyZer HELP Please

2004-08-17 Thread Erik Hatcher

This is what analyzers do. I don't know of any analyzer that deals with quotes in the way you're requesting, by keeping the contents together as a complete token. You'll have to write your own variant that does this. QueryParser, however, uses quotes to denote a phrase query, and will query

Re: AnalyZer HELP Please

2004-08-17 Thread Erik Hatcher

) should return me hits for the full word, but it did not. So when I did a quick run on Analyzer process and found that it was splitting the Word New Year = [New] [Year] Am I doing some thing wrong in here Thx in advance. Karthik -Original Message- From: Erik Hatcher

Re: AnalyZer HELP Please

2004-08-17 Thread Erik Hatcher

On Aug 17, 2004, at 9:23 AM, Karthik N S wrote: So when I did a quick run on Analyzer process and found that it was splitting the Word New Year = [New] [Year] Am I doing some thing wrong in here No... this is what this analyzer does. QueryParser does the same thing. The difference

Re: AnalyZer HELP Please

2004-08-17 Thread Erik Hatcher

On Aug 17, 2004, at 9:47 AM, Karthik N S wrote: I did as Erik replied in his mail , and searched for the complete word "New Year" , but the QueryParser Still returns me hit for Year Only. [ The Analyzer I use has 555 English Stop words with new present in it ] No wonder! That's when I

Re: highlight the search word

2004-08-14 Thread Erik Hatcher

On Aug 14, 2004, at 7:10 AM, lingaraju wrote: How to highlight the search word See Highlighter here: http://jakarta.apache.org/lucene/docs/lucene-sandbox/

Re: Rename but not reindex

2004-08-13 Thread Erik Hatcher

You have to re-index. Updating is not currently possible, at least not without really low-level hacks. Erik On Aug 13, 2004, at 8:23 AM, Demetrio Zenti wrote: I apologise if it's a stupid question... I index Document objects having 2 fields: - 1° representing file name. It's code is

Re: Finding All?

2004-08-13 Thread Erik Hatcher

On Aug 13, 2004, at 4:01 PM, [EMAIL PROTECTED] wrote: A ranged query that covers the full range does the same thing. Of course it is also inefficient with term generation: myField[a TO z] Note that this won't work if you have more than 1024 matching terms, which is a quite likely scenario. The

Re: wildcard uppercase

2004-08-12 Thread Erik Hatcher

Query.toString() is your friend! As well as troubleshooting without QueryParser in the picture too. But, Daniel to the rescue :) Erik On Aug 12, 2004, at 5:06 PM, Otis Gospodnetic wrote: My guess would be 'something in the QueryParser', but I don't know for sure. Erik will know

Re: Searching without a specified field

2004-08-11 Thread Erik Hatcher

I suggest you aggregate all the text you want searchable into a single field during indexing. Then search that field at query time instead. The alternative is to build up a (potentially huge) BooleanQuery using that string for each field. The MultiFieldQueryParser can do this, but it's not

OSCOM talks

2004-08-10 Thread Erik Hatcher

My Tapestry and Lucene talks have been accepted for the upcoming OSCOM conference in Zurich. http://www.oscom.org/events/oscom4/ I look forward to meeting some of the European Apache contingency! Erik

Re: Negative Boost

2004-08-04 Thread Erik Hatcher

On Aug 4, 2004, at 7:19 AM, Terry Steichen wrote: I can't get negative boosts to work with QueryParser. Is it possible to do so? Closer inspection on the parsing: <Boost> TOKEN : { <NUMBER: (<_NUM_CHAR>)+ ( "." (<_NUM_CHAR>)+ )? > : DEFAULT } where <#_NUM_CHAR: ["0"-"9"] > So, no, negative boosts don't
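
The NUMBER token quoted in this reply accepts one or more digits with an optional fractional part, and nothing else. A rough way to see why a minus sign is rejected is to restate the token as a regular expression. The sketch below is an approximation of that token only, not the QueryParser itself, and BoostNumberDemo is a made-up name.

```java
import java.util.regex.Pattern;

final class BoostNumberDemo {
    // Regex restatement of the NUMBER token: digits, optionally followed by "." and more digits.
    private static final Pattern NUMBER = Pattern.compile("[0-9]+(\\.[0-9]+)?");

    // True when the whole string would be accepted as a boost value by this pattern.
    static boolean isValidBoost(String text) {
        return NUMBER.matcher(text).matches();
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"2", "0.001", "-1", "1."}) {
            System.out.println(candidate + " -> " + isValidBoost(candidate));
        }
    }
}
```

Because the pattern has no place for a sign character, "-1" fails to match, which is consistent with the answer given in the thread.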

Re: search exception in servlet!Please help me

2004-08-03 Thread Erik Hatcher

Where did you get 'i'? Keep in mind that using Hits.doc(n) intends 'n' to be a document *id*, not the iteration through the Hits collection. This is a very common mistake, and I'm guessing one you've made here. Erik On Aug 3, 2004, at 7:49 PM, xuemei li wrote: Thank you for your

Re: reverse lookup

2004-08-02 Thread Erik Hatcher

On Aug 1, 2004, at 10:25 PM, John Adam wrote: Is there a way to get most significant words of a document if i give a document number. Have a look at the term vector support new in v1.4. For a document number and field name, you get terms and frequencies: TermFreqVector vector =

Re: Proximity searching and phrase

2004-07-30 Thread Erik Hatcher

On Jul 30, 2004, at 7:01 AM, Lucene wrote: I was wondering is there is a way to do proximity searches with phrases eg very good NEAR sometimes. Any help on this would be welcome. You can do this with the new SpanQuery family in v1.4. The example you gave would consist of a SpanTermQuery for

Re: Phrase Query

2004-07-28 Thread Erik Hatcher

On Jul 27, 2004, at 11:42 AM, Hetan Shah wrote: Works for me. Here is what I am striving to achieve. phraseString = request.getParameter(phrase); if (phraseString.length() 0){ phraseQueryString = \+phraseString+(\); phraseQuery = true; queryString = phraseQueryString; }

Re: Progress bar for Lucene

2004-07-28 Thread Erik Hatcher

That'd be a pretty quick progress bar in the searches I've seen 10ms would be barely a blink of an eye. Perhaps we should discuss why your searches are slow enough to warrant a progress bar. But a HitCollector might be the right hook you're looking for. Erik On Jul 28, 2004, at

Re: Time of last insert

2004-07-27 Thread Erik Hatcher

On Jul 27, 2004, at 5:15 AM, Otis Gospodnetic wrote: There is no API for that. Yeah there is! :) IndexReader.lastModified() I borrowed that from LIMO's .jsp page, by the way. Erik

Re: Phrase Query

2004-07-26 Thread Erik Hatcher

Let's turn it around could you send us your code that is not working? Lucene's test cases show PhraseQuery in action, and working. Erik On Jul 26, 2004, at 4:11 PM, Hetan Shah wrote: Hello, Can someone on the mailing list send me a copy of sample code of how to implement the phrase

Re: Can I retrieve token offsets from Hits?

2004-07-21 Thread Erik Hatcher

On Jul 21, 2004, at 6:59 AM, Stepan Mik wrote: It is possible to retrieve tokens offsets (Token.startOffset(), Token.endOffset()) later when document is found and returned in hit collection? No offsets are not stored in the index. In fact, the only place they are currently used is with the

Re: Lucene vs. MySQL Full-Text

2004-07-21 Thread Erik Hatcher

Interestingly (and ironically) enough, the project I'm currently working on requires full-text searching of Word and PDF resumes. SQL Server is already the required database as well, so we are leveraging the full-text indexing capabilities it has. There is a special trick to drop a BLOB into

Re: Extracting Lucene onto Tomcat

2004-07-21 Thread Erik Hatcher

On Jul 21, 2004, at 8:10 AM, Ian McDonnell wrote: Is the package information and import paths ready to deploy on Tomcat server. I tried extracting lucene on the server, but when i compile files, it just throws numerous no class definition errors and errors relating to the package. Huh? Lucene

Re: Weighting database fields

2004-07-21 Thread Erik Hatcher

On Jul 21, 2004, at 10:09 AM, Anson Lau wrote: Apply boost factor to fields when you do a lucene search. Or... set the boost on the Field during indexing. Erik Anson -Original Message- From: John Patterson [mailto:[EMAIL PROTECTED] Sent: Thursday, July 22, 2004 12:07 AM To: [EMAIL

Re: Extracting Lucene onto Tomcat

2004-07-21 Thread Erik Hatcher

to compile any of the source it just throws numerous errors. I've got the classpath set to web-inf/classes. Have i extraced it to the wrong directory? --- Erik Hatcher [EMAIL PROTECTED] wrote: On Jul 21, 2004, at 8:10 AM, Ian McDonnell wrote: Is the package information and import paths ready to deploy

Re: Extracting Lucene onto Tomcat

2004-07-21 Thread Erik Hatcher

On Jul 21, 2004, at 11:19 AM, Ian McDonnell wrote: No sorry i didnt mean that i was trying to extract the jars at all. I meant the extraction of the original lucene source bundle. I have been developing in java for going on 5 years now, but am relatively new to Web Apps. I have some experience

Re: Weighting database fields

2004-07-21 Thread Erik Hatcher

On Jul 21, 2004, at 11:40 AM, Anson Lau wrote: Is there any benefit to set the boost during indexing rather than set it during query? It allows setting each document differently. For example, TheServerSide is using field-level boosts at index time to control ordering by date, such that newer

Re: The indexer

2004-07-20 Thread Erik Hatcher

On Jul 20, 2004, at 8:44 AM, Ian McDonnell wrote: Can Lucenes indexer be used to store info in fields in a mysql db? I'm not quite clear on your question. You want to store a Lucene index (aka Directory) within mysql? Or, you want to index data from your existing mysql database into a Lucene

Re: The indexer

2004-07-20 Thread Erik Hatcher

. Is this possible? Of course. But you'll have to code it. It's only a few lines of code to index a document into a Lucene index, but it is up to you to code those into the appropriate spot in your system (most likely right where you insert into mysql). Erik Ian --- Erik Hatcher [EMAIL PROTECTED

Re: The indexer

2004-07-20 Thread Erik Hatcher

On Jul 20, 2004, at 10:07 AM, Ian McDonnell wrote: As for indexing data from mysql - there have been lots of discussions of that recently, so check the archives. Basically you read the data, and index it with Lucene's API. And you are responsible for keeping it in sync. The problem i am having

Re: lucene cutomized indexing

2004-07-20 Thread Erik Hatcher

On Jul 20, 2004, at 12:12 PM, John Wang wrote: There are few things I want to do to be able to customize lucene: [...] 3) to be able to customize analyzers to add more information to the Token while doing tokenization. I have already provided my opinion on this one - I think it would be fine

Re: lucene cutomized indexing

2004-07-20 Thread Erik Hatcher

On Jul 20, 2004, at 2:10 PM, John Wang wrote: I have already provided my opinion on this one - I think it would be fine to allow Token to be public. I'll let others respond to the additional requests you've made. Great, what processes need to be in place before this gets in the code base? You're

Re: One Field!

2004-07-15 Thread Erik Hatcher

On Jul 14, 2004, at 10:19 PM, Jones G wrote: I have an index with multiple fields. Right now I am using MultiFieldQueryParser to search the fields. This means that if the same term occurs in multiple fields, it will be weighed accordingly. Is there any way to treat all the fields in question as

Re: Searching against Database

2004-07-15 Thread Erik Hatcher

In this situation, you may want to investigate implementing a custom Filter which is user-specific and constrains the search space to only the rows a specific user is allowed to search. Erik On Jul 15, 2004, at 3:04 AM, Sergiu Gordea wrote: Hi again, I'm thinking to get the list of

Re: Search +QueryParser+Score

2004-07-15 Thread Erik Hatcher

Karthik, I have a really hard time following your questions, otherwise I'd chime in on them more often. Your meaning is not often clear. In the case of normalizing the score to 1.0 or less - this is precisely what Hits does for you. I'm not sure what you mean by BEFORE doing QueryParser - a

Re: Search +QueryParser+Score

2004-07-15 Thread Erik Hatcher

... Apologies Let me be more Specific regarding the last mail I would like to get all Hits returned with score = 1.0 ONLY using Query Parser . What are my Options. with regards Karthik -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Thursday, July 15, 2004 4:45 PM

Re: Wildcard search with my own analyzer

2004-07-15 Thread Erik Hatcher

On Jul 15, 2004, at 10:02 AM, Morus Walter wrote: Joel Shellman writes: What do I need to do so that wildcard searching will work on this? I am using the same analyzer for indexing and searching (otherwise the first search wouldn't work either). Check what query is produced

Re: Searching for asterisk in a term

2004-07-07 Thread Erik Hatcher

On Jul 7, 2004, at 3:41 PM, [EMAIL PROTECTED] wrote: Can you recommend an analyzer that doesn't discard '*' or '/'? WhitespaceAnalyzer :) Check the wiki AnalysisParalysis page also. Erik

Re: PhraseQuery with Wildcards?

2004-07-07 Thread Erik Hatcher

On Jul 7, 2004, at 6:24 PM, [EMAIL PROTECTED] wrote: Hi, Is there any way to do a PhraseQuery with Wildcards? No. This very question came up a few days ago. Look at PhrasePrefixQuery - although this will be a bit of effort to expand the terms matching the wildcarded term. I'd like to search

Re: Latest StopAnalyzer.java

2004-07-06 Thread Erik Hatcher

On Jul 6, 2004, at 1:08 AM, Karthik N S wrote: Can SomeBody Tell me Where Can I find Latest copy of StopAnalyzer.java which can be used with Lucene1_4-final, On Lucene-Sandbox I am not able to Find it. [ My Company Prohibits me from using CVS ]

Re: indexing incrementally concurrently

2004-07-05 Thread Erik Hatcher

On Jul 5, 2004, at 9:00 AM, Michael Wechner wrote: If several users are saving documents on the server concurrently and during saving the index shall be updated incrementally ... do I have to make sure that it's going to be threadsave or does Lucene take care of this? Only a single IndexWriter

Re: question on setting boost factor

2004-07-01 Thread Erik Hatcher

On Jun 22, 2004, at 7:30 AM, Anson Lau wrote: Hi guys, Lets say I want to search the term hello world over 3 fields with different boost: ((hello:field1 world:field1)^0.001 (hello:field2 world:field2)^100 (hello:field3 world:field3)^2)) Note I've given field1 a really low boost, a heavy boost

Re: Building query to match a sub-string of a field

2004-07-01 Thread Erik Hatcher

On Jun 29, 2004, at 5:28 PM, Terence Lai wrote: Hi Everyone, I am trying to construct a query which matches a sub-string of a field. As an illustration, I would like to search the following words by using the sub-string test: - test - testing - contest - contestable I realize that Lucene does

Re: how to incorporate french stemmer in snowball?

2004-06-29 Thread Erik Hatcher

On Jun 29, 2004, at 4:23 AM, uddam chukmol wrote: I don't really know how to incorporate the french stemmer in to a snowball analyzer. Any sample code will be admired! Here is a short example from our Lucene in Action book's code: public void testSpanish() throws Exception { Analyzer

Re: PhraseQuery

2004-06-28 Thread Erik Hatcher

On Jun 28, 2004, at 2:59 PM, Hetan Shah wrote: I was wondering if anyone out there had tried the PhraseQuery class and retrieved the results. I'm new to the whole search solution. I have a need to do a exact phrase search. Any code sample would be really appreciated. PhraseQuery query =

Re: Storing dates as millis

2004-06-28 Thread Erik Hatcher

On Jun 28, 2004, at 3:05 PM, Kevin Burton wrote: Just trying to think about the most efficient way to do this. This seems to need a wiki page entry :) ya mean like this? :) http://wiki.apache.org/jakarta-lucene/IndexingDateFields feel free to improve it.

Re: Storing dates as longs

2004-06-28 Thread Erik Hatcher

On Jun 28, 2004, at 11:17 AM, Kevin Burton wrote: Otis Gospodnetic wrote: Hello, The standard answer to this question is: If you don't need your dates to be very precise, trim milliseconds. Trim more (e.g. seconds) if that information is not relevant. So the question is should use store this
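
The advice quoted here (trim milliseconds, or more, when that precision is not needed) can be illustrated with plain Java. This is a sketch under assumptions: the yyyyMMdd and yyyyMMddHHmm layouts, the use of UTC, and the DateKeyDemo name are choices made for the example, not something the thread prescribes.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class DateKeyDemo {
    // Day resolution: every timestamp within the same UTC day maps to one term.
    private static final DateTimeFormatter DAY =
        DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC);

    // Minute resolution: seconds and milliseconds are dropped.
    private static final DateTimeFormatter MINUTE =
        DateTimeFormatter.ofPattern("yyyyMMddHHmm").withZone(ZoneOffset.UTC);

    static String dayKey(long epochMillis) {
        return DAY.format(Instant.ofEpochMilli(epochMillis));
    }

    static String minuteKey(long epochMillis) {
        return MINUTE.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        long timestamp = 1088434800123L; // 2004-06-28 15:00:00.123 UTC
        System.out.println(dayKey(timestamp));    // prints 20040628
        System.out.println(minuteKey(timestamp)); // prints 200406281500
    }
}
```

Coarser keys mean fewer distinct terms for the same set of documents, and fixed-width keys of this shape compare as strings in the same order as the dates they represent.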

Re: QueryParser and Keyword Fields

2004-06-25 Thread Erik Hatcher

On Jun 25, 2004, at 1:41 PM, [EMAIL PROTECTED] wrote: Can anyone give me advice on the best way to not have your keyword fields analyzed by QueryParser? Even though it seems like it would be a common problem, I have read the FAQ, and found this relevant thread with no real answers.

Re: Compiling Lucene on PPC linux

2004-06-24 Thread Erik Hatcher

What is the rmic error that you're getting? I definitely recommend taking this to the ant-user list, but I know there have been some IBM VM issues with Lucene in the past and I'm not sure where they stand now. Erik On Jun 24, 2004, at 7:46 AM, Brian Lee Yung Rowe wrote: Hello, Has

Re: field indexed but not stored

2004-06-24 Thread Erik Hatcher

On Jun 24, 2004, at 2:10 PM, Ryan Sonnek wrote: I'm using lucene-1.4-rc3 and trying to optimize the size of our index and decrease search times. our index has several fields that we need to search and sort by, but only one field that we actually retrieve from the hits document. I tried

JavaOne and Lucene

2004-06-23 Thread Erik Hatcher

I'm presenting Lucene in Action Tuesday morning next week at JavaOne (TS-2994). Any other Luceners going to JavaOne? Erik

Re: Question on how to build a query

2004-06-19 Thread Erik Hatcher

On Jun 19, 2004, at 1:51 AM, Jason St. Louis wrote: I wrote my indexer so that it added each field without tokenizing it: Field fnameField = new Field(fname, fname.toLowerCase(), true, true, false); Field lnameField = new Field(lname, lname.toLowerCase(), true, true, false); Field cityField =

Re: amusing interaction between advanced tokenizers and highlighter package

2004-06-19 Thread Erik Hatcher

On Jun 19, 2004, at 2:29 AM, David Spencer wrote: A naive analyzer would turn something like SyncThreadPool into one token. Mine uses the great Lucene capability of Tokens being able to have a 0 position increment to turn it into the token stream: Sync (incr = 0) Thread (incr = 0) Pool (incr
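
The splitting step David describes (SyncThreadPool into Sync, Thread, Pool) might look like the following in plain Java. This covers only the string splitting at lowercase-to-uppercase boundaries; how the resulting tokens and their position increments are emitted by his analyzer is not shown in the snippet and is not modeled here. CamelCaseSplitDemo is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;

final class CamelCaseSplitDemo {
    // Splits at every point where a lowercase letter is followed by an uppercase letter.
    static List<String> split(String identifier) {
        return Arrays.asList(identifier.split("(?<=[a-z])(?=[A-Z])"));
    }

    public static void main(String[] args) {
        System.out.println(split("SyncThreadPool")); // prints [Sync, Thread, Pool]
    }
}
```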

Re: Lucene index causing Too Many Open Files

2004-06-17 Thread Erik Hatcher

Upgrade to Lucene 1.3 (or greater) and enable the compound file format when you index. This should alleviate this issue. Erik On Jun 17, 2004, at 10:31 AM, Dan Aloma wrote: We've got a lucene search index that we use to search (and delete) a couple thousand items. It runs fine for

Re: Simultaneous Search

2004-06-17 Thread Erik Hatcher

On Jun 17, 2004, at 11:23 AM, Don Vaillancourt wrote: How does Lucene handle simultaneous search requests? Can it or is it up to the programmer to make sure that only one search occurs at a time? Lucene handles concurrent searches, no problem. No need for the programmer to ensure only one

Re: Transform a sentences in Terms

2004-06-17 Thread Erik Hatcher

On Jun 17, 2004, at 11:25 AM, Daniel de Souza Teixeira wrote: Hi folks! I needed something like this... I have a sentence: I love my dog. Someone has a code that I can transform this sentence in Terms to add in a PhraseQuery ? PhraseQuery pq = new PhraseQuery(); while ( sentence doesn't ended )

Re: Simultaneous Search

2004-06-17 Thread Erik Hatcher

On Jun 17, 2004, at 1:25 PM, Don Vaillancourt wrote: But does Lucene handle updating a collection while it is being searched. Yes. An IndexSearcher has a view of the index at the time it was instantiated, so dump that instance and re-instantiate to search the new documents. Erik

Re: RuntimeException: cannot determine sort type!

2004-06-16 Thread Erik Hatcher

On Jun 16, 2004, at 5:33 AM, [EMAIL PROTECTED] wrote: Are you sure every document has a single modified indexed term? What do You call single? It's just one field, defined as keyword, but it content can be the same, because it's a timestamp. Every doc has it, this I garantee. Single means a

Re: RuntimeException: cannot determine sort type!

2004-06-16 Thread Erik Hatcher

On Jun 16, 2004, at 7:25 AM, [EMAIL PROTECTED] wrote: Well, I just didn't want to overload people with too much code. There is an art to providing just enough detail :) doc is created like this (modified get formated with SimpleDateFormat tformat = new SimpleDateFormat (MMddhhmmss) by

Re: How to use PhraseQuery Class?

2004-06-16 Thread Erik Hatcher

On Jun 16, 2004, at 8:45 AM, Daniel de Souza Teixeira wrote: Do I need to use some special kind of Analyzer on the index method ? What analyzer did you use? SimpleAnalyzer lowercases PhraseQuery query = new PhraseQuery(); query.setSlop(2); query.add(new Term(contents, How)); query.add(new

Re: RuntimeException: cannot determine sort type!

2004-06-16 Thread Erik Hatcher

On Jun 16, 2004, at 12:09 PM, [EMAIL PROTECTED] wrote: thank You very much, I tried it and it looked like it really fixed the problem. *whew* If modified is only there for sorting and not for querying, perhaps index it as a Integer.toString or Float.toString instead - this will give you better

Re: Q: how to parse query and add boost a term

2004-06-15 Thread Erik Hatcher

On Jun 15, 2004, at 5:04 AM, Zilverline wrote: Hi, How can I use Lucene's API best to parse a query and then boost the query with the 'title' given the relevant parts of the query (the default Field)? To give you an example: I want to change: 'java method invocation +type:HTML' into

Re: How to use PhraseQuery Class?

2004-06-15 Thread Erik Hatcher

On Jun 15, 2004, at 11:21 AM, Daniel de Souza Teixeira wrote: Hi! I need some information about PhraseQuery Class. How to use this class ? Are you using QueryParser to create the query? If so, have a look at the query parser syntax page in the docs (or the Lucene website) - basically just

Re: date range query problem(Help me)

2004-06-14 Thread Erik Hatcher

If you'd provide a succinct JUnit test case (using RAMDirectory and hard-coded values being indexed) I'd be happy to have a look. As it is, this is too convoluted for me to follow. Erik On Jun 14, 2004, at 12:40 AM, Sumit Mishra wrote: Hi, My requirement to fetch the result with

Re: Open-ended range queries

2004-06-11 Thread Erik Hatcher

On Jun 10, 2004, at 10:37 PM, Terry Steichen wrote: Speaking for myself, only a small number of my code modules currently treat null as the open-ended range query term parameter. If the syntax change from 'null' -- '*' was deemed otherwise desirable and the syntax transition made very clearly,

Re: RE : Analyzers

2004-06-11 Thread Erik Hatcher

On Jun 11, 2004, at 5:30 AM, Rasik Pandey wrote: A CustomAnalyzer in which Tokenizers, TokenFilters (StopFilters, StemFilters, etc.) could be dynamically set at runtime for creating a TokenStream would be nice as well. Has anyone done any research along these lines with respect to the Lucene

Re: hit score in 1.3 vs 1.4

2004-06-11 Thread Erik Hatcher

On Jun 11, 2004, at 5:51 AM, Stefan Groschupf wrote: Hi, I'm having a strange problem until upgrading lucene 1.3 to 1.4 rc4. I'm using a third party component that include the old lucene 1.3 but i need to run the new 1.4 rc 4 in the same vm. So i unpack the component jar, remove all lucene 1.3

Re: date range query problem

2004-06-11 Thread Erik Hatcher

On Jun 11, 2004, at 8:00 AM, Sumit Mishra wrote: Hi, My requirement to fetch the result with in the date range. I am filtering the query like to retrieve the date wichi fall between these two year.. bqr.add(QueryParser.parse ([ + 1978 + TO +2000+],fullhead,new

Re: extensible query parser - Re: Proximity Searches behavior

2004-06-10 Thread Erik Hatcher

On Jun 9, 2004, at 4:39 PM, David Spencer wrote: I like the idea of a flexible run-time grammar, but it sounds too good to be true in a general purpose kinda way. My idea isn't perfect for humans, but at least lets you use queries not hard coded. But in my idealistic view, getting something

Re: extensible query parser - Re: Proximity Searches behavior

2004-06-10 Thread Erik Hatcher

On Jun 10, 2004, at 10:26 AM, Terry Steichen wrote: Erik, When is Lucene in Action scheduled to be out? To add to what Otis said - I've been working feverishly to come to terms with SpanQuery*, the new sorting feature, Highlighter, Nutch analysis and much more. Lucene in Action will be accurate

Re: Supported Languages

2004-06-10 Thread Erik Hatcher

On Jun 10, 2004, at 11:37 AM, Don Vaillancourt wrote: I've noticed from the documentation that Russian and German languages are supported by Lucene, but does Lucene support the french language. Look in Lucene's sandbox for analyzers for all sorts of languages. What is the definition of support in

Re: Another way to handle large numeric range queries

2004-06-10 Thread Erik Hatcher

On Jun 9, 2004, at 1:05 PM, Don Gilbert wrote: I'm particularly interested in the XPath stuff I saw in LGQueryParser. * xpathFieldParse 'xpath' parser: param allfields[], with query or field[] possibly having wild-card notation: *.start annotation.*.text allowing '/' and '.' field

Re: Open-ended range queries

2004-06-10 Thread Erik Hatcher

On Jun 10, 2004, at 4:07 PM, Terry Steichen wrote: Well, I'm using 1.4 RC3 and the null range upper limit works just fine for searches in two of my fields; one is in the form of a cannonical date (eg, 20040610) and the other is in the form of a padded word count (e.g., 01500 for 1500). The
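
The padded word count mentioned here (01500 for 1500) matters because range queries compare terms as strings. A small plain-Java sketch of the idea follows; the width of 5 and the PaddedNumberDemo name are assumptions for illustration.

```java
import java.util.Locale;

final class PaddedNumberDemo {
    // Left-pads a non-negative number with zeros to a fixed width, e.g. 1500 -> "01500".
    static String pad(int value, int width) {
        return String.format(Locale.ROOT, "%0" + width + "d", value);
    }

    public static void main(String[] args) {
        // Unpadded strings sort lexicographically, so "1500" comes before "200".
        System.out.println("1500".compareTo("200") < 0);           // prints true
        // Padded strings sort in numeric order: "00200" comes before "01500".
        System.out.println(pad(200, 5).compareTo(pad(1500, 5)) < 0); // prints true
    }
}
```

The width has to be chosen large enough for the biggest value that will ever be indexed, since changing it later means re-indexing.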

Re: Another way to handle large numeric range queries

2004-06-09 Thread Erik Hatcher

On Jun 8, 2004, at 10:55 PM, Don Gilbert wrote: Find this as part of the 'LuceGene' package for searching genome and bioinformatics databases at http://www.gmod.org/lucegene/ with lucene related source code in cvs here: Nice stuff!

Re: Proximity Searches behavior

2004-06-09 Thread Erik Hatcher

On Jun 9, 2004, at 4:26 AM, gaudinat wrote: What does exactly happen with three words or more when we do a proximity search? such as: lucene jakarta best~10 Is each word can be at a distance of 10 of each others, or is there an other behaviour? The total number of hops to put the words in

Re: Proximity Searches behavior

2004-06-09 Thread Erik Hatcher

On Jun 9, 2004, at 8:53 AM, Terry Steichen wrote: 1) If you set the default slop factor in QueryProcessor to something greater than 1, can you also use wildcards? (I ask that question because, to my understanding, you can't combine the explicit proximity query syntax with wildcards. That is,

Re: Proximity Searches behavior

2004-06-09 Thread Erik Hatcher

On Jun 9, 2004, at 8:53 AM, Terry Steichen wrote: 3) Is there a plan for adding QueryParser support for the SpanQuery family? Another important facet to Terry's question here is what syntax to use to express all various types of queries? I suspect that Google stats show us that most folks

Re: Proximity Searches behavior

2004-06-09 Thread Erik Hatcher

On Jun 9, 2004, at 12:21 PM, David Spencer wrote: show us that most folks query with 1 - 3 words and do not use the any of the advanced features. But with automagic query expansion these things might be done behind the scenes. Nutch, for one, expands simple queries to check against multiple

Re: phrase query not working in boolean clause

2004-06-09 Thread Erik Hatcher

On Jun 9, 2004, at 12:25 PM, Michael Duval wrote: When doing an exact phrase query on the title the expected results are returned: +(title:Mass Asymmetry) after tokenizing/filtering: +title:mass asymmetri returns 20 Hits example hit: Mass asymmetry, equation of state, and nuclear

Re: score and frequency

2004-06-05 Thread Erik Hatcher

On Jun 5, 2004, at 1:13 AM, Niraj Alok wrote: I want all the titles which have both ice and hockey to come above the rest (to have higher scores) Meaning i would wish the results to appear like: ice hockey ice hockey ice hockey winter Olympics: hockey, ice, medallists ice hockey: British Sekonda

Re: using previous results on a new search

2004-06-04 Thread Erik Hatcher

On Jun 4, 2004, at 3:07 AM, Antoine Brun wrote: We are investigating the possibility to insert previous search results to a new query. Does anyone knows if it is possible or if such an evolution is under development I suppose you mean search within search, so that the second search is

Re: Author or SearchBean

2004-06-04 Thread Erik Hatcher

SearchBean should be discussed on this list - no need to contact the original developer directly (in fact, it's a better practice to discuss open source code in the appropriate public forums). Erik On Jun 4, 2004, at 5:56 AM, [EMAIL PROTECTED] wrote: Hi! Where can I get the mail address

Re: why the score is not 1.0?

2004-06-03 Thread Erik Hatcher

Without looking at your code, a good first suggestion is to use IndexSearcher.explain(Query,docId) to see why scores are the way they are. Erik On Jun 3, 2004, at 7:21 AM, uddam chukmol wrote: Dear all, I have another trouble in one of my program using Lucene. I tried to compare the

Re: Writing a stemmer

2004-06-03 Thread Erik Hatcher

On Jun 3, 2004, at 4:09 PM, Musku, Anil (LA) wrote: Can anyone provide some help on writing a stemmer for non-english languages? Have a look at the snowball project in the Lucene sandbox. If its non-European-based languages, I suspect it's quite complex. It's highly language dependent. How

Re: a list of matching search term

2004-06-02 Thread Erik Hatcher

On Jun 1, 2004, at 9:19 PM, Anson Lau wrote: Further to my previous email: The highlighter package should be able to pick up the matching search terms. Can some experienced highlighter package users tell me if I should look down that line? Yes, Highlighter (available in the sandbox) picks out

Re: Range Query Sombody HELP please

2004-06-02 Thread Erik Hatcher

On Jun 2, 2004, at 6:20 AM, Karthik N S wrote: Hey Ype/Erick If you're gonna ask for help, the least ya could do is spell my name correctly :) I still have 3 small Questions. 1)While creating the Range Query Is it possible for Lucene to do somthing similar.. +(button AND shirt)

Re: similarity of two texts

2004-06-02 Thread Erik Hatcher

On Jun 2, 2004, at 1:39 PM, David Spencer wrote: Erik, Could you expand on this just a wee bit, perhaps with an example of how to compute this vector angle? I'm tempted to write the code to see how it works, but FYI this doc seems to nicely explain the concepts:
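
One common way to compute the "vector angle" asked about here is cosine similarity over term frequencies. The sketch below is not taken from the thread: it uses plain maps in place of Lucene term vectors, a naive tokenizer that splits on non-word characters, and raw counts with no weighting, all of which are assumptions made to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class CosineSimilarityDemo {
    // Builds a term -> frequency map using a naive lowercase, split-on-non-word tokenizer.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> freqs = new HashMap<>();
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                freqs.merge(term, 1, Integer::sum);
            }
        }
        return freqs;
    }

    // Euclidean length of a frequency vector.
    static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0;
        for (int frequency : vector.values()) {
            sumOfSquares += (double) frequency * frequency;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Cosine of the angle between two frequency vectors: 1.0 for identical direction, 0.0 for no shared terms.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            Integer other = b.get(entry.getKey());
            if (other != null) {
                dot += entry.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0 || normB == 0) ? 0 : dot / (normA * normB);
    }

    public static void main(String[] args) {
        double similarity = cosine(termFrequencies("ice hockey"), termFrequencies("ice skating"));
        System.out.println(similarity);
    }
}
```

With real term vectors, the maps would be filled from the terms and frequencies the index returns for a document and field rather than from re-tokenized text.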

Re: similarity of two texts

2004-06-01 Thread Erik Hatcher

On May 31, 2004, at 2:17 PM, Stefan Groschupf wrote: Lucene can't help you. What about using term vectors though? I've been able to do rudimentary document similarity calculations using the new support in Lucene 1.4. Search the 'net for more info on term vectors and the formulas needed

Re: similarity of two texts

2004-06-01 Thread Erik Hatcher

On Jun 1, 2004, at 6:06 AM, [EMAIL PROTECTED] wrote: Zitiere Erik Hatcher [EMAIL PROTECTED]: On May 31, 2004, at 2:17 PM, Stefan Groschupf wrote: Lucene can't help you. What about using term vectors though? I've been able to do rudimentary document similarity calculations using the new support
