RE: OutOfMemory example

2004-09-14 Thread Ji Kuhn
The error is thrown at exactly the same point as before. This morning I downloaded Lucene from CVS; the jar is now lucene-1.5-rc1-dev.jar and the JVM is 1.4.2_05-b04, on both Linux and Windows. Jiri. -Original Message- From: Daniel Naber [mailto:[EMAIL PROTECTED] Sent: Monday, September 13, 2004

Re: OutOfMemory example

2004-09-14 Thread Daniel Naber
On Tuesday 14 September 2004 08:32, Ji Kuhn wrote: The error is thrown at exactly the same point as before. This morning I downloaded Lucene from CVS; the jar is now lucene-1.5-rc1-dev.jar and the JVM is 1.4.2_05-b04, on both Linux and Windows. Now I can reproduce the problem. I first tried running the

Search PharseQuery

2004-09-14 Thread Natarajan.T
Hi All, How do I use the PhraseQuery API? Please send me some sample code. (How can I handle "java is platform" as a single word?) Regards, Natarajan.

RE: Search PharseQuery

2004-09-14 Thread Cocula Remi
Use QueryParser. Please take a look at http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html It's pretty clear. -Original Message- From: Natarajan.T [mailto:[EMAIL PROTECTED] Sent: Tuesday, 14 September 2004 11:26 To: 'Lucene Users List' Subject: Search PharseQuery Hi

Re: Search PharseQuery

2004-09-14 Thread sergiu gordea
String queryString = "\"what is java\""; Query q = QueryParser.parse(queryString, field, new StandardAnalyzer()); System.out.println(q.toString()); This is enough to get started; consult the Lucene API for more information. Sergiu Natarajan.T wrote: Hi, Thanks for your mail, that link says only

RE: Search PharseQuery

2004-09-14 Thread Natarajan.T
Hi Sergiu, String queryString = "\"what is java\""; Query q = QueryParser.parse(queryString, field, new StandardAnalyzer()); System.out.println(q.toString()); This is enough to get started; consult the Lucene API for more information. Have you tested the above query? This search keyword is not a

RE: Search PharseQuery

2004-09-14 Thread Honey George
--- Natarajan.T [EMAIL PROTECTED] wrote: I am trying to extend the current behavior. You might have already seen a mail from Cocula Remi on this. Please provide more details of the problem for specific comments - basically the problem you are facing and/or what behavior you are trying to

Document Relevance

2004-09-14 Thread ebrahim . faisal
Hi, I am new to Lucene. Could anyone tell me how to set the relevance order in which the search results are displayed? Are any online examples available on this topic? I welcome your suggestions. Thanks and regards, E.Faisal

RE: Search PharseQuery

2004-09-14 Thread Natarajan.T
Hi, Thanks for your response. For example, the search keyword is like below... Language "what is java" Token 1: language Token 2: "what is java" (like Google) Regards, Natarajan. -Original Message- From: Aad Nales [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 14, 2004 5:19 PM
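The tokenization asked about above can be sketched in plain Java. This is only an illustration of splitting a query into single words and double-quoted phrases; it is not how Lucene's QueryParser is implemented, the class name QueryTokens is hypothetical, and it assumes the phrase is entered in double quotes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: splits a query string into
// lowercase tokens, keeping text inside double quotes together as one token.
final class QueryTokens {

    static List<String> split(String query) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '"') {
                // A quote always ends the token being built and toggles phrase mode.
                flush(current, tokens);
                inQuotes = !inQuotes;
            } else if (Character.isWhitespace(c) && !inQuotes) {
                // Outside a phrase, whitespace separates tokens.
                flush(current, tokens);
            } else {
                current.append(Character.toLowerCase(c));
            }
        }
        flush(current, tokens);
        return tokens;
    }

    // Adds the pending token to the list (if any) and resets the buffer.
    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        // Prints [language, what is java]
        System.out.println(split("Language \"what is java\""));
    }
}
```

In a real application the quoted string would simply be handed to QueryParser, which builds the phrase query itself.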

Indexing object graphs

2004-09-14 Thread Erik Hatcher
Interesting! http://kasparov.skife.org/blog/2004/09/13#lucene-graphs

Re: Search PharseQuery

2004-09-14 Thread sergiu gordea
Natarajan.T wrote: Hi, Thanks for your response. For example, the search keyword is like below... Language "what is java" Token 1: language Token 2: "what is java" (like Google) Regards, Natarajan. Lucene works exactly as you describe above, with a simple correction ... The analyzer has a list of

Re: Addition to contributions page

2004-09-14 Thread Erik Hatcher
Perhaps we should @deprecate the contributions page like we did with the Powered By page, and migrate it to the wiki? Erik On Sep 13, 2004, at 6:50 PM, Daniel Naber wrote: On Friday 10 September 2004 15:48, Chas Emerick wrote: PDFTextStream should be added to the 'Document Converters'

RE: Search PharseQuery

2004-09-14 Thread Natarajan.T
OK, you are correct ... Suppose I type "what java"; how can I handle that? Regards, Natarajan. -Original Message- From: sergiu gordea [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 14, 2004 7:38 PM To: Lucene Users List Subject: Re: Search PharseQuery Natarajan.T wrote: Hi,

RE: ANT +BUILD + LUCENE

2004-09-14 Thread Karthik N S
Hi Erik, 1) Using Ant and build.xml, I want to run org.apache.lucene.demo.IndexFiles to create an index folder. 2) The problem is that the same build.xml has to be used across operating systems to create the index. 3) The path to Lucene1-4-final.jar is in a different directory on each OS...

Re: Search PharseQuery

2004-09-14 Thread sergiu gordea
Natarajan.T wrote: OK, you are correct ... Suppose I type "what java"; how can I handle that? You don't have to handle it; Lucene does it. If you don't like how Lucene handles it, you may extend the functionality. If you use the same analyzer for indexing and searching then you will

Re: ANT +BUILD + LUCENE

2004-09-14 Thread Erik Hatcher
Karthik, You are still being a bit cryptic and making it hard for me to comprehend what the problem is, but here are some general pieces of advice with Ant related to what I think you are doing: * There is no need to use conditional logic to have a different set of properties for different

Help for text based indexing

2004-09-14 Thread mahaveer jain
Hi, I have implemented text-based search using Lucene. It was wonderful playing around with it. Now I want to enhance the application. I have a Root folder, and under that I have many other folders that are group specific, say (group1, group2, .. so on). The Root folder is in

RE: Help for text based indexing

2004-09-14 Thread Cocula Remi
You just have to loop recursively over the C:\tomcat\webapps\Root tree to create your index. Yes, you can index databases; you will just have to write a mechanism that is able to create org.apache.lucene.document.Document objects from the database. For instance : - connect JDBC - run a query for obtaining

RE: Help for text based indexing

2004-09-14 Thread mahaveer jain
I am clear about looping recursively to index all the files under the Root folder. But the problem is what to do if I want to search only in group1 or group2. Is it possible to search only in one of the group folders? Cocula Remi [EMAIL PROTECTED] wrote: You just have to loop recursively over the

RE: Help for text based indexing

2004-09-14 Thread Cocula Remi
Well, you could add a field to each of your Documents whose value would be either group1 or group2. Or you could use the path to your files ... -Original Message- From: mahaveer jain [mailto:[EMAIL PROTECTED] Sent: Tuesday, 14 September 2004 17:49 To: [EMAIL PROTECTED] Subject: RE: Help
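One way to obtain the value for such a group field is to derive it from each file's path relative to the Root folder. The plain-Java sketch below shows only that step; the class name GroupField and the method groupOf are hypothetical, and the returned string would then be stored in a field of the Lucene Document at indexing time.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Lucene: derives a group name from a file path.
final class GroupField {

    // Returns the name of the first folder below root that contains the file,
    // or null if the file is not inside a group folder under root.
    static String groupOf(Path root, Path file) {
        Path r = root.normalize();
        Path f = file.normalize();
        if (!f.startsWith(r)) {
            return null; // file lives outside the Root tree
        }
        Path relative = r.relativize(f);
        // We need at least "<group>/<file name>" below the root.
        if (relative.getNameCount() < 2) {
            return null;
        }
        return relative.getName(0).toString();
    }

    public static void main(String[] args) {
        Path root = Paths.get("Root");
        // Prints Group1
        System.out.println(groupOf(root, Paths.get("Root", "Group1", "docs", "a.txt")));
        // Prints null (file sits directly in Root, not in a group folder)
        System.out.println(groupOf(root, Paths.get("Root", "readme.txt")));
    }
}
```

Storing the group as its own untokenized field keeps the restriction separate from the full path, so a search can be limited to one group without matching on part of a path.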

RE: Help for text based indexing

2004-09-14 Thread mahaveer jain
Well, in my case the path is a Keyword. I had tried that earlier and it does not seem to work in a single index file. Can you explain a bit more about adding group1 and group2? Cocula Remi [EMAIL PROTECTED] wrote: Well, you could add a field to each of your Documents whose value would be

RE: ANT +BUILD + LUCENE

2004-09-14 Thread Gerard Sychay
Hi, I've used the following Ant targets for build scripts that required platform dependent work. In the example here, the property catalina.home is set according to what platform we're running on. You can adapt as needed. <target name="platform" description="Sets properties based on platform"

RE: Help for text based indexing

2004-09-14 Thread Cocula Remi
A keyword is not tokenized; that's why you won't be able to search over a part of it. You should use a Text field instead. About creating a special field : IndexWriter Ir = File f = Document doc = new Document(); if (f.toString().startsWith("C:\\tomcat\\webapps\\Root\\Group1")) {

RE: Help for text based indexing

2004-09-14 Thread mahaveer jain
If I have understood correctly, you mean that the search query has to be Group1 AND Hello (if hello is what I want to search for)? Cocula Remi [EMAIL PROTECTED] wrote: A keyword is not tokenized; that's why you won't be able to search over a part of it. You should use a Text field instead.

PorterStemfilter

2004-09-14 Thread Honey George
Hi, This might be more of a question related to the PorterStemmer algorithm rather than to Lucene, but if anyone has the knowledge please share. I am using the PorterStemFilter that comes with Lucene and it turns out that searching for the word 'printer' does not return a document containing

RE: Help for text based indexing

2004-09-14 Thread Honey George
You could receive the group name as an input from the user and construct a BooleanQuery internally which will query only the group field based on the user input. So the user need not append the group name to the search string. Thanks, George --- mahaveer jain [EMAIL PROTECTED] wrote: If I

Re: PorterStemfilter

2004-09-14 Thread David Spencer
Honey George wrote: Hi, This might be more of a question related to the PorterStemmer algorithm rather than to Lucene, but if anyone has the knowledge please share. You might want to also try the Snowball stemmer: http://jakarta.apache.org/lucene/docs/lucene-sandbox/snowball/ And KStem:

Re: PorterStemfilter

2004-09-14 Thread Pete Lewis
Hi George, There are lots of problems with Porter stemmers; they are not great for English and get worse for other languages. If you look at: http://snowball.tartarus.org/demo.php You'll see the Snowball demo - this is basically another instance of Porter. If you enter print and printer and submit then

Re: PorterStemfilter

2004-09-14 Thread Pete Lewis
Hi David I like KStem more than Porter / Snowball - but still has limitations although performs better as it has a dictionary to augment the rules. Note that KStem will also treat print and printer as two distinct terms, probably treating it as verb and noun respectively. Cheers Pete Lewis

NGramSpeller contribution -- Re: combining open office spellchecker with Lucene

2004-09-14 Thread David Spencer
Andrzej Bialecki wrote: David Spencer wrote: I can/should send the code out. The logic is that for any terms in a query that have zero matches, go thru all the terms(!) and calculate the Levenshtein string distance, and return the best matches. A more intelligent way of doing this is to instead
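The brute-force approach described above (compare a misspelled word against every term and keep the closest ones) can be sketched in plain Java as follows. This is not the code from the thread: the class name BruteForceSpeller is hypothetical, and the term list is passed in as a plain List instead of being read from a Lucene index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the code discussed in the thread.
final class BruteForceSpeller {

    // Classic dynamic-programming Levenshtein distance using two rows.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Scans every term and returns all terms at the smallest distance from word.
    static List<String> suggest(String word, List<String> terms) {
        List<String> best = new ArrayList<>();
        int bestDistance = Integer.MAX_VALUE;
        for (String term : terms) {
            int d = distance(word, term);
            if (d < bestDistance) {
                bestDistance = d;
                best.clear();
            }
            if (d == bestDistance) {
                best.add(term);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("descent", "parser", "recursive", "recursion");
        // Prints [recursive]
        System.out.println(suggest("recursize", terms));
    }
}
```

Each lookup costs one distance computation per term in the index, which is why the thread goes on to look for something cheaper.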

Re: NGramSpeller contribution -- Re: combining open office spellchecker with Lucene

2004-09-14 Thread David Spencer
Tate Avery wrote: I get a NullPointerException shown (via Apache) when I try to access http://www.searchmorph.com/kat/spell.jsp How embarrassing! Sorry! Fixed! T -Original Message- From: David Spencer [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 14, 2004 3:23 PM To: Lucene Users

Re: NGramSpeller contribution -- Re: combining open office spellchecker with Lucene

2004-09-14 Thread Andrzej Bialecki
David Spencer wrote: ...or prepare in advance a fast lookup index - split all existing terms to bi- or trigrams, create a separate lookup index, and then simply for each term ask a phrase query (phrase = all n-grams from an input term), with a slop 0, to get similar existing terms. This should be
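The term-splitting step of this idea can be sketched in plain Java. The sketch only shows how a term breaks into bi- or trigrams and how two terms can be compared by the n-grams they share regardless of position; it does not build a Lucene lookup index or run a phrase query, and the class name NGrams is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not the NGramSpeller contribution itself.
final class NGrams {

    // Returns all substrings of length n, in order of appearance.
    static List<String> of(String term, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= term.length(); i++) {
            grams.add(term.substring(i, i + n));
        }
        return grams;
    }

    // Counts the distinct n-grams two terms have in common, ignoring position.
    static int shared(String a, String b, int n) {
        Set<String> common = new HashSet<>(of(a, n));
        common.retainAll(new HashSet<>(of(b, n)));
        return common.size();
    }

    public static void main(String[] args) {
        // Prints [luc, uce, cen, ene]
        System.out.println(of("lucene", 3));
        // Prints 5 (rec, ecu, cur, urs, rsi)
        System.out.println(shared("recursize", "recursive", 3));
    }
}
```

In the scheme described above, the n-grams of every existing term would be indexed once, so that a misspelled term's n-grams can be looked up instead of scanning all terms.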

Re: NGramSpeller contribution -- Re: combining open office spellchecker with Lucene

2004-09-14 Thread David Spencer
Andrzej Bialecki wrote: David Spencer wrote: ...or prepare in advance a fast lookup index - split all existing terms to bi- or trigrams, create a separate lookup index, and then simply for each term ask a phrase query (phrase = all n-grams from an input term), with a slop 0, to get similar

Re: NGramSpeller contribution -- Re: combining open office spellchecker with Lucene

2004-09-14 Thread Doug Cutting
Andrzej Bialecki wrote: I was wondering about the way you build the n-gram queries. You basically don't care about their position in the input term. Originally I thought about using PhraseQuery with a slop - however, after checking the source of PhraseQuery I realized that this probably

Re: frequent terms - Re: combining open office spellchecker with Lucene

2004-09-14 Thread Doug Cutting
David Spencer wrote: [1] The user enters a query like: recursize descent parser [2] The search code parses this and sees that the 1st word is not a term in the index, but the next 2 are. So it ignores the last 2 terms (descent and parser) and suggests alternatives to recursize...thus if

Hits.doc(x) and range queries

2004-09-14 Thread roy-lucene-user
Hi guys! I've posted previously that Hits.doc(x) was taking a long time. Turns out it has to do with a date range in our query. We usually do date ranges like this: Date:[(lucene date field) - (lucene date field)] Sometimes the begin date is 0 which is what we get from

Re: frequent terms - Re: combining open office spellchecker with Lucene

2004-09-14 Thread David Spencer
Doug Cutting wrote: David Spencer wrote: [1] The user enters a query like: recursize descent parser [2] The search code parses this and sees that the 1st word is not a term in the index, but the next 2 are. So it ignores the last 2 terms (descent and parser) and suggests alternatives to

Re: PorterStemfilter

2004-09-14 Thread Tea Yu
David, I don't want a search for "in print" to return results for "in printer". I would consider that over-stemmed in that case. I am also not that satisfied that "effective" was stemmed to "effect" by Snowball recently. Cheers, Tea. Hi David I like KStem more than Porter / Snowball - but still has

Re: PorterStemfilter

2004-09-14 Thread Honey George
--- Tea Yu [EMAIL PROTECTED] wrote: David, I don't want a search for "in print" to return results for "in printer". I would consider that over-stemmed in that case. Here the "in" won't be considered, as it is a stopword in most of the analyzers. I know it is in StandardAnalyzer. So searching for 'in