Re: Index Rows as Documents? Help me design a solution

2006-07-25 Thread Daniel Naber
On Tuesday 25 July 2006 04:05, Namit Yadav wrote: 1 List SMSIDs of all the SMSes that a phone number had sent (Each SMS message will have a globally unique ID) 2 List SomeData1, SomeData2, SomeData3 and SomeData4 for a given SMSID. How can I do this efficiently? Short answer: use a

Grouping over multiple fields

2006-07-25 Thread Krishnendra Nandi
Hi All, Can anybody help me out on this? I have to search for a particular value over multiple fields and need to know whether grouping is allowed over multiple fields, e.g. some query AND ( AUTHOR_NAME:krish OR EMPLOYEE_NAME:krish ) Introducing parentheses ( is giving me lexical error

Re: Grouping over multiple fields

2006-07-25 Thread Miles Barr
Krishnendra Nandi wrote: Can anybody help me out on this? I have to search for a particular value over multiple fields and need to know whether grouping is allowed over multiple fields, e.g. some query AND ( AUTHOR_NAME:krish OR EMPLOYEE_NAME:krish ) Introducing parentheses ( is giving

Re: dash-words

2006-07-25 Thread karl wettin
On Mon, 2006-07-24 at 21:16 -0400, Yonik Seeley wrote: I can't figure out what the parameters do. ;) Hopefully the wiki link I gave before will explain the parameters. Oh, I so totally missed that. Do you want me to java-doc it up and send you the patch?

IndexReader / IndexWriter Synchronization

2006-07-25 Thread vasu shah
Hi, I went through the IndexModifier class. It says that - Although an instance of this class can be used from more than one thread, you will not get the best performance. You might want to use IndexReader and IndexWriter directly for that (but you will need to care about synchronization

Re: Index Rows as Documents? Help me design a solution

2006-07-25 Thread Erick Erickson
Indexing 1M of logs shouldn't take minutes, so you're probably right. A problem I've seen is opening/indexing/closing your index writer too often. You should do something like... (really bad pseudo code here) IndexWriter IW = new IndexWriter(); for (lots and lots and lots of records) {

Limit number of search results

2006-07-25 Thread headhunter
Hello, I am looking for a way to limit the number of search results I retrieve when searching. I am only interested in (let's say) the first ten hits of a query. Maybe I want to look at hits ten..twenty too, but usually only the first results are important. Right now Lucene searches through

Re: dash-words

2006-07-25 Thread Yonik Seeley
On 7/25/06, karl wettin wrote: On Mon, 2006-07-24 at 21:16 -0400, Yonik Seeley wrote: I can't figure out what the parameters do. ;) Hopefully the wiki link I gave before will explain the parameters. Oh, I so totally missed that. Do you want me to java-doc it up and

Re: Limit number of search results

2006-07-25 Thread Miles Barr
headhunter wrote: I am looking for a way to limit the number of search results I retrieve when searching. I am only interested in (let's say) the first ten hits of a query. Maybe I want to look at hits ten..twenty too, but usually only the first results are important. Right now Lucene

Re: dash-words

2006-07-25 Thread Martin Braun
Hi Yonik, I can't figure out what the parameters do. ;) Yes, it will fail without slop... I don't think there is a practical way around that. I am trying to analyze your WordDelimiterFilter. If I have x-men, after analyzing (with catenateAll) I get this: Analyzing The x-men story

Re: IndexReader / IndexWriter Synchronization

2006-07-25 Thread Michael McCandless
If I use IndexReader and IndexWriter class for inserts/updates, then I need to handle the threading issues myself. Is there any other class (even in nightly build) that I can use without having to take care of synchronization. All this means is your code must ensure only one writer

Article keyword counters

2006-07-25 Thread laszlo sera
Hi all, I need some help from the Lucene experts because I couldn't find the best solution for a problem... The problem: we have article entities which can have multiple keywords: - article #1: keyword #1, keyword#2, keyword#3 - article #2: keyword#2, keyword#3 - article #3: keyword#3 -

Re: dash-words

2006-07-25 Thread Yonik Seeley
On 7/25/06, Martin Braun wrote: Hi Yonik, I can't figure out what the parameters do. ;) Yes, it will fail without slop... I don't think there is a practical way around that. I am trying to analyze your WordDelimiterFilter. If I have x-men, after analyzing (with

Re: IndexReader / IndexWriter Synchronization

2006-07-25 Thread vasu shah
Thanks Mike for the reply. I will look into Lucene in Action. I am not very good at threading. So I was looking if there is any api class (even in nightly builds) on top of the IndexReader/IndexWriter that takes care of concurrency rules. Every developer must be facing this problem

Re: IndexReader / IndexWriter Synchronization

2006-07-25 Thread Michael McCandless
I am not very good at threading. So I was looking if there is any api class (even in nightly builds) on top of the IndexReader/IndexWriter that takes care of concurrency rules. This is exactly why IndexModifier was created (so you wouldn't have to worry about the details of closing/opening

Copying documents

2006-07-25 Thread Mike Streeton
I want to copy a selection of documents from one index to another. I can get the Document objects from the IndexReader and write them to the target index using the IndexWriter. The problem I have is that this loses fields that have not been stored. Is there a way round this? Thanks, Mike

Re: IndexReader / IndexWriter Synchronization

2006-07-25 Thread vasu shah
Thanks Mike. Your explanation was really helpful. I would use the IndexModifier class till the new IndexWriter class comes up. Thanks once again. -Vasu Michael McCandless wrote: I am not very good at threading. So I was looking if there is any api class (even

Re: MultiFieldQueryParser.parse deprecated. What can I use?

2006-07-25 Thread Doron Cohen
(Seems 1.9 javadoc could be just a bit more clear on this.) The following should do the work: QueryParser qp = new MultiFieldQueryParser(fields, analyzer); Query query = qp.parse(qtext); Notice the difference in semantics as explained in the deprecated comment in 1.9. Also see the

Re: Matching accented with non-accented characters

2006-07-25 Thread Steven Rowe
Rajan, Renuka wrote: I am trying to match accented characters with non-accented characters in French/Spanish and other Western European languages. The use case is that the users may type letters without accents in error and we still want to be able to retrieve valid matches. The one idea,

Sorting with Parallelreader fails

2006-07-25 Thread neils
Hi, I have 3 index files which I access through a ParallelReader. When I run a search, everything works fine, but when I want to search and sort by a particular column I get an error. Here is my code: Snip Dim field As SortField = New SortField(Streetname) Dim sortByName As

Re: Grouping over multiple fields

2006-07-25 Thread Doron Cohen
Just realized that the some text part should also be grouped, so checked that this variation also works: qtxt = some text AND ( AUTHOR_NAME:krish OR EMPLOYEE_NAME:krish ); --- field:some +field:text +(AUTHOR_NAME:krish EMPLOYEE_NAME:krish) qtxt = (some text) AND ( AUTHOR_NAME:krish OR

Re: Index Rows as Documents? Help me design a solution

2006-07-25 Thread Erick Erickson
The code looks good, *assuming* that the IndexWriter you pass in isn't closed/opened between files (this would be a problem if you have lots of files to index). I've had the IndexWriter.optimize method take a long time to complete, so I typically don't do this until I'm entirely done...

Re: Copying documents

2006-07-25 Thread Chris Hostetter
: I want to copy a selection of documents from one index to another. I can : get the Document objects from the IndexReader and write them to the : target index using the IndexWriter. The problem I have is this loses : fields that have not been stored, is there a way round this. there is no easy

Re: Article keyword counters

2006-07-25 Thread Chris Hostetter
1) please do not cross post to more than one lucene mailing list. the appropriate place for questions about using the Java Lucene library is java-user 2) if you want the counts of all documents matching each keyword, then the TermEnum.docFreq method can solve all of your problems. if you want
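To make "the counts of all documents matching each keyword" concrete, here is a minimal sketch in plain Java, using the three articles from the original question. It does not call Lucene at all: `KeywordCounts` and `documentFrequencies` are hypothetical names, and the map of articles stands in for an index. It only illustrates the kind of per-keyword document count that the reply says TermEnum.docFreq can supply.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Lucene: counts how many articles carry each keyword.
final class KeywordCounts {
    private KeywordCounts() {}

    static Map<String, Integer> documentFrequencies(Map<String, List<String>> articles) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> keywords : articles.values()) {
            // De-duplicate so an article counts once per keyword, even if it repeats it.
            for (String keyword : new HashSet<>(keywords)) {
                counts.merge(keyword, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Sample data copied from the question: three articles, three keywords.
    static Map<String, List<String>> sampleArticles() {
        Map<String, List<String>> articles = new LinkedHashMap<>();
        articles.put("article #1", new ArrayList<>(Arrays.asList("keyword#1", "keyword#2", "keyword#3")));
        articles.put("article #2", new ArrayList<>(Arrays.asList("keyword#2", "keyword#3")));
        articles.put("article #3", new ArrayList<>(Arrays.asList("keyword#3")));
        return articles;
    }

    public static void main(String[] args) {
        System.out.println(documentFrequencies(sampleArticles()));
    }
}
```

With the sample data, keyword#1 appears in one article, keyword#2 in two and keyword#3 in three.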

Re: Matching accented with non-accented characters

2006-07-25 Thread John Haxby
Rajan, Renuka wrote: I am trying to match accented characters with non-accented characters in French/Spanish and other Western European languages. The use case is that the users may type letters without accents in error and we still want to be able to retrieve valid matches. The one idea,
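One common way to approach the question above, sketched with only the JDK and not taken from either reply: decompose each string to Unicode NFD and drop the combining marks, then apply the same folding to both the indexed text and the user's query. `AccentFolding` and `fold` are hypothetical names. Wiring this into a Lucene analyzer is left out, and letters that have no decomposition (for example "ß" or "ø") pass through unchanged.

```java
import java.text.Normalizer;

// Hypothetical helper: strips diacritics so "café" and "cafe" compare equal.
final class AccentFolding {
    private AccentFolding() {}

    static String fold(String text) {
        // NFD splits an accented letter into a base letter plus combining marks.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // \p{M} matches the combining marks, which are then removed.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file ASCII-only.
        System.out.println(fold("caf\u00e9 se\u00f1or Z\u00fcrich"));
    }
}
```

Folding at both index time and query time matters: if only the query is folded, documents that were indexed with their accents intact would no longer match.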

Re: Index Rows as Documents? Help me design a solution

2006-07-25 Thread Doron Cohen
Few comments - (from first posting in this thread) The indexing was taking much more than minutes for a 1 MB log file. ... I would expect to be able to index at least a GB of logs within 1 or 2 minutes. 1-2 minutes per GB would be 30-60 GB/Hour, which for a single machine/jvm is a lot -

Re: Sorting with Parallelreader fails

2006-07-25 Thread neils
Hi Steve, thanks a lot for your help. I think the problem will be that no single terms are stored in that field. So I will take a look and make some further tests. Regarding your advice about the java-user list, I think it is no problem to send a short VB code sample to describe the problem. This group is

Re: MultiFieldQueryParser.parse deprecated. What can I use?

2006-07-25 Thread Paulo Silveira
hey doron, I solved the problem with for (String field : fields) { QueryParser qp = new QueryParser(field, SearchEngine.ANALYZER); fieldsQuery.add(qp.parse(string), BooleanClause.Occur.SHOULD); } that seems to have exactly the same effect as your suggestion

Re: MultiFieldQueryParser.parse deprecated. What can I use?

2006-07-25 Thread Marvin Humphrey
On Jul 25, 2006, at 7:35 PM, Paulo Silveira wrote: hey doron, I solved the problem with for (String field : fields) { QueryParser qp = new QueryParser(field, SearchEngine.ANALYZER); fieldsQuery.add(qp.parse(string), BooleanClause.Occur.SHOULD); } I believe that this will

Re: Limit number of search results

2006-07-25 Thread headhunter
Hello Miles, thanks for your answer. I guess the recommended way to implement paging of results is to do your own query-results caching, right? Or does Lucene also do this for me? Johannes
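As an illustration of the paging half of this question only, here is a minimal plain-Java sketch that slices one page out of an already ranked result list. `ResultPager`, `page` and `rankedHits` are hypothetical names: the list stands in for whatever ordered results a search returns and is not a Lucene type. The sketch says nothing about whether Lucene caches results itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: returns the hits for one zero-based page of a ranked list.
final class ResultPager {
    private ResultPager() {}

    static <T> List<T> page(List<T> rankedHits, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        // Use long arithmetic so large page numbers cannot overflow int.
        long from = (long) pageIndex * pageSize;
        if (from >= rankedHits.size()) {
            return Collections.emptyList(); // past the last page
        }
        int to = (int) Math.min(from + pageSize, (long) rankedHits.size());
        // Copy the slice so the caller does not hold a view onto the full list.
        return new ArrayList<>(rankedHits.subList((int) from, to));
    }

    public static void main(String[] args) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            hits.add(i);
        }
        System.out.println(page(hits, 1, 10)); // hits ten..nineteen
    }
}
```

A caller who caches the ranked list per query can serve "hits ten..twenty" by asking for page 1 with a page size of 10, without running the search again.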