Re: Phrase Query

2008-09-16 Thread Antony Bowesman
Is it possible to write a document with different analyzers in different fields? PerFieldAnalyzerWrapper

Some SSD results to share

2008-09-16 Thread Eric Bowman
Hi all, We stuck a 60 GB OCZ "Core Series" SSD in a Dell T5400 (dual quadcore, 16GB RAM, SATA II 7200 RPM disk) and did some comparisons between running with our index on disk, vs. on SSD. I can't really talk about what the app does, but I can share the difference in performance; see enclose

Re: Some SSD results to share

2008-09-16 Thread Eric Bowman
Eric Bowman wrote: Hi all, We stuck a 60 GB OCZ "Core Series" SSD in a Dell T5400 (dual quadcore, 16GB RAM, SATA II 7200 RPM disk) and did some comparisons between running with our index on disk, vs. on SSD. I can't really talk about what the app does, but I can share the difference in perfo

Lucene & Zend Lucene Search : indexation speed, document parsing

2008-09-16 Thread Romain de Wolff
Hi all, Well, this is my first post on this list. Nice to meet you all. I'm currently putting in place a system which indexes data with Apache Lucene (indexing doc, xls, pdf, source code, zip files, ...) and allows searching with the Zend Lucene Search library (PHP). I'm planning to create a fro

Re: Some SSD results to share

2008-09-16 Thread Michael McCandless
I'd be curious how much of a performance difference you see between the different SSD drives. This test was with OCZ "Core Series" but previous tests were with different SSDs right? EG Intel just released a new 80 GB SSD (X25-M) which is getting rave reviews, even compared to the OCZ "c

Re: Some SSD results to share

2008-09-16 Thread Eric Bowman
Michael McCandless wrote: I'd be curious how much of a performance difference you see between the different SSD drives. This test was with OCZ "Core Series" but previous tests were with different SSDs right? EG Intel just released a new 80 GB SSD (X25-M) which is getting rave reviews, eve

Re: Lucene & Zend Lucene Search : indexation speed, document parsing

2008-09-16 Thread Julien Nioche
Hello Romain, > I'm asking myself a few questions, mainly about speed (indexing time) and document parsing (a way to index the most commonly used office documents). For document parsing, I'm planning to use different open source libraries. The company I'm doing this for will be indexing a few

Re: Some SSD results to share

2008-09-16 Thread Karl Wettin
Related, I've been considering filesystem based filters on SSD. That ought to be rather fast, consume no memory and be as simple as a RandomAccessFile. I didn't spend too much time on it, and gave up when I couldn't figure out when it made sense to close the file. Perhaps it would be nice with a

Re[2]: Frequently updated fields

2008-09-16 Thread Wojciech Strzałka
I saw your comments on JIRA. You mentioned rework, and I'm wondering if the currently available patch is production ready (functionally complete)? Will the code after rework work with an index built with the current version? I'm quite new to Solr/Lucene but I hope I could write custom

Issues with Special Characters

2008-09-16 Thread miztaken
Hi there, I am using WhitespaceAnalyzer to index documents. I have used this because I need to split tokens based on spaces only. Also, Tokenized=true. While indexing, what does it do with special characters like + - && || ! ( ) { } [ ] ^ " ~ * ? : \, will these characters be indexed or will be chopp

Using Hits as document space for new search

2008-09-16 Thread nukie
Hi! I'm writing an application that uses Lucene to search through ~200 documents. I build the search criteria using BooleanQuery and ChainedFilter. The average search takes 150 ms, which is acceptable in my case. But I also need hit counts for ~20 more variants of the search criteria

Re: Issues with Special Characters

2008-09-16 Thread Erick Erickson
You can easily answer the questions about what WhitespaceTokenizer produces by getting a copy of Luke and looking at your index. Or writing a really simple test program that prints out tokens. At the bottom of this page is a list of special characters for escaping: http://lucene.apache.org/java/do
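A "really simple test program that prints out tokens" could look roughly like the sketch below. It is a plain-Java stand-in, not Lucene code: the class name `SpecialCharSketch` and both methods are hypothetical, the whitespace split only approximates what a whitespace-only tokenizer does, and the escaping simply puts a backslash in front of each character from the special-character list quoted in this thread (treating `&` and `|` individually is a simplifying assumption). The sample input is made up.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in only; this is not Lucene's tokenizer or query parser.
class SpecialCharSketch {
    // Characters from the special-character list discussed in this thread.
    // '&' and '|' are escaped one at a time here, which is an assumption.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    // Split on runs of whitespace only; every other character stays in the token.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Prefix each special character with a backslash.
    static String escape(String term) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Print each token next to its escaped form.
        for (String token : whitespaceTokens("fw: fyi.dat (draft)")) {
            System.out.println(token + " -> " + escape(token));
        }
    }
}
```

Printing the tokens side by side with their escaped forms makes it easier to see which characters survive a whitespace-only split and which ones the query syntax would otherwise interpret.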

Re: Using Hits as document space for new search

2008-09-16 Thread Erick Erickson
I'm confused about what your "variants of search criterias" are. Could you provide a few examples? Also, if you can provide a higher-level statement of your problem, folks can often come up with alternate approaches. Best Erick On Tue, Sep 16, 2008 at 9:20 AM, nukie <[EMAIL PROTECTED]> wrote: >

Re: Re[2]: Frequently updated fields

2008-09-16 Thread Jason Rutherglen
Hi Wojciech, The code isn't ready; it is a major project, and I am also trying to complete the realtime indexing patches and look for a job. I believe that the tag indexing stuff is of interest to many people, so if there is someone who can pay to get it completed, feel free to contact me as I am av

Re: Issues with Special Characters

2008-09-16 Thread miztaken
Hi there, I will check that out, but what do you suggest for searching? Searching without escaping works for the query string "fw: fyi.dat", but I have to escape the : character for the query string "fw:", so do I have two cases to handle? Please help. Erick Erickson wrote: > > You can easily answer the questions about what W

Re: Issues with Special Characters

2008-09-16 Thread Erick Erickson
I question whether your example actually searches what you think. What I suggest is that you get a copy of Luke and look at what your queries actually produce. That'll give you a much better idea of what happens under the covers. Also, query.toString() is your friend. Try printing out your queries

Using separate index for each user

2008-09-16 Thread Tobias Larsson Hult
Hi, We're thinking of using Lucene to integrate search in a backup service application. The background is that we have a bunch of users using a backup service, and we want them to be able to search their own, and only their own, backups. The total amount of data that's being backed up is

Re: Using separate index for each user

2008-09-16 Thread Erick Erickson
The main arguments against using many separate indexes are 1> search warmup time. That is, each time you open an index the first few queries take much longer than subsequent searches. 2> Managing a bazillion indexes is non-trivial. That said, in your particular case these may not apply. I gu

Re: Using separate index for each user

2008-09-16 Thread Otis Gospodnetic
Tobias, That's the approach I took with Simpy.com and it's been working well for several years now. You'll have to keep track of searchers and close them when appropriate, of course. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Tobias
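One way to "keep track of searchers and close them when appropriate" is a small least-recently-used cache keyed by user. The sketch below is only an illustration under stated assumptions: `SearcherCacheSketch` is a hypothetical name, the searcher is modelled as any `java.io.Closeable` rather than a real Lucene class, and the opener function stands in for whatever opens a user's index. It does not track in-flight searches, so a real version would need something like reference counting before closing.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: keeps at most maxOpen per-user handles open and
// closes the least recently used one when that limit is exceeded.
class SearcherCacheSketch<S extends Closeable> {
    private final Function<String, S> opener;
    private final LinkedHashMap<String, S> open;

    SearcherCacheSketch(final int maxOpen, Function<String, S> opener) {
        this.opener = opener;
        // accessOrder=true keeps entries ordered least recently used first.
        this.open = new LinkedHashMap<String, S>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, S> eldest) {
                if (size() > maxOpen) {
                    closeQuietly(eldest.getValue());
                    return true;
                }
                return false;
            }
        };
    }

    // Return the handle for this user, opening it on first use.
    synchronized S get(String userId) {
        S handle = open.get(userId);
        if (handle == null) {
            handle = opener.apply(userId);
            open.put(userId, handle);
        }
        return handle;
    }

    synchronized int openCount() {
        return open.size();
    }

    // Close everything, for example on shutdown.
    synchronized void closeAll() {
        for (S handle : open.values()) {
            closeQuietly(handle);
        }
        open.clear();
    }

    private static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (IOException e) {
            // Ignored in this sketch; a real application would log it.
        }
    }
}
```

Capping the number of open handles bounds memory and file descriptors, at the cost of paying the warm-up time again for users whose handle was evicted.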

Re: Phrase Query

2008-09-16 Thread Otis Gospodnetic
Are the terms stopwords? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Cam Bazz <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Sent: Tuesday, September 16, 2008 1:33:48 AM > Subject: Phrase Query > > Hello, > > Lets say I have

Re: Issues with Special Characters

2008-09-16 Thread miztaken
Hi, I tested a sample application with Luke as well. I am using the .NET version of Lucene (2.0.0.4) and I think I am getting the error due to that. When I tested my queries with Luke they worked fine and gave me the desired output, but when I used the Lucene API available for .NET it is producing er

Re: Issues with Special Characters

2008-09-16 Thread Erick Erickson
Um, ask over on the .NET user group? Erick On Tue, Sep 16, 2008 at 12:20 PM, miztaken <[EMAIL PROTECTED]> wrote: > > Hi, > I tested sample application with Luke as well. > I am using .NEt Version of Lucene (2.0.0.4) and i think i am getting error > due to that. > > When i tested my queries with

Re: IndexSearcher.search

2008-09-16 Thread Chris Hostetter
: Related topic: what if we need all the hits and not just the first 100? Solr has a FAQ related to this that I think also applies here. >How can I get ALL the matching documents back? ... How can I return an >unlimited number of rows? > >This is impractical in most cases. People typically onl

RE: Re: Replacing FAST functionality atsesam.no-ShingleFilter+exactmatching

2008-09-16 Thread Chris Hostetter
: The query parser expects you to assign positionIncrement=0 for synonyms : in this manner. correct. : The one kludge i see is that the QueryParser expects the total positions : found to be greater than or equal to one. It might not be intentionally : dealing with the total position count being

IndexReader.isCurrent()

2008-09-16 Thread rahul_k123
What is the behaviour of IndexReader.isCurrent() if I modify the index manually? Will it return false? One more question: the index is on Linux. If my IndexReader is open and some of the files in the index are deleted, what happens? Will it throw an exception such as file not found?

Re: IndexSearcher.search

2008-09-16 Thread Daniel Noll
Chris Hostetter wrote: : Or do we make a replacement for TopDocCollector which doesn't have this : drawback, and uses an alternative for PriorityQueue which allows its array to : grow? I don't see that as being much better -- you still wouldn't want to pass MAX_INT because what if there really a

jre installing problem in ubuntu

2008-09-16 Thread Aditi Goyal
Hi All, I'm running Ubuntu Linux 8.04, which has the sun jre installed in the system area (/usr/lib/jvm/java-6-sun-1.6.0.06/jre/lib/). I've manually installed an updated JRE (jre1.6.0_07) into /usr/local area (/usr/local/lib/jre1.6.0_07/lib/), and my PATH environment variable is updated to use th