Hello,
I have received many personal emails asking me to share the Lucene Investigation
document. It is not possible to reply to each of them, so I am putting
this document in my briefcase. Anyone interested, please go to the following
site and get the document.
Did someone delete the shared doc?
[EMAIL PROTECTED] wrote:
Hello,
I have received many personal emails asking me to share the Lucene Investigation
document. It is not possible to reply to each of them, so I am putting
this document in my briefcase. Anyone interested, please go to the following
site
When I am adding a document to the Lucene index, if the method throws an
IOException and I continue adding other documents, ignoring the
exception, will the index be corrupted? What happens to the fields that are
already written to the index?
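
The "catch and carry on" pattern being asked about could look roughly like the sketch below. It is plain Java with no Lucene classes: `DocumentSink` is a hypothetical stand-in for whatever call adds one document, and `addAll` is an invented helper. It only shows how to keep track of which documents failed; it does not show whether the index stays consistent after such a failure.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class TolerantIndexing {
    /** Hypothetical stand-in for the call that adds one document to an index. */
    interface DocumentSink<D> {
        void add(D doc) throws IOException;
    }

    /**
     * Tries to add every document. A document whose add() throws
     * IOException is recorded and skipped; the loop then continues.
     * Returns the documents that failed, in their original order.
     */
    static <D> List<D> addAll(List<D> docs, DocumentSink<D> sink) {
        List<D> failed = new ArrayList<>();
        for (D doc : docs) {
            try {
                sink.add(doc);
            } catch (IOException e) {
                // Assumption: the caller wants to retry or report these later.
                failed.add(doc);
            }
        }
        return failed;
    }
}
```

Returning the failed documents, rather than silently swallowing the exception, at least leaves a record of what may need to be re-indexed.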
When I went there, I got a message that there were no shared folders in
the briefcase.
It never gave me an opportunity to enter the password.
Thanks.
Bill Taylor
On Oct 12, 2006, at 6:34 AM, sachin wrote:
Hello,
I have received many personal emails asking me to share the Lucene
Investigation
Hello,
This is a design question: for Lucene to be able to process a million
documents, and for the search application to be scalable
and still have a good response time, do we need to use an EJB container
such as WebLogic, or is a servlet container such as Tomcat sufficient to
do the
go to http://briefcase.yahoo.com/pickupartistmistry
click on login
enter user pickupartistmistry
password: chotachetan
the document should be there
-tom
Bill Taylor wrote:
When I went there, I got a message that there were no shared folders
in the briefcase.
It never gave me an opportunity
EJB explicitly precludes you from accessing files, including via third party
libraries such as Lucene.
http://java.sun.com/blueprints/qanda/ejb_tier/restrictions.html
In practice you may be able to get away with it, but I see no particular reason
why using an EJB server should offer any
On Oct 12, 2006, at 10:17 AM, Apache Lucene wrote:
When I am adding a document to the Lucene index, if the method throws an
IOException and I continue adding other documents, ignoring the
exception, will the index be corrupted? What happens to the fields that are
already written to
IN THEORY, EJB containers are better able than Tomcat to spread
incoming requests over a multitude of servers. There was considerable
discussion some time ago about index search speed on a single
processor. I do not remember the details, but there was some
information about how fast a
For example in the following statement
doc.add(new Field(contents, parser.getReader(), Field.TermVector.YES));
The reader is causing the IOException when internally invertDocument()
method is called where tokenstream is generated from the reader. I am not
worried if the document info is
Hi folks,
I am using Lucene 2.0
In our application, I am indexing a stream of documents. Each document is
fairly small (1 KB), but there can be tens of millions of documents. Each
document has a Timestamp field. Users can enter free-form searches and a
date/time range. They are most
You really should be using the same IndexSearcher for successive
searches. Sorting works best when done with a warm searcher. Have
a look at Solr's warming strategy, and consider adopting that in some
way.
Erik
On Oct 12, 2006, at 3:04 PM, [EMAIL PROTECTED] wrote:
Hi folks,
Suppose I want to index 500,000 documents (average document size is
4 KB). Let's assume I create a single index and that the index is
static (I'm not going to add any new documents to it). I would guess
the index would be around 2 GB.
Now, I do searches against this on a somewhat beefy
Scott Smith [EMAIL PROTECTED] wrote on 12/10/2006 14:14:57:
Suppose I want to index 500,000 documents (average document size is
4 KB). Let's assume I create a single index and that the index is
static (I'm not going to add any new documents to it). I would guess
the index would be around
I'm developing an application used by scientists -- people who have a pretty
good idea of what logic is -- and they were shocked to find out that no two
of these queries return the same results:
1- banana AND apple OR orange
2- banana AND (apple OR orange)
3- (banana AND apple) OR orange
I'd
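
For reference, under the conventional precedence rules (AND binds tighter than OR), query 1 would be read the same way as query 3, and only query 2 would differ. Below is a minimal sketch in plain Java, with no Lucene involved, that checks this over all eight truth assignments. The class and method names are made up for illustration, and it says nothing about how QueryParser itself interprets these strings.

```java
class BooleanPrecedenceDemo {
    // Query 1, written without parentheses: Java's && binds tighter than ||.
    static boolean q1(boolean banana, boolean apple, boolean orange) {
        return banana && apple || orange;
    }

    // Query 2: banana AND (apple OR orange)
    static boolean q2(boolean banana, boolean apple, boolean orange) {
        return banana && (apple || orange);
    }

    // Query 3: (banana AND apple) OR orange
    static boolean q3(boolean banana, boolean apple, boolean orange) {
        return (banana && apple) || orange;
    }

    /**
     * Returns {d2, d3}: the number of truth assignments (out of 8) on which
     * query 1 disagrees with query 2 and with query 3, respectively.
     */
    static int[] countDisagreements() {
        int vs2 = 0;
        int vs3 = 0;
        for (int bits = 0; bits < 8; bits++) {
            boolean banana = (bits & 4) != 0;
            boolean apple = (bits & 2) != 0;
            boolean orange = (bits & 1) != 0;
            boolean r1 = q1(banana, apple, orange);
            if (r1 != q2(banana, apple, orange)) vs2++;
            if (r1 != q3(banana, apple, orange)) vs3++;
        }
        return new int[] {vs2, vs3};
    }

    public static void main(String[] args) {
        int[] d = countDisagreements();
        System.out.println("query 1 vs query 2: " + d[0] + " of 8 assignments differ");
        System.out.println("query 1 vs query 3: " + d[1] + " of 8 assignments differ");
    }
}
```

Queries 1 and 2 differ on exactly the two assignments where banana is false and orange is true; queries 1 and 3 never differ.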
There is also the Surround Query Parser in contrib by the way...I would bet
that Paul will tell you that it does not have these issues. I can't wait to
see the replies on this one...I didn't realize that the QueryParser had
these problems and am a bit skeptical...unfortunately I am away from home
Renaud Waldura wrote:
While we are also developing a query-building UI, users must be able to
enter text queries as well. What do other folks do? I mean, this is
pretty bad. I can hardly go back to my scientists and tell them Lucene
is unable to handle 2 boolean operators, that they should
On Oct 12, 2006, at 7:11 PM, Renaud Waldura wrote:
I'm developing an application used by scientists -- people who have
a pretty good idea of what logic is -- and they were shocked to
find out that no two of these queries return the same results:
1- banana AND apple OR orange
2- banana AND
Thanks, Erik for the pointer to Solr.
Since the document index is added to frequently, creating new IndexSearchers is
required anyway. We plan to 'age' out existing IndexSearchers and create
new ones every few minutes. Solr's cache regeneration would be useful in this
scenario.
Does the
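
The aging scheme described above could be sketched roughly as follows. This is only an illustration in plain Java: `RefreshingHolder` and its `factory` and `warmer` arguments are hypothetical names, the held object stands in for an IndexSearcher, and closing the retired searcher is deliberately left out.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

class RefreshingHolder<T> {
    private final Supplier<T> factory;   // builds a new instance (e.g. opens a searcher)
    private final Consumer<T> warmer;    // runs warm-up work before the instance is handed out
    private final long maxAgeMillis;     // how long one instance is reused
    private T current;
    private long createdAtMillis;

    RefreshingHolder(Supplier<T> factory, Consumer<T> warmer, long maxAgeMillis) {
        this.factory = factory;
        this.warmer = warmer;
        this.maxAgeMillis = maxAgeMillis;
    }

    /**
     * Returns the shared instance, replacing it with a freshly built and
     * warmed one once it is at least maxAgeMillis old. The clock value is
     * passed in so the behaviour is easy to test.
     */
    synchronized T get(long nowMillis) {
        if (current == null || nowMillis - createdAtMillis >= maxAgeMillis) {
            T fresh = factory.get();
            warmer.accept(fresh);        // simplification: callers wait while this runs
            current = fresh;
            createdAtMillis = nowMillis;
        }
        return current;
    }
}
```

A caller would pass `System.currentTimeMillis()` to `get`. Warming inside the lock keeps the sketch short; a real setup would more likely warm the replacement in the background and swap it in afterwards.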
I think a standalone J2EE application would work well and give looser
coupling than EJB. You can separate memory, disk, and CPU resources
from your main application. You can send results back in XML, JSON, or
other formats.
Chris Lu
-
Instant Full-Text Search On Any
Lots of memory will help a lot. I have a DBSight customer who is
using an Intel Core Duo with everything configured to run in memory. The index
size is about 700 MB. When I checked his system's average response time,
it was 12 ms! I guess you can estimate what you will get from your beefy
machine.
So it
Gecko? ;)
My advice: stay away from EJBs as much as you can. They are too complicated
and too heavy for most systems. Servlet containers like Jetty, Tomcat, or
Resin are often perfectly suitable for the job and a lot simpler.
Otis
- Original Message
From: Chenini, Mohamed [EMAIL
Hi All,
I am facing a peculiar problem.
I am trying to index a file, and the indexing code executes without any error.
But when I try to close the indexer, I get the following error. The error
comes very rarely, but when it does, no document indexing code works and I
finally have to delete