Hi Diego,
There is no such thing in the Lucene ecosystem yet, although some ideas
http://search-lucene.com/m/WwzTb2nt1Tk1
http://search-lucene.com/m/WwzTb2d9o2m
float around from time to time.
I would like to integrate https://code.google.com/p/jforests/ and create a
prototype myself in the future.
New a
Thanks for the reply.
When you mention system memory, are you referring to RAM (or the heap, as
this is running as a Java process)?
The index size is around 13 GB, and the Java process is not given that much
memory (in terms of -Xmx).
Could this be the cause? My understanding while reading some articles on
the in
Does your index fit fully in system memory - the OS file cache? If not,
there could be a lot of thrashing (I/O) as Lucene accesses the index.
-- Jack Krupansky
-Original Message-
From: Liviu Matei
Sent: Monday, May 19, 2014 4:21 PM
To: java-user@lucene.apache.org
Subject: Performance
Hi,
In order to achieve a somewhat "smarter" search that also takes the context
into consideration, I decided to use PhraseQuery. I now create ~100 phrase
queries from the input text, combine them with a BooleanQuery into one
query, and issue a search against the index.
Now if the index size is big
Hi
I can get the payloads for query terms using getPayloadsForQuery from
PayloadSpanUtil.
However, this does not support ConstantScore queries. So how do I get the
payloads for queries that get rewritten to a ConstantScore query, for
example PrefixQuery and WildcardQuery?
Thanks
Puneet
On Mon, 2014-05-19 at 11:54 +0200, De Simone, Alessandro wrote:
[24GB index, 8GB disk cache, only indexed fields]
> The "IO calls" I was referring to are the number of times the
> "BufferedIndexInput.refill()" function is called. So it means that we
> have 16 times more bytes read when there are 16
Out of curiosity, do any of the current crowd of Lucene committers/users have
any insight as to how or why that seemingly obvious design requirement was
ignored or consciously avoided in the original design for Lucene? I've
always assumed that Lucene (and Solr) were originally designed for a batc
Hi,
We are using Lucene 4.7 in our server application to search documents
stored on a NAS share. We have 10 million+ documents and have decided not to
index all of them. The strategy that we applied is as follows:
1. Client makes a request with a search phrase. Lucene applic
On Mon, May 19, 2014 at 6:14 AM, Clemens Wyss DEV wrote:
> Mike,
> first of all, thanks for all your input; I really appreciate it (as much as I
> like reading your blog).
You're welcome!
>> Hmm, but you swap these files over while an IndexReader is still open on the
>> index?
> no IndexReader is
Mike,
first of all, thanks for all your input; I really appreciate it (as much as I like
reading your blog).
> Hmm, but you swap these files over while an IndexReader is still open on the
> index?
No IndexReader is open while swapping, at least not by design. We have at most
one (current) reader per
I know, it's a commonly requested feature, but unfortunately it's very
complex to implement. See e.g. the discussions on
https://issues.apache.org/jira/browse/LUCENE-4258
Mike McCandless
http://blog.mikemccandless.com
On Mon, May 19, 2014 at 5:15 AM, Jamie wrote:
> Michael
>
> Thanks for the
Thank you for your input
> How much RAM does your search machine have?
We have 16 GB of RAM, and there is at least 8 GB of free memory for the OS file
cache. The cache is working pretty well.
> That sounds right. Although each segment is 1/16 of the full index size, the
> number of seeks per segmen
Also, one more thing I forgot to add: using lsof, I noticed deleted index
files that are still in use by the application. Is this OK? Can't this
cause issues, such as the IndexReader trying to access an index file that
was deleted? I suspect the deletion happens because of index merges during
indexi
Thank you all very much for the answers.
Uwe, this is the strange thing: I currently never close the index reader,
and I open a new one every 8 hours, and I am noticing that crash in what is
indeed a highly concurrent environment.
The indexes reside on an NFS file system. And the location i
Michael
Thanks for the clarification.
This is a hefty limitation of Lucene.
One would expect to be able to update a specific field in the index
without having to reindex the entire document.
Regards
Jamie
On 2014/05/16, 11:34 PM, Michael McCandless wrote:
You can retrieve
On Mon, May 19, 2014 at 4:59 AM, Clemens Wyss DEV wrote:
>> Are you using doc-values updates?
> Not to my knowledge, i.e. not explicitly
Hmm ok.
>> Are you ever removing files directly from the index directory yourself
>> between reopens?
> Yes. Reindexing an index completely(*) is done in a se
> Are you using doc-values updates?
Not to my knowledge, i.e. not explicitly
> Are you ever removing files directly from the index directory yourself
> between reopens?
Yes. Reindexing an index completely(*) is done in a separate temporary
index/folder. After that we (guarded by a mutex) swap th
Hmm, I was wrong before, the code is more complex than I thought.
Are you using doc-values updates?
Can you describe a bit more how your app works? Are you ever removing
files directly from the index directory yourself between reopens?
Mike McCandless
http://blog.mikemccandless.com
On Mon, M
(Apologies if you receive multiple copies of this CfP.)
The 12th International Workshop on Java Technologies for
Real-time and Embedded Systems - JTRES 2014
October 13th - 14th
Niagara Falls, NY, USA