[
https://issues.apache.org/jira/browse/LUCENE-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Fertig updated LUCENE-1381:
-
Description:
With several older lucene projects already running and "stable", I have
recently w
Hanging while indexing/digesting on multiple threads
Key: LUCENE-1381
URL: https://issues.apache.org/jira/browse/LUCENE-1381
Project: Lucene - Java
Issue Type: Bug
Components: An
[
https://issues.apache.org/jira/browse/LUCENE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630091#action_12630091
]
robert engels commented on LUCENE-1195:
---
Also, SafeThreadLocal can be trivially chan
You can't hold the ThreadLocal value in a WeakReference, because
there is no hard reference between enumeration calls (so it would be
cleared out from under you while enumerating).
All of this occurs because you have some objects (readers/segments
etc.) that are shared across all threads, b
When I look at the reference tree, that is the feeling I get. If you
held a WeakReference, it would get released.
|- base of org.apache.lucene.index.CompoundFileReader$CSIndexInput
|- input of org.apache.lucene.index.SegmentTermEnum
|- value of java.lang.ThreadLocal$
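To make the WeakReference point above concrete, here is a small, purely
hypothetical Java sketch (none of these names are Lucene classes): if the
only reference to the per-thread enumerator were a WeakReference, with no
hard reference held anywhere between enumeration calls, the collector could
clear it at any moment and the cached position would be silently lost.

import java.lang.ref.WeakReference;

// Hypothetical sketch (not Lucene code): why a bare WeakReference cannot
// hold the per-thread term enumerator. Between enumeration calls nothing
// else holds a strong reference, so the collector may clear it at any time.
public class WeakCacheProblem {
    // stands in for the cached per-thread SegmentTermEnum clone
    static final class TermEnumStub {
        int position;
        void next() { position++; }
    }

    private WeakReference<TermEnumStub> cached =
        new WeakReference<TermEnumStub>(null);

    TermEnumStub getEnum() {
        TermEnumStub e = cached.get();
        if (e == null) {
            // the previous clone may have been collected "out from under us",
            // losing its position and forcing an expensive re-seek
            e = new TermEnumStub();
            cached = new WeakReference<TermEnumStub>(e);
        }
        return e;
    }

    public static void main(String[] args) {
        WeakCacheProblem cache = new WeakCacheProblem();
        cache.getEnum().next();           // advance the cached enumerator
        System.gc();                      // a GC here may clear the weak ref...
        System.out.println("position = " + cache.getEnum().position);
        // ...so this can print 0 instead of 1
    }
}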
[
https://issues.apache.org/jira/browse/LUCENE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
robert engels updated LUCENE-1195:
--
Attachment: SafeThreadLocal.java
A "safe" ThreadLocal that can be used for more deterministic
[
https://issues.apache.org/jira/browse/LUCENE-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved LUCENE-1366.
Resolution: Fixed
Committed revision 694004.
> Rename Field.Index.UN_TOKENIZED/TO
Looking forward to 2.4!
-John
On Tue, Sep 9, 2008 at 2:38 AM, Michael McCandless <
[EMAIL PROTECTED]> wrote:
>
> OK we are gradually whittling down the list. It's down to 9 issues now.
>
> I have 2 issues, Grant has 3, Otis has 2 and Mark and Karl have 1 each.
>
> Can each of you try to finish
Hi guys:
We have built this on top of the Lucene 1.4 API refactoring for doc id
sets and DocIdSetIterator.
We've implemented the p4Delta compression algorithm presented at
www2008: http://www2008.org/papers/fp618.html
We've been using this in production here at LinkedIn and would lov
Sorry, I meant Lucene 2.4
-John
On Wed, Sep 10, 2008 at 2:08 PM, John Wang <[EMAIL PROTECTED]> wrote:
> Hi guys:
>
> We have built this on top of the Lucene 1.4 API refactoring for doc id
> sets and DocIdSetIterator.
>
> We've implemented the p4Delta compression algorithm presented at
> w
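The P4Delta code itself is not attached to this thread; as a rough,
hypothetical sketch of the underlying idea (doc ids stored as gaps between
sorted values and decoded by a forward-only iterator; real PForDelta
additionally bit-packs most gaps into a fixed width and patches the
exceptions), one might write something like the following. The class and
method names are illustrative, not the LinkedIn implementation or Lucene's
DocIdSetIterator API.

// Hypothetical sketch: a doc-id set stored as gaps between sorted ids,
// decoded by a forward-only iterator. Real PForDelta additionally
// bit-packs most gaps into a fixed width and patches the exceptions.
public class DeltaDocIdSet {
    private final int[] deltas;   // deltas[0] is the first doc id itself

    public DeltaDocIdSet(int[] sortedDocIds) {
        deltas = new int[sortedDocIds.length];
        int prev = 0;
        for (int i = 0; i < sortedDocIds.length; i++) {
            deltas[i] = sortedDocIds[i] - prev;   // store the gap
            prev = sortedDocIds[i];
        }
    }

    public DocIdIterator iterator() { return new DocIdIterator(); }

    public class DocIdIterator {
        private int pos = -1;
        private int doc = 0;

        // Returns the next doc id, or -1 when exhausted.
        public int nextDoc() {
            if (++pos >= deltas.length) return -1;
            doc += deltas[pos];
            return doc;
        }

        // Advances past the current doc to the first id >= target, or -1.
        public int advance(int target) {
            int d;
            while ((d = nextDoc()) != -1 && d < target) { /* keep decoding */ }
            return d;
        }
    }

    public static void main(String[] args) {
        DeltaDocIdSet set = new DeltaDocIdSet(new int[] {3, 7, 8, 42, 100});
        DocIdIterator it = set.iterator();
        System.out.println(it.advance(9));  // prints 42
        System.out.println(it.nextDoc());   // prints 100
        System.out.println(it.nextDoc());   // prints -1 (exhausted)
    }
}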
[
https://issues.apache.org/jira/browse/LUCENE-1344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629954#action_12629954
]
Nicolas Lalevée commented on LUCENE-1344:
-
About the missing header in the maven j
Well, the code is correct, because it can work by avoiding this trap. But it
fails to act as a good API.
I learned the inside details from you. I am not the only one that's trapped,
and more users will likely be trapped again unless the javadoc describing the
close() function is changed. Actually,
Hi Mike,
Would there be a new sorted list or something to replace the
hashtable? It seems like an issue that is not solved.
Jason
On Tue, Sep 9, 2008 at 5:29 AM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
>
> This would just tap into the live hashtable that DocumentsWriter* maintain
> for the p
Always your prerogative.
On Sep 10, 2008, at 1:15 PM, Chris Lu wrote:
Actually I am done with it by simply downgrading and not using
r659602 and later.
The old version is cleaner and more consistent with the API, and
close() does mean close, not something complicated and unknown to most
users
Actually I am done with it by simply downgrading and not using r659602 and
later. The old version is cleaner and more consistent with the API, and close()
does mean close, not something complicated and unknown to most users, which
almost feels like a trap. And later on, if no changes happened for this
Why not just use reopen() and be done with it???
On Sep 10, 2008, at 12:48 PM, Chris Lu wrote:
Yeah, the timing is different. But it's an unknown, undetermined,
and uncontrollable time...
We cannot ask the user:
while(memory is low){
sleep(1000);
}
do_the_real_thing_an_hour_later
--
Ch
Yeah, the timing is different. But it's an unknown, undetermined, and
uncontrollable time...
We cannot ask the user:
while(memory is low){
sleep(1000);
}
do_the_real_thing_an_hour_later
--
Chris Lu
-
Instant Scalable Full-Text Search On Any Database/Application
site:
SafeThreadLocal is very interesting. It'll be good not only for Lucene, but
also for other projects.
Could you please post it?
--
Chris Lu
-
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Datab
Not likely. Actually I made some changes to Lucene source code and I can see
the changes in the memory snapshot. So it is the latest Lucene version.
--
Chris Lu
-
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.
Close() does work - it is just that the memory may not be freed until
much later...
When working with VERY LARGE objects, this can be a problem.
On Sep 10, 2008, at 12:36 PM, Chris Lu wrote:
Thanks for the analysis, really appreciate it, and I agree with it.
But...
This is really a normal
Not holding searcher/reader. I did check that via memory snapshot.
--
Chris Lu
-
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.ph
Thanks for the analysis, really appreciate it, and I agree with it. But...
This is really a normal J2EE use case. The threads seldom die.
Doesn't that mean closing the RAMDirectory doesn't work for J2EE
applications?
And only reopen() works?
And close() doesn't release the resources? duh...
I can
The other thing Lucene could do is create a SafeThreadLocal - it is
rather trivial - and have that integrate at a higher level, allowing
for manual clean-up across all threads.
It MIGHT be a bit slower than the JDK version (since that uses
heuristics to clear stale entries), and so doesn't al
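The SafeThreadLocal.java attachment is not reproduced in this digest; the
following is only a sketch of the idea as described in this thread (a
ThreadLocal whose hard references live in a map owned by the object, so they
can be purged deterministically when the owner is closed, while the JDK
ThreadLocal holds just a WeakReference). It does not claim to match the
actual attachment.

import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the "safe" ThreadLocal idea: the JDK ThreadLocal
// only holds a WeakReference to the value, while the hard references live
// in a map owned by this object. close() drops the map, so every thread's
// value becomes collectable immediately, without waiting for each thread
// to die or for lazy expunging of stale entries.
public class SafeThreadLocalSketch<T> {
    private final ThreadLocal<WeakReference<T>> local =
        new ThreadLocal<WeakReference<T>>();
    private Map<Thread, T> hardRefs = new HashMap<Thread, T>();

    public synchronized void set(T value) {
        local.set(new WeakReference<T>(value));
        hardRefs.put(Thread.currentThread(), value);   // keeps the value alive
    }

    public T get() {
        WeakReference<T> ref = local.get();
        return ref == null ? null : ref.get();         // null once purged
    }

    // Release every thread's value, e.g. when the owning reader is closed.
    public synchronized void close() {
        hardRefs = new HashMap<Thread, T>();
        // The per-thread WeakReferences can now be cleared by the collector;
        // callers should treat a null get() as "re-create the value".
    }
}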
My review of trunk shows a SegmentReader contains a TermInfosReader,
which contains a ThreadLocal of ThreadResources, which contains a
SegmentTermEnum.
So there should be a ThreadResources in the memory profiler for each
SegmentTermEnum instance - unless you have something goofy going on.
[
https://issues.apache.org/jira/browse/LUCENE-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629846#action_12629846
]
Michael Semb Wever commented on LUCENE-1380:
I suspected such re the option na
You do not need to create a new RAMDirectory - just write to the
existing one, and then reopen() the IndexReader using it.
This will prevent lots of big objects from being created. This may be the
source of your problem.
Even if the Segment is closed, the ThreadLocal will no longer be
referenc
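As a rough sketch of the pattern being suggested here - one long-lived
RAMDirectory that is written to in place, with the reader refreshed via
reopen() - using 2.3/2.4-era API calls as best I recall them (treat the
exact constructors and field options as assumptions):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.RAMDirectory;

// Sketch: keep ONE RAMDirectory for the life of the application, update it
// in place, and refresh the view with IndexReader.reopen() instead of
// building a new RAMDirectory (and a new pile of large objects) per update.
public class ReopenSketch {
    public static void main(String[] args) throws Exception {
        RAMDirectory dir = new RAMDirectory();

        // initial build
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
        Document doc = new Document();
        doc.add(new Field("body", "hello world",
                          Field.Store.YES, Field.Index.TOKENIZED));
        writer.addDocument(doc);
        writer.close();

        IndexReader reader = IndexReader.open(dir);

        // later: an update, written to the SAME directory
        writer = new IndexWriter(dir, new StandardAnalyzer(), false);
        Document doc2 = new Document();
        doc2.add(new Field("body", "hello again",
                           Field.Store.YES, Field.Index.TOKENIZED));
        writer.addDocument(doc2);
        writer.close();

        // refresh cheaply; close the old reader only if a new one came back
        IndexReader newReader = reader.reopen();
        if (newReader != reader) {
            reader.close();
            reader = newReader;
        }

        System.out.println("numDocs = " + reader.numDocs());   // 2
        reader.close();
        dir.close();
    }
}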
Good question.
As far as I can tell, nowhere in Lucene do we put a SegmentTermEnum
directly into ThreadLocal, after rev 659602.
Is it possible that output came from a run with Lucene before rev
659602?
Mike
Chris Lu wrote:
Is it possible that some other place is using SegmentTermE
Chris,
After you close your IndexSearcher/Reader, is it possible you're still
holding a reference to it?
Mike
Chris Lu wrote:
Frankly I don't know why TermInfosReader.ThreadResources is not
showing up in the memory snapshot.
Yes. It's been there for a long time. But let's see what's ch
Is it possible that some other place is using SegmentTermEnum as a
ThreadLocal? This may explain why TermInfosReader.ThreadResources is not in
the memory snapshot.
--
Chris Lu
-
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
de
I really want to find out what I am doing wrong, if that's the case.
Yes. I have made certain that I closed all Readers/Searchers, and verified
that through the memory profiler.
Yes. I am creating a new RAMDirectory. But that's the problem. I need to
update the content. Sure, if no content update an
Actually, a single RAMDirectory would be sufficient (since it
supports writes). There should never be a reason to create a new
RAMDirectory (unless you have some specialized real-time search
occurring).
If you are creating new RAMDirectories, the statements below hold.
On Sep 10, 2008, at 1
It is basic Java. Threads are not guaranteed to run on any sort of
schedule. If you create lots of large objects in one thread,
releasing them in another, there is a good chance you will get an OOM
(since the releasing thread may not run before the OOM occurs)...
This is not Lucene specifi
Frankly I don't know why TermInfosReader.ThreadResources is not showing up
in the memory snapshot.
Yes. It's been there for a long time. But let's see what's changed: an LRU
cache of termInfoCache was added.
I think the SegmentTermEnum previously would be released, since it's a
relatively simple object.
But
I do not believe I am making any mistake. Actually I just got an email from
another user complaining about the same thing, and I have the same
usage pattern.
After the reader is opened, the RAMDirectory is shared by several objects.
There is one instance of RAMDirectory in memory, and it
Does this make any difference? If intentionally closing the searcher and
reader failed to release the memory, I cannot rely on some magic of the JVM to
release it.
--
Chris Lu
-
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: ht
[
https://issues.apache.org/jira/browse/LUCENE-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629827#action_12629827
]
Steven Rowe commented on LUCENE-1380:
-
As I said in the thread on java-user that spawn
Sorry, but I am fairly certain you are mistaken.
If you only have a single IndexReader, the RAMDirectory will be
shared in all cases.
The only memory growth is any buffer space allocated by an IndexInput
(used in many places and cached).
Normally the IndexInput created by a RAMDirectory d
[
https://issues.apache.org/jira/browse/LUCENE-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll updated LUCENE-1320:
Attachment: LUCENE-1320.patch
Java 1.4 compatible. Give this a try
> ShingleMatrixFilter
[
https://issues.apache.org/jira/browse/LUCENE-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629798#action_12629798
]
Grant Ingersoll commented on LUCENE-1320:
-
I'm almost done w/ a conversion. Regex
[
https://issues.apache.org/jira/browse/LUCENE-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629763#action_12629763
]
Karl Wettin commented on LUCENE-1320:
-
It really is quite a bit of work to downgrade t
Why do you need to keep a strong reference?
Why not a WeakReference?
--Noble
On Wed, Sep 10, 2008 at 12:27 AM, Chris Lu <[EMAIL PROTECTED]> wrote:
> The problem should be similar to what's talked about on this discussion.
> http://lucene.markmail.org/message/keosgz2c2yjc7qre?q=ThreadLocal
>
> Th
[
https://issues.apache.org/jira/browse/LUCENE-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Semb Wever updated LUCENE-1380:
---
Attachment: LUCENE-1380.patch
Addition to ShingleFilter for property coterminalPosit
Patch for ShingleFilter.coterminalPositionIncrement
---
Key: LUCENE-1380
URL: https://issues.apache.org/jira/browse/LUCENE-1380
Project: Lucene - Java
Issue Type: Improvement
Componen
I still don't quite understand what's causing your memory growth.
SegmentTermEnum instances have been held in a ThreadLocal cache in
TermInfosReader for a very long time (at least since Lucene 1.4).
If indeed it's the RAMDir's contents being kept "alive" due to this,
then you should have a