What version of the log4j jar are you using?
-Original Message-
From: Don Vaillancourt [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 29, 2004 8:06 AM
To: Lucene Users List
Subject: PDFBox Issue
Hi all,
I know that this is a Lucene list but wanted to know if any of you have
What Analyzer is being used? If it is removing stop words, what is the
stop word list?
Erik
On Aug 17, 2004, at 1:56 AM, Leos Literak wrote:
One user reported that if he searches for http AND halt,
the search fails. This can be found in logs:
java.lang.ArrayIndexOutOfBoundsException: -1
Wow, this is an old message.
I managed to get my code to work by using the previous version of
PDFBox. I had used the version of log4j that had come with PDFBox.
Someone had mentioned recompiling log4j, but I couldn't get the project
to import the source into Eclipse, so I gave up. But
Hey Guys.
Apologies..
Some small help needed.
When I run the analyzers on the phrase "New Year" (with quotes) using
the Lucene 1.4 final jar on Windows 2000,
why is the SimpleAnalyzer splitting it into 2 words?
Or
am I missing something here?
Analyzing "New Year"
PDFBox comes with log4j version 1.2.5 (according to MANIFEST.MF in the jar
file); I believe that 1.2.8 is the latest. I will make sure that the next
version of PDFBox includes the latest log4j version, which I assume is
what everybody would like to use.
But, by looking at the below error message it
This is what analyzers do. I don't know of any analyzer that deals
with quotes in the way you're requesting, by keeping the contents
together as a complete token. You'll have to write your own variant
that does this.
QueryParser, however, uses quotes to denote a phrase query, and will
query
Anything is possible.
In a couple of weeks I may be upgrading my code to use Lucene 1.4 and I
will make an attempt to use the latest version of PDFBox.
You may be right about log4j being somewhere else in the classpath, but
being a jar for Jakarta, I couldn't think of any apps on my desktop
Hi
Erik
Apologies...
What I meant to say was that a phrase such as "New Year" (quotes meaning \")
passed to QueryParser.parse(word, contents, analyzer) should return hits
for the full phrase,
but it did not.
So when I did a quick run on Analyzer process and
found that it was splitting the
Karthik,
What you would want to do with the split tokens ( New and Year )
is then create a PhraseQuery containing a Term object for each token.
This should do what you want. As Erik said, QueryParser would have
done this internally, only if you actually sent in the quotes...not
just New Year, but
Further on this, Karthik, is that you need to really understand what
you indexed. For example... take a document that has New Year in it,
and follow it through your indexing process. See what your analyzer at
indexing time actually indexed. And if new year are side-by-side
tokens emitted
On Aug 17, 2004, at 9:23 AM, Karthik N S wrote:
So when I did a quick run on Analyzer process and
found that it was splitting the Word
New Year = [New] [Year]
Am I doing something wrong here?
No... this is what this analyzer does. QueryParser does the same
thing. The difference
Hi
Patrick
I did as Erik suggested in his mail
and searched for the complete phrase \"New Year\",
but the QueryParser still returns hits for "Year" only.
[ The Analyzer I use has 555 English stop words, with "new" among them ]
That's when I checked with the analyzers to verify.
If you look
On Aug 17, 2004, at 9:47 AM, Karthik N S wrote:
I did as Erik suggested in his mail
and searched for the complete phrase \"New Year\",
but the QueryParser still returns hits for "Year" only.
[ The Analyzer I use has 555 English stop words, with "new" among
them ]
No wonder!
That's when I
Wallen,
Which hex editor have you used? I am also facing a
similar problem. I tried to use KHexEdit and it
doesn't seem to help. I am attaching with this email
my segments file. I think only the segment with name
_ung is a valid one, I wanted to delete the
remaining..but couldn't. Can you help?
I think attachments are filtered. This is what I see
when I open in the hex editor.
: 00 04 e0 af 00 00 00 02 05 5f 36 75 6e 67 00 04  ..à¯._6ung..
:0010 1e fb 05 5f 36 75 6e 69 00 00 00 01 00 00 00 00  .û._6uni
:0020 00 00 c1 b4  ..Á´
http://www.ultraedit.com/ is the best!
However, I cannot imagine why another hex editor wouldn't work.
-Original Message-
From: Honey George [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 17, 2004 10:35 AM
To: Lucene Users List
Subject: RE: Restoring a corrupt index
Wallen,
Which hex
Change 02 to be 01 and delete the bytes that represent the one record that
is bad. It was easier to see what a record was in my file because I had
about 30 _files.
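
The manual edit described above can also be sketched in code. This is only an illustration under assumptions read off the hex dump posted earlier in the thread, not a description of Lucene's documented file format: a 4-byte big-endian value, a 4-byte segment count, then for each segment one length byte, the name bytes, and a 4-byte number, followed by trailing bytes that are copied unchanged. The class and method names (SegmentsEditSketch, dropSegment, hex) are hypothetical. Work on a copy of the segments file, never the original.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SegmentsEditSketch {

    /** Returns a copy of the segments bytes without the entry named badName. */
    public static byte[] dropSegment(byte[] segments, String badName) {
        ByteBuffer in = ByteBuffer.wrap(segments);   // ByteBuffer is big-endian by default
        int first = in.getInt();                     // first 4 bytes, kept unchanged
        int count = in.getInt();                     // the "02" in the dump (assumed segment count)
        ByteArrayOutputStream kept = new ByteArrayOutputStream();
        int keptCount = 0;
        for (int i = 0; i < count; i++) {
            int start = in.position();
            int nameLen = in.get() & 0xFF;           // assumed: a single length byte
            byte[] name = new byte[nameLen];
            in.get(name);
            in.getInt();                             // assumed: 4-byte per-segment number
            int end = in.position();
            if (!new String(name, StandardCharsets.US_ASCII).equals(badName)) {
                kept.write(segments, start, end - start);  // keep this record as-is
                keptCount++;
            }
        }
        // Whatever follows the records is copied through untouched.
        byte[] tail = Arrays.copyOfRange(segments, in.position(), segments.length);
        ByteBuffer out = ByteBuffer.allocate(8 + kept.size() + tail.length);
        out.putInt(first).putInt(keptCount).put(kept.toByteArray()).put(tail);
        return out.array();
    }

    /** Parses space-separated hex pairs such as "00 04 e0 af". */
    public static byte[] hex(String s) {
        String[] parts = s.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    public static void main(String[] args) {
        // Bytes copied from the hex dump in the thread.
        byte[] original = hex("00 04 e0 af 00 00 00 02 05 5f 36 75 6e 67 00 04 "
                + "1e fb 05 5f 36 75 6e 69 00 00 00 01 00 00 00 00 00 00 c1 b4");
        byte[] repaired = dropSegment(original, "_6uni");
        StringBuilder sb = new StringBuilder();
        for (byte b : repaired) {
            sb.append(String.format("%02x ", b & 0xFF));
        }
        System.out.println(sb.toString().trim());
    }
}
```

In this sketch the count drops from 02 to 01 and the ten bytes of the _6uni record disappear, which is the same change a hex editor would make by hand.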
-Original Message-
From: Honey George [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 17, 2004 10:39 AM
To: Lucene Users
Hi Guys
Apologies..
Correct me If I am wrong...
During the indexing process, if the Analyzer has the word 'new' in its
'STOPWORD' array, this word is prevented from being indexed
(stopped from indexing).
Then, during a search, I would not get a hit on the word
New
Hmm, while I agree that UltraEdit is the best on Windows, since they
were using KHexEdit, I doubt it's an option for them on Linux
(although I do know it runs fine under Wine).
Patrick
On Tue, 17 Aug 2004 10:39:27 -0400, [EMAIL PROTECTED]
[EMAIL PROTECTED] wrote:
http://www.ultraedit.com/ is
I believe that is correct. So, the word new is never being indexed
since it is a stop word.
Patrick
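
The effect discussed above can be illustrated without Lucene. The following is a plain-Java sketch, not Lucene's Analyzer API: the class StopWordSketch, the method analyze, and the three-word stop list are all hypothetical stand-ins. It lowercases the text, splits on non-letters, and drops stop words, which is roughly what a stop-word-removing analyzer is assumed to do here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopWordSketch {

    /** Splits text on non-letters, lowercases each token, and drops stop words. */
    public static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}]+")) {
            String token = raw.toLowerCase(Locale.ROOT);
            if (!token.isEmpty() && !stopWords.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical stop list; the list in the thread has 555 entries.
        Set<String> stopWords = new HashSet<>(Arrays.asList("new", "the", "a"));

        // With "new" as a stop word, only "year" survives analysis.
        System.out.println(analyze("\"New Year\"", stopWords));        // prints [year]

        // Without stop words, both tokens are kept.
        System.out.println(analyze("\"New Year\"", new HashSet<>()));  // prints [new, year]
    }
}
```

If the same stop list is applied at both index time and query time, "new" is absent on both sides, so a search for the phrase can only match on "year".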
On Tue, 17 Aug 2004 20:26:19 +0530, Karthik N S
[EMAIL PROTECTED] wrote:
Hi Guys
Apologies..
Correct me If I am wrong...
During the indexing process, if the Analyzer has a word
Forward back to list.
-- Forwarded message --
From: Patrick Burleson [EMAIL PROTECTED]
Date: Tue, 17 Aug 2004 11:30:19 -0400
Subject: Re: Swapping Indexes?
To: Stephane James Vaucher [EMAIL PROTECTED]
Stephane,
Thank you for the ideas. I'm going about implementing idea 1 (I like
Hello Lucene developers
A little issue with the Field documentation.
In the Field class, the getBoost() method's comment says:
Returns the boost factor for hits on any field of this document.
I think this comment was copied from the Document class and never changed.
Bye
Ernesto.
On Tue, 17 Aug 2004, Patrick Burleson wrote:
Forward back to list.
-- Forwarded message --
From: Patrick Burleson [EMAIL PROTECTED]
Date: Tue, 17 Aug 2004 11:30:19 -0400
Subject: Re: Swapping Indexes?
To: Stephane James Vaucher [EMAIL PROTECTED]
Stephane,
Thank you
On Tue, 17 Aug 2004 13:17:10 -0400 (EDT), Stephane James Vaucher
Actually, I use an IndexWriter in overwrite mode on the master dir and
merge the temp dir. This cleans up the old master.
I'm a bit of a Lucene newbie here, and I am trying to understand what
you mean by merge the temp dir? Do
On Tue, 17 Aug 2004, Patrick Burleson wrote:
On Tue, 17 Aug 2004 13:17:10 -0400 (EDT), Stephane James Vaucher
Actually, I use an IndexWriter in overwrite mode on the master dir and
merge the temp dir. This cleans up the old master.
I'm a bit of a Lucene newbie here, and I am trying
Thanks Ernesto, I fixed it.
Otis
--- Ernesto De Santis [EMAIL PROTECTED] wrote:
Hello Lucene developers
A little issue with the Field documentation.
In the Field class, the getBoost() method's comment says:
Returns the boost factor for hits on any field of this document.
I think that this comment
I actually thought it might have been trying to use the log4j 1.3 'alpha'
build (there is no 'alpha' build yet, but notionally the latest HEAD isn't
too far from it). There has been a subtle change to log4j in recent months
that could have a similar impact.
Cheers,
Paul Smith
-Original
Hi All,
I am getting an OutOfMemoryError when I deploy my EJB application. To debug the
problem, I wrote the following test program:
public static void main(String[] args) {
try {
Query query = getQuery();
for (int i=0; i<1000; i++) {
Sorry, I should have been clearer in my last email. I have implemented an EJB Session
Bean that executes the Lucene search. At the beginning, the session bean works fine.
It returns the correct search results to me. As more and more search requests are
processed, the server ends up having
On Wednesday 18 August 2004 00:30, Terence Lai wrote:
if (fsDir != null) {
try {
is.close();
} catch (Exception ex) {
}
}
You close is here again, not fsDir. Also, it's a good idea to never ignore
exceptions, you should at least print them
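
The two points in this reply (close each resource, and do not swallow exceptions) can be sketched as follows. This is a minimal illustration, not the poster's actual code: Resource is a hypothetical stand-in for the searcher ("is") and the directory ("fsDir"), and closeAndReport is an assumed helper name. It only addresses cleanup and is not claimed to explain the OutOfMemoryError.

```java
public class CloseSketch {

    /** Hypothetical stand-in for a searcher or directory; it only records that close() ran. */
    public static class Resource {
        private final String name;
        private boolean closed;

        public Resource(String name) {
            this.name = name;
        }

        public void close() throws Exception {
            closed = true;
        }

        public boolean isClosed() {
            return closed;
        }

        public String getName() {
            return name;
        }
    }

    /** Closes one resource and prints any failure instead of ignoring it. */
    public static void closeAndReport(Resource r) {
        if (r == null) {
            return;
        }
        try {
            r.close();
        } catch (Exception ex) {
            ex.printStackTrace(System.out);
        }
    }

    public static void main(String[] args) {
        Resource fsDir = null;
        Resource is = null;
        try {
            fsDir = new Resource("fsDir");
            is = new Resource("is");
            // ... run the search here ...
        } finally {
            // Each variable is closed exactly once, under its own null check.
            closeAndReport(is);
            closeAndReport(fsDir);
        }
        System.out.println(is.getName() + " closed: " + is.isClosed());
        System.out.println(fsDir.getName() + " closed: " + fsDir.isClosed());
    }
}
```

Routing every close through one helper makes it harder to close the wrong variable, which is the slip pointed out in the quoted code.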
Thanks for pointing this out. Even after I fixed the code to close fsDir and added
ex.printStackTrace(System.out), I am still hitting the OutOfMemoryError.
Terence
On Wednesday 18 August 2004 00:30, Terence Lai wrote:
if (fsDir != null) { try {
is.close();