[jira] Created: (LUCENE-1816) exampel code in overview.html uses deprecated syntax

2009-08-17 Thread Daniel Naber (JIRA)
exampel code in overview.html uses deprecated syntax


 Key: LUCENE-1816
 URL: https://issues.apache.org/jira/browse/LUCENE-1816
 Project: Lucene - Java
  Issue Type: Bug
  Components: Javadocs
Affects Versions: 2.9
Reporter: Daniel Naber
Priority: Minor


The examples should use non-deprecated syntax only. I'm attaching a patch, but
other parts of that page might also be out-of-date, which I didn't check now.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-1816) example code in overview.html uses deprecated syntax

2009-08-17 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber updated LUCENE-1816:
-

Summary: example code in overview.html uses deprecated syntax  (was: 
exampel code in overview.html uses deprecated syntax)

 example code in overview.html uses deprecated syntax
 

 Key: LUCENE-1816
 URL: https://issues.apache.org/jira/browse/LUCENE-1816
 Project: Lucene - Java
  Issue Type: Bug
  Components: Javadocs
Affects Versions: 2.9
Reporter: Daniel Naber
Priority: Minor
 Attachments: overview.diff


 The examples should use non-deprecated syntax only. I'm attaching a patch,
 but other parts of that page might also be out-of-date, which I didn't check
 now.




[jira] Updated: (LUCENE-1816) exampel code in overview.html uses deprecated syntax

2009-08-17 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber updated LUCENE-1816:
-

Attachment: overview.diff

 exampel code in overview.html uses deprecated syntax
 

 Key: LUCENE-1816
 URL: https://issues.apache.org/jira/browse/LUCENE-1816
 Project: Lucene - Java
  Issue Type: Bug
  Components: Javadocs
Affects Versions: 2.9
Reporter: Daniel Naber
Priority: Minor
 Attachments: overview.diff


 The examples should use non-deprecated syntax only. I'm attaching a patch,
 but other parts of that page might also be out-of-date, which I didn't check
 now.




[jira] Commented: (LUCENE-1472) DateTools.stringToDate() can cause lock contention under load

2008-12-01 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12652085#action_12652085
 ] 

Daniel Naber commented on LUCENE-1472:
--

Could you try changing the code to create a new object every time and then run
your load test again? We originally did that but it was slower, at least
according to this commit comment from two years ago:

Don't re-create SimpleDateFormat objects, use static ones instead. Gives about 
a 2x performance increase in a micro benchmark.


 DateTools.stringToDate() can cause lock contention under load
 -

 Key: LUCENE-1472
 URL: https://issues.apache.org/jira/browse/LUCENE-1472
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Search
Affects Versions: 2.3.2
Reporter: Mark Lassau
Priority: Minor

 Load testing our application (the JIRA Issue Tracker) has shown that threads 
 spend a lot of time blocked in DateTools.stringToDate().
 The stringToDate() method uses a singleton SimpleDateFormat object to parse 
 the dates.
 Each call to SimpleDateFormat.parse() is *synchronized* because 
 SimpleDateFormat is not thread safe.
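
One commonly suggested way to avoid sharing a single SimpleDateFormat is to keep one instance per thread. The following is only a sketch under assumptions: the class name PerThreadDateParser, the method parseDay and the yyyyMMdd pattern are hypothetical and are not taken from Lucene's DateTools. Whether this is actually faster than a synchronized static instance would still have to be measured under load.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

/** Hypothetical helper (not Lucene's DateTools): one SimpleDateFormat per thread. */
final class PerThreadDateParser {

    // Each thread lazily gets its own formatter, so parse() needs no lock.
    // The pattern and time zone are assumptions made for this example.
    private static final ThreadLocal<SimpleDateFormat> DAY_FORMAT =
        ThreadLocal.withInitial(() -> {
            SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd");
            format.setTimeZone(TimeZone.getTimeZone("GMT"));
            format.setLenient(false);
            return format;
        });

    /** Parses a yyyyMMdd string; malformed input is reported as an unchecked exception. */
    static Date parseDay(String text) {
        try {
            return DAY_FORMAT.get().parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyyMMdd date: " + text, e);
        }
    }

    public static void main(String[] args) {
        // Prints the epoch milliseconds for midnight GMT of the given day.
        System.out.println(parseDay("20081201").getTime());
    }
}
```

The trade-off is one formatter object per thread instead of one per JVM, which is usually acceptable for a bounded thread pool.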




[jira] Commented: (LUCENE-858) link from Lucene web page to API docs

2008-05-18 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12597847#action_12597847
 ] 

Daniel Naber commented on LUCENE-858:
-

Otis, what I meant was a link directly from the main contents of the page to 
the API doc sub-page. I guess you refer to the navigation bar on the left, 
don't you?

This was discussed on the mailing list at
[http://mail-archives.apache.org/mod_mbox/lucene-java-dev/200704.mbox/200704062300.11996%40danielnaber.de]

 link from Lucene web page to API docs
 -

 Key: LUCENE-858
 URL: https://issues.apache.org/jira/browse/LUCENE-858
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Daniel Naber
Assignee: Grant Ingersoll

 There should be a way to link from e.g. 
 http://lucene.apache.org/java/docs/gettingstarted.html to the API docs, but 
 not just to the start page with the frame set but to a specific page, e.g. 
 this:
 http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/overview-summary.html#overview_description
 To make this work a way to set a relative link is needed.




[jira] Closed: (LUCENE-1174) outdated information in Analyzer javadoc

2008-03-01 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber closed LUCENE-1174.


   Resolution: Fixed
Fix Version/s: 2.4

committed

 outdated information in Analyzer javadoc
 

 Key: LUCENE-1174
 URL: https://issues.apache.org/jira/browse/LUCENE-1174
 Project: Lucene - Java
  Issue Type: Bug
  Components: Javadocs
Affects Versions: 2.3
Reporter: Daniel Naber
Priority: Minor
 Fix For: 2.4

 Attachments: analyzer-javadoc.diff


 I'm sure you find more ways to improve the javadoc, so feel free to change 
 and extend my patch.




[jira] Updated: (LUCENE-1174) outdated information in Analyzer javadoc

2008-02-11 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber updated LUCENE-1174:
-

Attachment: analyzer-javadoc.diff

 outdated information in Analyzer javadoc
 

 Key: LUCENE-1174
 URL: https://issues.apache.org/jira/browse/LUCENE-1174
 Project: Lucene - Java
  Issue Type: Bug
  Components: Javadocs
Affects Versions: 2.3
Reporter: Daniel Naber
Priority: Minor
 Attachments: analyzer-javadoc.diff


 I'm sure you find more ways to improve the javadoc, so feel free to change 
 and extend my patch.




[jira] Created: (LUCENE-1174) outdated information in Analyzer javadoc

2008-02-11 Thread Daniel Naber (JIRA)
outdated information in Analyzer javadoc


 Key: LUCENE-1174
 URL: https://issues.apache.org/jira/browse/LUCENE-1174
 Project: Lucene - Java
  Issue Type: Bug
  Components: Javadocs
Affects Versions: 2.3
Reporter: Daniel Naber
Priority: Minor
 Attachments: analyzer-javadoc.diff

I'm sure you find more ways to improve the javadoc, so feel free to change and 
extend my patch.




[jira] Commented: (LUCENE-1170) query with AND and OR not retrieving correct results

2008-02-08 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12567154#action_12567154
 ] 

Daniel Naber commented on LUCENE-1170:
--

It's a known problem with QueryParser, see e.g. LUCENE-167

 query with AND and OR not retrieving correct results
 

 Key: LUCENE-1170
 URL: https://issues.apache.org/jira/browse/LUCENE-1170
 Project: Lucene - Java
  Issue Type: Bug
  Components: QueryParser
Affects Versions: 2.3
 Environment: linux and windows
Reporter: Graham Maloon

 I was working with Lucene 1.4, and have now upgraded to 2.3.0 but there is 
 still a problem that I am experiencing with the Queryparser
  
 I am passing the following queries:
  
 big brother - works fine
 big brother AND dubai - works fine
 big brother AND football - works fine
 big brother AND dubai OR football - returns extra documents which contain 
 big brother but do not contain either dubai or football.
 big brother AND (dubai OR football) gives the same as the one above  
  
 Am I doing something wrong?




[jira] Resolved: (LUCENE-1158) DateTools UTC/GMT mismatch

2008-01-29 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber resolved LUCENE-1158.
--

   Resolution: Fixed
Fix Version/s: 2.4
Lucene Fields: [New, Patch Available]  (was: [Patch Available, New])

Patch applied.

 DateTools UTC/GMT mismatch
 --

 Key: LUCENE-1158
 URL: https://issues.apache.org/jira/browse/LUCENE-1158
 Project: Lucene - Java
  Issue Type: Bug
  Components: Javadocs
Affects Versions: 2.3
Reporter: Daniel Naber
Priority: Minor
 Fix For: 2.4

 Attachments: datetools.diff


 Post from Antony Bowesman on java-user:
 -
 I just noticed that although the Javadocs for Lucene 2.2 state that the dates 
 for DateTools use UTC as a timezone, they are actually using GMT.
 Should either the Javadocs be corrected or the code corrected to use UTC 
 instead.
 -
 I'm attaching a patch that changes the javadoc and will commit it, unless 
 someone knows a reason the javadoc is correct and the code should be changed 
 to UTC. To my understanding, there's no significant difference between UTC 
 and GMT.
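
For what it's worth, the claim that UTC and GMT do not differ for practical purposes can be checked against java.util.TimeZone. A small sketch follows; the class name UtcVersusGmt and the sample instants are made up for illustration, and this is not DateTools code.

```java
import java.util.TimeZone;

/** Hypothetical check: do the "UTC" and "GMT" zones report the same offset? */
final class UtcVersusGmt {

    /** Returns true if both zones have the same offset from UTC at the given instant. */
    static boolean sameOffsetAt(long epochMillis) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        TimeZone gmt = TimeZone.getTimeZone("GMT");
        return utc.getOffset(epochMillis) == gmt.getOffset(epochMillis);
    }

    public static void main(String[] args) {
        // Arbitrary sample instants in epoch milliseconds.
        long[] samples = {0L, 1201305600000L, 1215000000000L};
        for (long instant : samples) {
            System.out.println(instant + " -> " + sameOffsetAt(instant));
        }
    }
}
```

Neither zone observes daylight saving time, so the comparison does not depend on the season of the sampled instant.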




[jira] Updated: (LUCENE-1158) DateTools UTC/GMT mismatch

2008-01-26 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber updated LUCENE-1158:
-

Attachment: datetools.diff

 DateTools UTC/GMT mismatch
 --

 Key: LUCENE-1158
 URL: https://issues.apache.org/jira/browse/LUCENE-1158
 Project: Lucene - Java
  Issue Type: Bug
  Components: Javadocs
Affects Versions: 2.3
Reporter: Daniel Naber
Priority: Minor
 Attachments: datetools.diff


 Post from Antony Bowesman on java-user:
 -
 I just noticed that although the Javadocs for Lucene 2.2 state that the dates 
 for DateTools use UTC as a timezone, they are actually using GMT.
 Should either the Javadocs be corrected or the code corrected to use UTC 
 instead.
 -
 I'm attaching a patch that changes the javadoc and will commit it, unless 
 someone knows a reason the javadoc is correct and the code should be changed 
 to UTC. To my understanding, there's no significant difference between UTC 
 and GMT.




[jira] Created: (LUCENE-1158) DateTools UTC/GMT mismatch

2008-01-26 Thread Daniel Naber (JIRA)
DateTools UTC/GMT mismatch
--

 Key: LUCENE-1158
 URL: https://issues.apache.org/jira/browse/LUCENE-1158
 Project: Lucene - Java
  Issue Type: Bug
  Components: Javadocs
Affects Versions: 2.3
Reporter: Daniel Naber
Priority: Minor
 Attachments: datetools.diff

Post from Antony Bowesman on java-user:

-

I just noticed that although the Javadocs for Lucene 2.2 state that the dates 
for DateTools use UTC as a timezone, they are actually using GMT.

Should either the Javadocs be corrected or the code corrected to use UTC 
instead.

-

I'm attaching a patch that changes the javadoc and will commit it, unless 
someone knows a reason the javadoc is correct and the code should be changed to 
UTC. To my understanding, there's no significant difference between UTC and GMT.





[jira] Commented: (LUCENE-1157) Formatable changes log (CHANGES.txt is easy to edit but not so friendly to read by Lucene users)

2008-01-26 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12562935#action_12562935
 ] 

Daniel Naber commented on LUCENE-1157:
--

It would be nice to have this working with Javascript disabled, i.e. to have 
all items expanded by default in that case. This could be done by displaying 
all items by default and adding this code at the bottom:

  <SCRIPT>
    for (var i = 0; i < document.getElementsByTagName("ol").length; i++) {
      document.getElementsByTagName("ol")[i].style.display = "none";
    }
  </SCRIPT>

Not very clean, but I don't know a better solution for now.


 Formatable changes log  (CHANGES.txt is easy to edit but not so friendly to 
 read by Lucene users)
 -

 Key: LUCENE-1157
 URL: https://issues.apache.org/jira/browse/LUCENE-1157
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Website
Reporter: Doron Cohen
Assignee: Doron Cohen
 Fix For: 2.4

 Attachments: lucene-1157-take2.patch, lucene-1157.patch


 Background in http://www.nabble.com/formatable-changes-log-tt15078749.html




Re: [VOTE] Release Lucene 2.3.0 Take 2

2008-01-22 Thread Daniel Naber
On Tuesday, 22 January 2008, Michael Busch wrote:

 just a reminder: this is a NEW vote. We canceled the first vote because
 with LUCENE-1144 an issue came up that is now fixed in the artifacts.

I ran the test cases, indexed a small collection and tried to access it 
with Luke (my system is OpenSuse 10.3 and Java 1.6.0_03). Everything 
worked fine, so:

+1

-- 
http://www.danielnaber.de




[jira] Created: (LUCENE-1144) NPE crash in case of out of memory

2008-01-20 Thread Daniel Naber (JIRA)
NPE crash in case of out of memory
--

 Key: LUCENE-1144
 URL: https://issues.apache.org/jira/browse/LUCENE-1144
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: 2.3
Reporter: Daniel Naber


The attached class makes Lucene crash with an NPE when starting it with 
-Xmx10M, although there's probably an OutOfMemory problem. The stacktrace:

Exception in thread "main" java.lang.NullPointerException
at java.util.Arrays.fill(Unknown Source)
at 
org.apache.lucene.index.DocumentsWriter$ByteBlockPool.reset(DocumentsWriter.java:2873)
at 
org.apache.lucene.index.DocumentsWriter$ThreadState.resetPostings(DocumentsWriter.java:637)
at 
org.apache.lucene.index.DocumentsWriter.resetPostingsData(DocumentsWriter.java:458)
at 
org.apache.lucene.index.DocumentsWriter.abort(DocumentsWriter.java:423)
at 
org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:2433)
at 
org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.java:2397)
at 
org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1445)
at 
org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1424)
at LuceneCrash.myrun(LuceneCrash.java:32)
at LuceneCrash.main(LuceneCrash.java:19)

The documents are quite big (some hundred KB each), I cannot attach them but I 
can send them via private mail if needed. The crash happens the first time 
reset() is called, after indexing 10 documents. I assume the bug is just that 
the error is misleading, there maybe should be an OOM error.





[jira] Updated: (LUCENE-1144) NPE crash in case of out of memory

2008-01-20 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber updated LUCENE-1144:
-

Attachment: LuceneCrash.java

 NPE crash in case of out of memory
 --

 Key: LUCENE-1144
 URL: https://issues.apache.org/jira/browse/LUCENE-1144
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: 2.3
Reporter: Daniel Naber
 Attachments: LuceneCrash.java


 The attached class makes Lucene crash with an NPE when starting it with 
 -Xmx10M, although there's probably an OutOfMemory problem. The stacktrace:
 Exception in thread "main" java.lang.NullPointerException
   at java.util.Arrays.fill(Unknown Source)
   at 
 org.apache.lucene.index.DocumentsWriter$ByteBlockPool.reset(DocumentsWriter.java:2873)
   at 
 org.apache.lucene.index.DocumentsWriter$ThreadState.resetPostings(DocumentsWriter.java:637)
   at 
 org.apache.lucene.index.DocumentsWriter.resetPostingsData(DocumentsWriter.java:458)
   at 
 org.apache.lucene.index.DocumentsWriter.abort(DocumentsWriter.java:423)
   at 
 org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:2433)
   at 
 org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.java:2397)
   at 
 org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1445)
   at 
 org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1424)
   at LuceneCrash.myrun(LuceneCrash.java:32)
   at LuceneCrash.main(LuceneCrash.java:19)
 The documents are quite big (some hundred KB each), I cannot attach them but 
 I can send them via private mail if needed. The crash happens the first time 
 reset() is called, after indexing 10 documents. I assume the bug is just that 
 the error is misleading, there maybe should be an OOM error.




[jira] Commented: (LUCENE-1144) NPE crash in case of out of memory

2008-01-20 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12560835#action_12560835
 ] 

Daniel Naber commented on LUCENE-1144:
--

Yes, I get the correct exception now with your patch. Thanks!

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at 
org.apache.lucene.index.DocumentsWriter.recyclePostings(DocumentsWriter.java:3033)
at 
org.apache.lucene.index.DocumentsWriter.access$0(DocumentsWriter.java:3028)
at 
org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.resetPostingArrays(DocumentsWriter.java:1333)
at 
org.apache.lucene.index.DocumentsWriter$ThreadState.resetPostings(DocumentsWriter.java:644)
at 
org.apache.lucene.index.DocumentsWriter.resetPostingsData(DocumentsWriter.java:458)
at 
org.apache.lucene.index.DocumentsWriter.abort(DocumentsWriter.java:423)
at 
org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:2433)
at 
org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.java:2397)
at 
org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1445)
at 
org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1424)
at LuceneCrash.myrun(LuceneCrash.java:35)
at LuceneCrash.main(LuceneCrash.java:19)

 NPE crash in case of out of memory
 --

 Key: LUCENE-1144
 URL: https://issues.apache.org/jira/browse/LUCENE-1144
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: 2.3
Reporter: Daniel Naber
 Attachments: LUCENE-1144.patch, LuceneCrash.java


 The attached class makes Lucene crash with an NPE when starting it with 
 -Xmx10M, although there's probably an OutOfMemory problem. The stacktrace:
 Exception in thread "main" java.lang.NullPointerException
   at java.util.Arrays.fill(Unknown Source)
   at 
 org.apache.lucene.index.DocumentsWriter$ByteBlockPool.reset(DocumentsWriter.java:2873)
   at 
 org.apache.lucene.index.DocumentsWriter$ThreadState.resetPostings(DocumentsWriter.java:637)
   at 
 org.apache.lucene.index.DocumentsWriter.resetPostingsData(DocumentsWriter.java:458)
   at 
 org.apache.lucene.index.DocumentsWriter.abort(DocumentsWriter.java:423)
   at 
 org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:2433)
   at 
 org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.java:2397)
   at 
 org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1445)
   at 
 org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1424)
   at LuceneCrash.myrun(LuceneCrash.java:32)
   at LuceneCrash.main(LuceneCrash.java:19)
 The documents are quite big (some hundred KB each), I cannot attach them but 
 I can send them via private mail if needed. The crash happens the first time 
 reset() is called, after indexing 10 documents. I assume the bug is just that 
 the error is misleading, there maybe should be an OOM error.




[jira] Closed: (LUCENE-1113) fix for Document.getBoost() documentation

2008-01-01 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber closed LUCENE-1113.


   Resolution: Fixed
Fix Version/s: 2.3
Lucene Fields: [New, Patch Available]  (was: [Patch Available, New])

Thanks, I've committed your text.

 fix for Document.getBoost() documentation
 -

 Key: LUCENE-1113
 URL: https://issues.apache.org/jira/browse/LUCENE-1113
 Project: Lucene - Java
  Issue Type: Bug
  Components: Javadocs
Affects Versions: 2.2
Reporter: Daniel Naber
Priority: Minor
 Fix For: 2.3

 Attachments: document-getboost.diff


 The attached patch fixes the javadoc to make clear that getBoost() will not
 return a useful value in most cases. I will commit this unless someone has a
 better wording or a real fix.




[jira] Created: (LUCENE-1113) fix for Document.getBoost() documentation

2007-12-31 Thread Daniel Naber (JIRA)
fix for Document.getBoost() documentation
-

 Key: LUCENE-1113
 URL: https://issues.apache.org/jira/browse/LUCENE-1113
 Project: Lucene - Java
  Issue Type: Bug
  Components: Javadocs
Affects Versions: 2.2
Reporter: Daniel Naber
Priority: Minor
 Attachments: document-getboost.diff

The attached patch fixes the javadoc to make clear that getBoost() will not
return a useful value in most cases. I will commit this unless someone has a
better wording or a real fix.




[jira] Updated: (LUCENE-1113) fix for Document.getBoost() documentation

2007-12-31 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber updated LUCENE-1113:
-

Attachment: document-getboost.diff

 fix for Document.getBoost() documentation
 -

 Key: LUCENE-1113
 URL: https://issues.apache.org/jira/browse/LUCENE-1113
 Project: Lucene - Java
  Issue Type: Bug
  Components: Javadocs
Affects Versions: 2.2
Reporter: Daniel Naber
Priority: Minor
 Attachments: document-getboost.diff


 The attached patch fixes the javadoc to make clear that getBoost() will not
 return a useful value in most cases. I will commit this unless someone has a
 better wording or a real fix.




[jira] Commented: (LUCENE-770) CfsExtractor tool

2007-12-26 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554405
 ] 

Daniel Naber commented on LUCENE-770:
-

Otis, I've used it just once and noticed the problem. I'm not sure how to fix 
this problem, I could of course just change the javadoc. But telling people to 
use a hex editor to change some files isn't really a nice solution.

 CfsExtractor tool
 -

 Key: LUCENE-770
 URL: https://issues.apache.org/jira/browse/LUCENE-770
 Project: Lucene - Java
  Issue Type: New Feature
  Components: Index
Affects Versions: 2.1
Reporter: Otis Gospodnetic
Priority: Minor
 Attachments: LUCENE-770.patch


 A tool for extracting the content of a CFS file, in order to go from a 
 compound index to a multi-file index.
 This may be handy for people who want to go back to multi-file index format 
 now that field norms are in a single file - LUCENE-756.
 Most of this code already existed and was hiding in IndexReader.main.
 I'll commit tomorrow, unless I hear otherwise.  I think I should also remove 
 IndexReader.main then.  Ja?




Re: [Lucene-java Wiki] Update of PoweredBy by PietSchmidt

2007-12-22 Thread Daniel Naber
On Tuesday, 18 December 2007, Chris Hostetter wrote:

 not every one is allowed to link back to Lucene ... but i have been
 thinking that we could start making it a policy that if you want to put
 a link to your site on the wiki, you need to have two URLs: a URL
 showing Lucene in use, and a URL where you talk about the code you
 implemented or the hardware you run on (which can easily be a blog post
 or mailing list archive link) ... that would hopefully weed out some of
 the fly by linkers

I'm now adding a text to the PoweredBy Wiki page, feel free to adapt it.

Regards
 Daniel

-- 
http://www.danielnaber.de




[jira] Commented: (LUCENE-770) CfsExtractor tool

2007-12-20 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12553778
 ] 

Daniel Naber commented on LUCENE-770:
-

I think there's a small issue which is also in IndexReader.main: the javadoc 
claims that you need to copy the segments files to make the extracted index 
work, but that's not enough, you will also need to modify the segments file 
because it contains the information whether the index is in compound format or 
not.

 CfsExtractor tool
 -

 Key: LUCENE-770
 URL: https://issues.apache.org/jira/browse/LUCENE-770
 Project: Lucene - Java
  Issue Type: New Feature
  Components: Index
Affects Versions: 2.1
Reporter: Otis Gospodnetic
Priority: Minor
 Attachments: LUCENE-770.patch


 A tool for extracting the content of a CFS file, in order to go from a 
 compound index to a multi-file index.
 This may be handy for people who want to go back to multi-file index format 
 now that field norms are in a single file - LUCENE-756.
 Most of this code already existed and was hiding in IndexReader.main.
 I'll commit tomorrow, unless I hear otherwise.  I think I should also remove 
 IndexReader.main then.  Ja?




Re: [Lucene-java Wiki] Update of PoweredBy by PietSchmidt

2007-12-17 Thread Daniel Naber
On Monday, 17 December 2007, Apache Wiki wrote:

 +  * [http://frauen-kennenlernen.com/ Frauen kennenlernen] - Search
 engine using Lucene

I don't claim that this is spam, but more and more of the Wiki PoweredBy 
links look like someone just wants a link from the Lucene project, 
probably to boost their Google ranking. We cannot tell whether these 
people really use Lucene at all, or if they use some blogging software 
which in turn uses Lucene (in that case it wouldn't make sense to link 
them from our page either).

My suggestion would be that we only accept links if people use Lucene 
directly (not via a software that has a Lucene-based search anyway) and 
that they put a link to Lucene on their imprint/contact page or on the 
search result page. On the other hand, while the page above is harmless, I 
guess it's not necessarily something Apache Lucene needs to be associated 
with.

Any suggestions?

Regards
 Daniel

-- 
http://www.danielnaber.de




Re: Performance Improvement for Search using PriorityQueue

2007-12-10 Thread Daniel Naber
On Monday, 10 December 2007, Michael Busch wrote:

 Reboot your machine ;-) That's what I usually do - if there's another
 way I'd like to know as well!

On Linux (kernel 2.6.16 and later), call:

sync ; echo 3 > /proc/sys/vm/drop_caches

Regards
 Daniel

-- 
http://www.danielnaber.de




[jira] Created: (LUCENE-1084) increase default maxFieldLength?

2007-12-07 Thread Daniel Naber (JIRA)
increase default maxFieldLength?


 Key: LUCENE-1084
 URL: https://issues.apache.org/jira/browse/LUCENE-1084
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 2.2
Reporter: Daniel Naber


To my understanding, Lucene 2.3 will easily index large documents. So shouldn't 
we get rid of the 10,000 default limit for the field length? 10,000 isn't that 
much and as Lucene doesn't have any error logging by default, this is a common 
problem for users that is difficult to debug if you don't know where to look.

A better new default might be Integer.MAX_VALUE.
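
To illustrate why the limit is hard to notice: the sketch below does not use Lucene at all. It only imitates the effect described above, under the assumption that tokens beyond the limit are dropped without any error. The names FieldLengthLimit, indexedTerms and sampleDocument, and the token values, are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical simulation of a per-field token limit; not Lucene code. */
final class FieldLengthLimit {

    /** Keeps at most maxFieldLength tokens and silently ignores the rest. */
    static List<String> indexedTerms(List<String> tokens, int maxFieldLength) {
        int kept = Math.min(tokens.size(), maxFieldLength);
        return new ArrayList<>(tokens.subList(0, kept));
    }

    /** Builds a document of 10,000 filler tokens followed by one interesting token. */
    static List<String> sampleDocument() {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            tokens.add("filler");
        }
        tokens.add("needle");
        return tokens;
    }

    public static void main(String[] args) {
        List<String> doc = sampleDocument();
        // With the 10,000 limit the last token never reaches the index.
        System.out.println(indexedTerms(doc, 10_000).contains("needle"));
        // With Integer.MAX_VALUE nothing is cut off.
        System.out.println(indexedTerms(doc, Integer.MAX_VALUE).contains("needle"));
    }
}
```

A user searching for the last token would simply get no hit and no hint as to why, which is the debugging problem raised in this issue.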





[jira] Commented: (LUCENE-588) Escaped wildcard character in wildcard term not handled correctly

2007-11-28 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12546478
 ] 

Daniel Naber commented on LUCENE-588:
-

The problem is that WildcardQuery itself has no concept of escaped 
characters. The escape characters are removed in QueryParser. This means t?\?t 
will arrive as t??t in WildcardQuery, and the second question mark is also 
interpreted as a wildcard.
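
To make this concrete, here is a small sketch in plain Java. The helpers unescape() and wildcardMatch() are hypothetical and only imitate the two steps; they are not the QueryParser or WildcardQuery code.

```java
// Plain-Java illustration; unescape() and wildcardMatch() are hypothetical helpers, not Lucene methods.
class WildcardEscapeSketch {

    // Mimics a parser that strips the backslash and keeps the escaped character.
    static String unescape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                out.append(s.charAt(++i));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Matcher that only knows '?' (any single character); it has no notion of escaping.
    static boolean wildcardMatch(String pattern, String text) {
        if (pattern.length() != text.length()) {
            return false;
        }
        for (int i = 0; i < pattern.length(); i++) {
            char p = pattern.charAt(i);
            if (p != '?' && p != text.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String userInput = "t?\\?t";           // the user typed t?\?t
        String pattern = unescape(userInput);  // the backslash is gone at this point
        System.out.println(pattern);                        // t??t
        System.out.println(wildcardMatch(pattern, "te?t")); // true, as intended
        System.out.println(wildcardMatch(pattern, "test")); // true, although the second ? was meant literally
    }
}
```

Once the backslash has been removed, the matcher has no way to tell the literal question mark from the wildcard, so a fix would presumably have to keep the escape information until the query is evaluated.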


 Escaped wildcard character in wildcard term not handled correctly
 -

 Key: LUCENE-588
 URL: https://issues.apache.org/jira/browse/LUCENE-588
 Project: Lucene - Java
  Issue Type: Bug
  Components: QueryParser
Affects Versions: 2.0.0
 Environment: Windows XP SP2
Reporter: Sunil Kamath

 If an escaped wildcard character is specified in a wildcard query, it is 
 treated as a wildcard instead of a literal.
 e.g., t\??t is converted by the QueryParser to t??t - the escape character is 
 discarded.




[jira] Issue Comment Edited: (LUCENE-588) Escaped wildcard character in wildcard term not handled correctly

2007-11-28 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12546478
 ] 

[EMAIL PROTECTED] edited comment on LUCENE-588 at 11/28/07 3:27 PM:
---

The problem is that the WildcardQuery itself doesn't have a concept of escaped 
characters. The escape characters are removed in QueryParser. This means 
t?\\?t will arrive as t??t in WildcardQuery and the second question mark is 
also interpreted as a wildcard.


  was (Author: [EMAIL PROTECTED]):
The problem is that the WildcardQuery itself doesn't have a concept of 
escaped characters. The escape characters are removed in QueryParser. This mean 
t?\?t will arrive as t??t in WildcardQuery and the second question mark is 
also interpreted as a wildcard.

  
 Escaped wildcard character in wildcard term not handled correctly
 -

 Key: LUCENE-588
 URL: https://issues.apache.org/jira/browse/LUCENE-588
 Project: Lucene - Java
  Issue Type: Bug
  Components: QueryParser
Affects Versions: 2.0.0
 Environment: Windows XP SP2
Reporter: Sunil Kamath

 If an escaped wildcard character is specified in a wildcard query, it is 
 treated as a wildcard instead of a literal.
 e.g., t\??t is converted by the QueryParser to t??t - the escape character is 
 discarded.




[jira] Commented: (LUCENE-588) Escaped wildcard character in wildcard term not handled correctly

2007-11-28 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12546479
 ] 

Daniel Naber commented on LUCENE-588:
-

Also, the original report and my comment look confusing because Jira removes 
the backslash. Imagine a backslash in front of *one* of the question marks.

 Escaped wildcard character in wildcard term not handled correctly
 -

 Key: LUCENE-588
 URL: https://issues.apache.org/jira/browse/LUCENE-588
 Project: Lucene - Java
  Issue Type: Bug
  Components: QueryParser
Affects Versions: 2.0.0
 Environment: Windows XP SP2
Reporter: Sunil Kamath

 If an escaped wildcard character is specified in a wildcard query, it is 
 treated as a wildcard instead of a literal.
 e.g., t\??t is converted by the QueryParser to t??t - the escape character is 
 discarded.




[jira] Issue Comment Edited: (LUCENE-588) Escaped wildcard character in wildcard term not handled correctly

2007-11-28 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12546478
 ] 

[EMAIL PROTECTED] edited comment on LUCENE-588 at 11/28/07 3:27 PM:
---

The problem is that the WildcardQuery itself doesn't have a concept of escaped 
characters. The escape characters are removed in QueryParser. This means t?\?t 
will arrive as t??t in WildcardQuery and the second question mark is also 
interpreted as a wildcard.


  was (Author: [EMAIL PROTECTED]):
The problem is that the WildcardQuery itself doesn't have a concept of 
escaped characters. The escape characters are removed in QueryParser. This mean 
t?\\?t will arrive as t??t in WildcardQuery and the second question mark is 
also interpreted as a wildcard.

  
 Escaped wildcard character in wildcard term not handled correctly
 -

 Key: LUCENE-588
 URL: https://issues.apache.org/jira/browse/LUCENE-588
 Project: Lucene - Java
  Issue Type: Bug
  Components: QueryParser
Affects Versions: 2.0.0
 Environment: Windows XP SP2
Reporter: Sunil Kamath

 If an escaped wildcard character is specified in a wildcard query, it is 
 treated as a wildcard instead of a literal.
 e.g., t\??t is converted by the QueryParser to t??t - the escape character is 
 discarded.




[jira] Closed: (LUCENE-1045) SortField.AUTO doesn't work with long

2007-11-26 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber closed LUCENE-1045.


   Resolution: Fixed
Fix Version/s: 2.3
Lucene Fields: [New, Patch Available]  (was: [Patch Available, New])

patch applied


 SortField.AUTO doesn't work with long
 -

 Key: LUCENE-1045
 URL: https://issues.apache.org/jira/browse/LUCENE-1045
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Affects Versions: 2.2
Reporter: Daniel Naber
Priority: Minor
 Fix For: 2.3

 Attachments: auto-long-sorting.diff, TestDateSort.java


 This is actually the same as LUCENE-463 but I cannot find a way to re-open 
 that issue. I'm attaching a test case by dragon-fly999 at hotmail com that 
 shows the problem and a patch that seems to fix it.
 The problem is that a long (as used for dates) cannot be parsed as an 
 integer, and the next step is then to parse it as a float, which works but 
 which is not correct. With the patch the following parsers are used in this 
 order: int, long, float.
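
 The parser order can be sketched in plain Java as follows. This is only an illustration: detectType() is a made-up helper and not the code from the attached patch, and the "string" fallback is an assumption.

```java
// Illustrative only: detectType() is a made-up helper, not the code from the attached patch.
class AutoSortTypeSketch {

    // Tries the parsers in the order int, long, float and reports the first one that accepts the value.
    static String detectType(String value) {
        try {
            Integer.parseInt(value);
            return "int";
        } catch (NumberFormatException ignored) {
            // not an int, fall through
        }
        try {
            Long.parseLong(value);
            return "long";
        } catch (NumberFormatException ignored) {
            // not a long, fall through
        }
        try {
            Float.parseFloat(value);
            return "float";
        } catch (NumberFormatException ignored) {
            return "string"; // assumed fallback for non-numeric values
        }
    }

    public static void main(String[] args) {
        System.out.println(detectType("42"));            // int
        System.out.println(detectType("1196035200000")); // long (a millisecond timestamp)
        System.out.println(detectType("1.5"));           // float
        // Without the long step the timestamp is read as a float and neighbouring values collapse:
        System.out.println(Float.parseFloat("1196035200000") == Float.parseFloat("1196035200001")); // true
    }
}
```

 The last line shows why the float fallback "works but is not correct": two different timestamps become the same float, so their sort order is lost.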




[jira] Updated: (LUCENE-1046) Dead code in SpellChecker.java (branch never executes)

2007-11-26 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber updated LUCENE-1046:
-

Attachment: LUCENE-1046.diff

Thanks for your report, could you try out this patch?

 Dead code in SpellChecker.java (branch never executes)
 --

 Key: LUCENE-1046
 URL: https://issues.apache.org/jira/browse/LUCENE-1046
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/*
Affects Versions: 2.2
Reporter: Joe
Priority: Minor
 Attachments: LUCENE-1046.diff


 SpellChecker contains the following lines of code:
 final int goalFreq = (morePopular && ir != null) ? ir.docFreq(new Term(field, word)) : 0;
 // if the word exists in the real index and we don't care for word frequency, return the word itself
 if (!morePopular && goalFreq > 0) {
   return new String[] { word };
 }
 The branch will never execute: the only way for goalFreq to be greater than 
 zero is if morePopular is true, but if morePopular is true, the expression in 
 the if statement evaluates to false.
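
 The claim can be checked by enumerating all cases. The sketch below mirrors only the boolean logic; branchTaken() and the stand-in values hasReader and docFreq are hypothetical names, not SpellChecker members.

```java
// Enumerates the possible cases for the condition quoted above; docFreq is a stand-in value.
class DeadBranchSketch {

    // Mirrors the quoted logic: returns true if the early-return branch would be taken.
    static boolean branchTaken(boolean morePopular, boolean hasReader, int docFreq) {
        int goalFreq = (morePopular && hasReader) ? docFreq : 0;
        return !morePopular && goalFreq > 0;
    }

    public static void main(String[] args) {
        boolean[] flags = { false, true };
        for (boolean morePopular : flags) {
            for (boolean hasReader : flags) {
                // docFreq = 5 pretends the word occurs in the index
                System.out.println(morePopular + " " + hasReader + " -> "
                        + branchTaken(morePopular, hasReader, 5)); // false in all four cases
            }
        }
    }
}
```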




[jira] Commented: (LUCENE-920) IndexModifier has incomplete Javadocs

2007-11-26 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545634
 ] 

Daniel Naber commented on LUCENE-920:
-

I think this bug can be closed, as IndexModifier is deprecated.

 IndexModifier has incomplete Javadocs
 -

 Key: LUCENE-920
 URL: https://issues.apache.org/jira/browse/LUCENE-920
 Project: Lucene - Java
  Issue Type: Wish
  Components: Javadocs
Reporter: Michael Busch
Priority: Trivial

 A lot of public and protected members of 
 org.apache.lucene.index.IndexModifier 
 don't have javadocs.




[jira] Updated: (LUCENE-1066) better explain output

2007-11-25 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber updated LUCENE-1066:
-

Attachment: explain-output.diff

 better explain output
 -

 Key: LUCENE-1066
 URL: https://issues.apache.org/jira/browse/LUCENE-1066
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Query/Scoring
Affects Versions: 2.3
Reporter: Daniel Naber
 Attachments: explain-output.diff


 Very simple patch that slightly improves output of idf: show both docFreq and 
 numDocs.
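
 For context, a sketch of why both numbers are useful in the explanation. It assumes the classic formula idf = 1 + ln(numDocs / (docFreq + 1)); the Similarity in use may differ, and the output format of explain() here is made up, not the one produced by the patch.

```java
// Sketch under the assumption of the classic idf formula; explain() formatting is hypothetical.
class IdfExplainSketch {

    // idf = 1 + ln(numDocs / (docFreq + 1))
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // Shows the value together with the two inputs it was computed from.
    static String explain(int docFreq, int numDocs) {
        return idf(docFreq, numDocs) + " = idf(docFreq=" + docFreq + ", numDocs=" + numDocs + ")";
    }

    public static void main(String[] args) {
        // A term found in 9 of 1000 documents: 1 + ln(100), roughly 5.6
        System.out.println(explain(9, 1000));
    }
}
```

 With docFreq and numDocs printed next to the value, a reader can recompute the idf by hand instead of guessing where it came from.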




[jira] Closed: (LUCENE-1066) better explain output

2007-11-25 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber closed LUCENE-1066.


   Resolution: Fixed
Fix Version/s: 2.3
Lucene Fields: [New, Patch Available]  (was: [Patch Available, New])

applied

 better explain output
 -

 Key: LUCENE-1066
 URL: https://issues.apache.org/jira/browse/LUCENE-1066
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Query/Scoring
Affects Versions: 2.3
Reporter: Daniel Naber
Priority: Trivial
 Fix For: 2.3

 Attachments: explain-output.diff


 Very simple patch that slightly improves output of idf: show both docFreq and 
 numDocs.




[jira] Updated: (LUCENE-1066) better explain output

2007-11-25 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber updated LUCENE-1066:
-

 Priority: Trivial  (was: Major)
Lucene Fields: [New, Patch Available]  (was: [Patch Available, New])

 better explain output
 -

 Key: LUCENE-1066
 URL: https://issues.apache.org/jira/browse/LUCENE-1066
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Query/Scoring
Affects Versions: 2.3
Reporter: Daniel Naber
Priority: Trivial
 Fix For: 2.3

 Attachments: explain-output.diff


 Very simple patch that slightly improves output of idf: show both docFreq and 
 numDocs.




[jira] Commented: (LUCENE-997) Add search timeout support to Lucene

2007-09-14 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12527605
 ] 

Daniel Naber commented on LUCENE-997:
-

Thanks for the patch. I didn't have a very close look, just one small thing: 
it's probably not a good idea to catch and ignore the InterruptedException. See 
http://www-128.ibm.com/developerworks/java/library/j-jtp05236.html
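
A minimal sketch of the usual alternative, in case the exception cannot simply be rethrown: restore the interrupt status so that callers can still see it. The class and method names are made up for this example and are not taken from the patch.

```java
// Sketch of the usual advice: if InterruptedException cannot be rethrown, restore the interrupt flag.
class InterruptSketch {

    // Sleeps for the given time; returns false if interrupted, leaving the interrupt status set.
    static boolean sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // do not swallow: let callers see the interruption
            return false;
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt();       // simulate a pending interrupt
        boolean completed = sleepQuietly(1000);
        System.out.println(completed);            // false
        System.out.println(Thread.interrupted()); // true: the flag was restored (this call clears it)
    }
}
```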

 Add search timeout support to Lucene
 

 Key: LUCENE-997
 URL: https://issues.apache.org/jira/browse/LUCENE-997
 Project: Lucene - Java
  Issue Type: New Feature
Reporter: Sean Timm
Priority: Minor
 Attachments: LuceneTimeoutTest.java, timeout.patch


 This patch is based on Nutch-308. 
 This patch adds support for a maximum search time limit. After this time is 
 exceeded, the search thread is stopped, partial results (if any) are returned 
 and the total number of results is estimated.
 This patch tries to minimize the overhead related to time-keeping by using a 
 version of safe unsynchronized timer.
 This was also discussed in an e-mail thread.
 http://www.nabble.com/search-timeout-tf3410206.html#a9501029




Re: Lucene 2.2.0 release available

2007-06-20 Thread Daniel Naber
On Wednesday 20 June 2007 03:01, Yonik Seeley wrote:

> > FYI, The announcement has not made it to the http://lucene.apache.org/
> > page.
>
> I just committed this.  It should be viewable in about an hour.

The links to the new features don't work for me, I always end up on the API 
overview page. Shouldn't the links be e.g.

http://lucene.apache.org/java/2_2_0/api/org/apache/lucene/document/Field.html

instead of

http://lucene.apache.org/java/2_2_0/api/index.html?org/apache/lucene/document/Field.html
?

Regards
 Daniel

-- 
http://www.danielnaber.de




[jira] Commented: (LUCENE-759) Add n-gram tokenizers to contrib/analyzers

2007-06-01 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500811
 ] 

Daniel Naber commented on LUCENE-759:
-

Can this issue be closed or is there anything still open?


 Add n-gram tokenizers to contrib/analyzers
 --

 Key: LUCENE-759
 URL: https://issues.apache.org/jira/browse/LUCENE-759
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Analysis
Reporter: Otis Gospodnetic
Assignee: Otis Gospodnetic
Priority: Minor
 Fix For: 2.2

 Attachments: LUCENE-759-filters.patch, LUCENE-759.patch, 
 LUCENE-759.patch, LUCENE-759.patch


 It would be nice to have some n-gram-capable tokenizers in contrib/analyzers. 
  Patch coming shortly.




[jira] Commented: (LUCENE-763) LuceneDictionary skips first word in enumeration

2007-06-01 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500863
 ] 

Daniel Naber commented on LUCENE-763:
-

Thanks, Steven. Your javadoc changes have also been committed now.


 LuceneDictionary skips first word in enumeration
 

 Key: LUCENE-763
 URL: https://issues.apache.org/jira/browse/LUCENE-763
 Project: Lucene - Java
  Issue Type: Bug
  Components: Other
Affects Versions: 2.0.0
 Environment: Windows Sun JRE 1.4.2_10_b03
Reporter: Dan Ertman
 Fix For: 2.2

 Attachments: LuceneDictionary.java, TestLuceneDictionary.java


 The current code for LuceneDictionary will always skip the first word of the 
 TermEnum. The reason is that it doesn't initially retrieve TermEnum.term - 
 its first call is to TermEnum.next, which moves it past the first term (line 
 76).
 To see this problem cause a failure, add this test to TestSpellChecker:
 similar = spellChecker.suggestSimilar("eihgt", 2);
   assertEquals(1, similar.length);
   assertEquals(similar[0], "eight");
 Because "eight" is the first word in the index, it will fail.
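
 The pattern behind the bug can be sketched without Lucene. Cursor below is a made-up class that, like the enumeration described above, is already positioned on its first element. It is not the TermEnum API.

```java
import java.util.ArrayList;
import java.util.List;

// Made-up cursor that is already positioned on its first element when created.
class SkipFirstSketch {

    static class Cursor {
        private final String[] terms;
        private int pos = 0;

        Cursor(String... terms) {
            this.terms = terms;
        }

        // Current element, or null when the cursor has moved past the end.
        String term() {
            return pos < terms.length ? terms[pos] : null;
        }

        // Advances the cursor; returns false when there is no further element.
        boolean next() {
            pos++;
            return pos < terms.length;
        }
    }

    // Buggy traversal: advances before reading, so the first term is never seen.
    static List<String> readBuggy(Cursor c) {
        List<String> out = new ArrayList<>();
        while (c.next()) {
            out.add(c.term());
        }
        return out;
    }

    // Fixed traversal: reads the current term first, then advances.
    static List<String> readFixed(Cursor c) {
        List<String> out = new ArrayList<>();
        while (c.term() != null) {
            out.add(c.term());
            c.next();
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(readBuggy(new Cursor("eight", "five", "four"))); // [five, four]
        System.out.println(readFixed(new Cursor("eight", "five", "four"))); // [eight, five, four]
    }
}
```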




[jira] Closed: (LUCENE-763) LuceneDictionary skips first word in enumeration

2007-05-31 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber closed LUCENE-763.
---

   Resolution: Fixed
Fix Version/s: 2.2

Thanks, patch applied.


 LuceneDictionary skips first word in enumeration
 

 Key: LUCENE-763
 URL: https://issues.apache.org/jira/browse/LUCENE-763
 Project: Lucene - Java
  Issue Type: Bug
  Components: Other
Affects Versions: 2.0.0
 Environment: Windows Sun JRE 1.4.2_10_b03
Reporter: Dan Ertman
 Fix For: 2.2

 Attachments: LuceneDictionary.java, TestLuceneDictionary.java


 The current code for LuceneDictionary will always skip the first word of the 
 TermEnum. The reason is that it doesn't initially retrieve TermEnum.term - 
 its first call is to TermEnum.next, which moves it past the first term (line 
 76).
 To see this problem cause a failure, add this test to TestSpellChecker:
 similar = spellChecker.suggestSimilar("eihgt", 2);
   assertEquals(1, similar.length);
   assertEquals(similar[0], "eight");
 Because "eight" is the first word in the index, it will fail.




[jira] Commented: (LUCENE-763) LuceneDictionary skips first word in enumeration

2007-05-30 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500230
 ] 

Daniel Naber commented on LUCENE-763:
-

Thanks for your patch. I think there's a problem with the iterator which might 
not occur often, but it should be fixed nonetheless: calling next() only has an 
effect if hasNext() has been called before. You can see that by commenting out 
assertTrue("Second element doesn't exist.", it.hasNext()); in the test case: 
the test will then fail, although, to my understanding, hasNext() should have 
no side effects. Could you change your patch accordingly?
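
One possible way to get this behaviour is a one-element lookahead that both methods share. The following is only a sketch with made-up names (the Supplier stands in for the term enumeration and returns null when exhausted); it is not the code from the patch.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

// Iterator with a one-element lookahead: next() works without a prior hasNext() call,
// and repeated hasNext() calls do not advance the underlying source.
class LookaheadIteratorSketch implements Iterator<String> {

    private final Supplier<String> source; // returns the next word, or null when exhausted
    private String lookahead;
    private boolean fetched = false;

    LookaheadIteratorSketch(Supplier<String> source) {
        this.source = source;
    }

    // Pulls one element from the source unless one is already buffered.
    private void fetchIfNeeded() {
        if (!fetched) {
            lookahead = source.get();
            fetched = true;
        }
    }

    @Override
    public boolean hasNext() {
        fetchIfNeeded();
        return lookahead != null;
    }

    @Override
    public String next() {
        fetchIfNeeded();
        if (lookahead == null) {
            throw new NoSuchElementException();
        }
        String result = lookahead;
        fetched = false; // the buffered element has been consumed
        return result;
    }

    public static void main(String[] args) {
        Iterator<String> words = Arrays.asList("eight", "five").iterator();
        LookaheadIteratorSketch it =
                new LookaheadIteratorSketch(() -> words.hasNext() ? words.next() : null);
        System.out.println(it.next());    // eight, without calling hasNext() first
        System.out.println(it.hasNext()); // true
        System.out.println(it.hasNext()); // true, repeated calls do not advance
        System.out.println(it.next());    // five
        System.out.println(it.hasNext()); // false
    }
}
```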


 LuceneDictionary skips first word in enumeration
 

 Key: LUCENE-763
 URL: https://issues.apache.org/jira/browse/LUCENE-763
 Project: Lucene - Java
  Issue Type: Bug
  Components: Other
Affects Versions: 2.0.0
 Environment: Windows Sun JRE 1.4.2_10_b03
Reporter: Dan Ertman
 Attachments: LuceneDictionary.java, TestLuceneDictionary.java


 The current code for LuceneDictionary will always skip the first word of the 
 TermEnum. The reason is that it doesn't initially retrieve TermEnum.term - 
 its first call is to TermEnum.next, which moves it past the first term (line 
 76).
 To see this problem cause a failure, add this test to TestSpellChecker:
 similar = spellChecker.suggestSimilar("eihgt", 2);
   assertEquals(1, similar.length);
   assertEquals(similar[0], "eight");
 Because "eight" is the first word in the index, it will fail.




[jira] Closed: (LUCENE-886) spellchecker cleanup

2007-05-26 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber closed LUCENE-886.
---

   Resolution: Fixed
Fix Version/s: 2.2
Lucene Fields: [New, Patch Available]  (was: [Patch Available, New])

committed.

 spellchecker cleanup
 

 Key: LUCENE-886
 URL: https://issues.apache.org/jira/browse/LUCENE-886
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 2.1
Reporter: Daniel Naber
 Fix For: 2.2

 Attachments: spellchecker-cleanup.diff


 Some cleanup, attached here so it can be tracked if necessary: javadoc 
 improvements; don't print exceptions to stderr but re-throw them; new 
 constructor for a new test case. I will commit this soon.




[jira] Closed: (LUCENE-882) Spellchecker doesn't need to store ngrams

2007-05-19 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber closed LUCENE-882.
---

   Resolution: Fixed
Lucene Fields: [New, Patch Available]  (was: [Patch Available, New])

patch applied

 Spellchecker doesn't need to store ngrams
 -

 Key: LUCENE-882
 URL: https://issues.apache.org/jira/browse/LUCENE-882
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Other
Affects Versions: 2.1
Reporter: Daniel Naber
 Attachments: lucene-spellchecker.diff


 The spellchecker in contrib stores the ngrams although this doesn't seem to 
 be necessary. This patch changes that, I will commit it unless someone 
 objects. This improves indexing speed and index size. Some numbers on a small 
 test I did:
 Input of the original index: 2200 text files, index size 5.3 MB, indexing 
 took 17 seconds
 Spell index before patch: about 60,000 documents, index size 13 MB, indexing 
 took 62 seconds
 Spell index after patch: about 60,000 documents, index size 6.3 MB, indexing 
 took 52 seconds
 BTW, the test case fails even before this patch. I'll probably submit another 
 issue about how to fix that.




[jira] Closed: (LUCENE-883) make spell checker test case work again

2007-05-19 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber closed LUCENE-883.
---

   Resolution: Fixed
Fix Version/s: 2.2
Lucene Fields: [New, Patch Available]  (was: [Patch Available, New])

Patch applied.

 make spell checker test case work again
 ---

 Key: LUCENE-883
 URL: https://issues.apache.org/jira/browse/LUCENE-883
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.1
Reporter: Daniel Naber
 Fix For: 2.2

 Attachments: lucene-spellchecker-2.diff


 See attached patch which makes the spellchecker test case work again. The 
 problem without the patch is that consecutive calls to indexDictionary() will 
 create a spelling index with duplicate words. Does anybody see a problem with 
 this patch? I see that the spellchecker code is now used in Solr, isn't it? I 
 didn't have time to test this patch inside Solr.
 Also see http://issues.apache.org/jira/browse/LUCENE-632, but the null check 
 is included in this patch so the NPE described there cannot happen anymore.
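
 The duplicate problem and one way around it can be sketched with a toy index. The names below are illustrative and do not match the contrib SpellChecker API; whether the attached patch works exactly like this is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy spelling index; class and method names are illustrative, not the contrib SpellChecker API.
class SpellIndexSketch {

    private final List<String> entries = new ArrayList<>();
    private final Set<String> known = new HashSet<>();

    // Adds every word that is not yet present, so repeated calls do not create duplicates.
    void indexDictionary(List<String> words) {
        for (String word : words) {
            if (known.add(word)) { // add() returns false if the word is already indexed
                entries.add(word);
            }
        }
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        SpellIndexSketch index = new SpellIndexSketch();
        List<String> dictionary = Arrays.asList("eight", "five", "four");
        index.indexDictionary(dictionary);
        index.indexDictionary(dictionary); // consecutive call with the same words
        System.out.println(index.size());  // 3, not 6
    }
}
```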




[jira] Updated: (LUCENE-403) Alternate Lucene Query Highlighter

2007-05-19 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber updated LUCENE-403:


Assignee: (was: Lucene Developers)
 Summary: Alternate Lucene Query Highlighter  (was: Alternate Lucene Query 
Parser)

fix title

 Alternate Lucene Query Highlighter
 --

 Key: LUCENE-403
 URL: https://issues.apache.org/jira/browse/LUCENE-403
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Other
Affects Versions: 1.4
 Environment: Operating System: All
 Platform: All
Reporter: David Bohl
Priority: Minor
 Attachments: HighlighterTest.java, HighlighterTest.java, 
 QueryHighlighter.java, QueryHighlighter.java, QueryHighlighter.java, 
 QuerySpansExtractor.java


 I created a lucene query highlighter (borrowing some code from the one in
 the sandbox) that my company is using.  It better handles phrase queries,
 doesn't break HTML entities, and has the ability to either highlight terms
 in an entire document or to highlight fragments from the document.  I would 
 like to make it available to anyone who wants it.




[jira] Created: (LUCENE-886) spellchecker cleanup

2007-05-19 Thread Daniel Naber (JIRA)
spellchecker cleanup


 Key: LUCENE-886
 URL: https://issues.apache.org/jira/browse/LUCENE-886
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 2.1
Reporter: Daniel Naber
 Attachments: spellchecker-cleanup.diff

Some cleanup, attached here so it can be tracked if necessary: javadoc 
improvements; don't print exceptions to stderr but re-throw them; new 
constructor for a new test case. I will commit this soon.




[jira] Updated: (LUCENE-886) spellchecker cleanup

2007-05-19 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber updated LUCENE-886:


Attachment: spellchecker-cleanup.diff

cleanup patch

 spellchecker cleanup
 

 Key: LUCENE-886
 URL: https://issues.apache.org/jira/browse/LUCENE-886
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 2.1
Reporter: Daniel Naber
 Attachments: spellchecker-cleanup.diff


 Some cleanup, attached here so it can be tracked if necessary: javadoc 
 improvements; don't print exceptions to stderr but re-throw them; new 
 constructor for a new test case. I will commit this soon.




[jira] Commented: (LUCENE-883) make spell checker test case work again

2007-05-18 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12497014
 ] 

Daniel Naber commented on LUCENE-883:
-

Yes, the exist() method checks whether the reader is null and re-opens it if 
necessary, so reader = null is needed.

 make spell checker test case work again
 ---

 Key: LUCENE-883
 URL: https://issues.apache.org/jira/browse/LUCENE-883
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.1
Reporter: Daniel Naber
 Attachments: lucene-spellchecker-2.diff


 See attached patch which makes the spellchecker test case work again. The 
 problem without the patch is that consecutive calls to indexDictionary() will 
 create a spelling index with duplicate words. Does anybody see a problem with 
 this patch? I see that the spellchecker code is now used in Solr, isn't it? I 
 didn't have time to test this patch inside Solr.
 Also see http://issues.apache.org/jira/browse/LUCENE-632, but the null check 
 is included in this patch so the NPE described there cannot happen anymore.




[jira] Closed: (LUCENE-884) Query Syntax page does not make it clear that wildcard searches are not allowed in Phrase Queries

2007-05-18 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber closed LUCENE-884.
---

Resolution: Fixed

Thanks, this is fixed now (website should update soon).

 Query Syntax page does not make it clear that wildcard searches are not 
 allowed in Phrase Queries
 -

 Key: LUCENE-884
 URL: https://issues.apache.org/jira/browse/LUCENE-884
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Website
Affects Versions: 2.0.1
Reporter: Paul Taylor

 The queryparsersyntax page which is where I expect most novices (such as 
 myself) start with lucene seems to indicate that wildcards can be used in 
 phrase terms
 Quoting:
 'Terms: A query is broken up into terms and operators. There are two types of 
 terms: Single Terms and Phrases.
 A Single Term is a single word such as "test" or "hello".
 A Phrase is a group of words surrounded by double quotes such as "hello dolly".
 
 Wildcard Searches
 Lucene supports single and multiple character wildcard searches.
 To perform a multiple character wildcard search use the * symbol.
 Multiple character wildcard searches looks for 0 or more characters. For 
 example, to search for test, tests or tester, you can use the search:
 test*
 You can also use the wildcard searches in the middle of a term.
 '
 there is nothing to indicate in the section on Wildcard Searches that it can 
 be performed only on Single word terms not Phrase terms.
 Chris argues 'that there is nothing in the description of a Phrase to 
 indicate that it can be anything other than what it says, "a group of words 
 surrounded by double quotes" .. at no point does it
 suggest that other types of queries or syntax can be used inside the quotes.  
 likewise the discussion of Wildcards makes no mention of phrases to suggest 
 that wildcard characters can be used in a phrase.'
 but I don't accept this because there is nothing in the description of a 
 Single Term to indicate that it can use wildcards either. Wildcards are 
 only mentioned in the Wildcard section, and there it says they can be used in 
 a term; it does not restrict the type of term.
 I propose a simple solution. Modify:
 Lucene supports single and multiple character wildcard searches.
 to 
 Lucene supports single and multiple character wildcard searches within single 
 terms.
 (Chris asked for a patch, but I'm not sure how to do this, but the change is 
 simple enough)




Re: Recreating a document from its index

2007-05-17 Thread Daniel Naber
On Thursday 17 May 2007 00:58, Stefano Fornari wrote:

> I have a question that I could not answer by reading the
> documentation and searching the mailing list archive:

This actually belongs more to the user list...  try Luke and click on the 
"Reconstruct & Edit" button, then on the Tokenized tab. This will show 
you what can be recreated. This depends on the stopwords and the other 
normalizations made by the Analyzer.

Regards
 Daniel

-- 
http://www.danielnaber.de




[jira] Created: (LUCENE-882) Spellchecker doesn't need to store ngrams

2007-05-17 Thread Daniel Naber (JIRA)
Spellchecker doesn't need to store ngrams
-

 Key: LUCENE-882
 URL: https://issues.apache.org/jira/browse/LUCENE-882
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Other
Affects Versions: 2.1
Reporter: Daniel Naber
 Attachments: lucene-spellchecker.diff

The spellchecker in contrib stores the ngrams although this doesn't seem to be 
necessary. This patch changes that, I will commit it unless someone objects. 
This improves indexing speed and index size. Some numbers on a small test I did:

Input of the original index: 2200 text files, index size 5.3 MB, indexing took 
17 seconds

Spell index before patch: about 60.000 documents, index size 13 MB, indexing 
took 62 seconds
Spell index after patch: about 60.000 documents, index size 6.3 MB, indexing 
took 52 seconds

BTW, the test case fails even before this patch. I'll probably submit another
issue about how to fix that.





[jira] Updated: (LUCENE-882) Spellchecker doesn't need to store ngrams

2007-05-17 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber updated LUCENE-882:


Attachment: lucene-spellchecker.diff

don't store but only index ngrams

 Spellchecker doesn't need to store ngrams
 -

 Key: LUCENE-882
 URL: https://issues.apache.org/jira/browse/LUCENE-882
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Other
Affects Versions: 2.1
Reporter: Daniel Naber
 Attachments: lucene-spellchecker.diff


 The spellchecker in contrib stores the ngrams although this doesn't seem to 
 be necessary. This patch changes that, I will commit it unless someone 
 objects. This improves indexing speed and index size. Some numbers on a small 
 test I did:
 Input of the original index: 2200 text files, index size 5.3 MB, indexing 
 took 17 seconds
 Spell index before patch: about 60.000 documents, index size 13 MB, indexing 
 took 62 seconds
 Spell index after patch: about 60.000 documents, index size 6.3 MB, indexing 
 took 52 seconds
 BTW, the test case fails even before this patch. I'll probably submit another
 issue about how to fix that.




[jira] Created: (LUCENE-883) make spell checker test case work again

2007-05-17 Thread Daniel Naber (JIRA)
make spell checker test case work again
---

 Key: LUCENE-883
 URL: https://issues.apache.org/jira/browse/LUCENE-883
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.1
Reporter: Daniel Naber
 Attachments: lucene-spellchecker-2.diff

See attached patch which makes the spellchecker test case work again. The
problem without the patch is that consecutive calls to indexDictionary() will 
create a spelling index with duplicate words. Does anybody see a problem with 
this patch? I see that the spellchecker code is now used in Solr, isn't it? I 
didn't have time to test this patch inside Solr.

Also see http://issues.apache.org/jira/browse/LUCENE-632, but the null check is 
included in this patch so the NPE described there cannot happen anymore.
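
The duplicate-word problem can be sketched without Lucene. The following is
only an illustration under assumptions: SpellIndexSketch is a hypothetical
stand-in for the spelling index, and the "skip words that are already
indexed" check is one plausible way to make repeated indexDictionary() calls
safe, not necessarily what the attached patch does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a spelling index; not the contrib SpellChecker.
final class SpellIndexSketch {
    private final List<String> entries = new ArrayList<>(); // one entry per indexed word
    private final Set<String> known = new HashSet<>();      // lookup for "already indexed?"

    // Adds every dictionary word that is not yet in the index.
    void indexDictionary(List<String> words) {
        for (String word : words) {
            // Set.add() returns false if the word was already present.
            if (known.add(word)) {
                entries.add(word);
            }
        }
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        SpellIndexSketch index = new SpellIndexSketch();
        List<String> dictionary = Arrays.asList("lucene", "index", "search");
        index.indexDictionary(dictionary);
        index.indexDictionary(dictionary); // second call adds nothing
        System.out.println(index.size());  // prints 3
    }
}
```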





[jira] Updated: (LUCENE-883) make spell checker test case work again

2007-05-17 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber updated LUCENE-883:


Attachment: lucene-spellchecker-2.diff

patch to make test case work again

 make spell checker test case work again
 ---

 Key: LUCENE-883
 URL: https://issues.apache.org/jira/browse/LUCENE-883
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.1
Reporter: Daniel Naber
 Attachments: lucene-spellchecker-2.diff


 See attached patch which makes the spellchecker test case work again. The
 problem without the patch is that consecutive calls to indexDictionary() will 
 create a spelling index with duplicate words. Does anybody see a problem with 
 this patch? I see that the spellchecker code is now used in Solr, isn't it? I 
 didn't have time to test this patch inside Solr.
 Also see http://issues.apache.org/jira/browse/LUCENE-632, but the null check 
 is included in this patch so the NPE described there cannot happen anymore.




[jira] Closed: (LUCENE-523) FSDirectory.openFile(String) causes ClassCastException

2007-05-12 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber closed LUCENE-523.
---

Resolution: Fixed

openFile had been deprecated in Lucene 1.9 and then later removed, so I'm 
closing this issue.

 FSDirectory.openFile(String) causes ClassCastException
 --

 Key: LUCENE-523
 URL: https://issues.apache.org/jira/browse/LUCENE-523
 Project: Lucene - Java
  Issue Type: Bug
  Components: Store
Affects Versions: 1.9, 2.0.0
 Environment: Lucene 1.9.1
Reporter: Eric Isakson

 When you call FSDirectory.openFile(String) you get a ClassCastException since 
 FSIndexInput is not an org.apache.lucene.store.InputStream
 The workaround is to reimplement using openInput(String). I personally don't 
 need this to be fixed but wanted to document it here in case anyone else runs 
 into this for any reason.
 The reason I'm calling this is that I have a requirement on my project to 
 create read only indexes and name the index segments consistently from one 
 build to the next. So, after creating and optimizing the index, I rename the 
 files and rewrite the segments file. It would be nice if I had an API that 
 would allow me to say I only want one segment and I want its name to be 
 'foo'. For instance IndexWriter.optimize(String segmentName)




[jira] Commented: (LUCENE-523) FSDirectory.openFile(String) causes ClassCastException

2007-05-11 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495163
 ] 

Daniel Naber commented on LUCENE-523:
-

The issue at Jackrabbit is closed, so I guess this can be closed too? I'll do 
so unless someone objects.

 FSDirectory.openFile(String) causes ClassCastException
 --

 Key: LUCENE-523
 URL: https://issues.apache.org/jira/browse/LUCENE-523
 Project: Lucene - Java
  Issue Type: Bug
  Components: Store
Affects Versions: 1.9, 2.0.0
 Environment: Lucene 1.9.1
Reporter: Eric Isakson

 When you call FSDirectory.openFile(String) you get a ClassCastException since 
 FSIndexInput is not an org.apache.lucene.store.InputStream
 The workaround is to reimplement using openInput(String). I personally don't 
 need this to be fixed but wanted to document it here in case anyone else runs 
 into this for any reason.
 The reason I'm calling this is that I have a requirement on my project to 
 create read only indexes and name the index segments consistently from one 
 build to the next. So, after creating and optimizing the index, I rename the 
 files and rewrite the segments file. It would be nice if I had an API that 
 would allow me to say I only want one segment and I want its name to be 
 'foo'. For instance IndexWriter.optimize(String segmentName)




[jira] Created: (LUCENE-858) link from Lucene web page to API docs

2007-04-09 Thread Daniel Naber (JIRA)
link from Lucene web page to API docs
-

 Key: LUCENE-858
 URL: https://issues.apache.org/jira/browse/LUCENE-858
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Daniel Naber
 Assigned To: Grant Ingersoll


There should be a way to link from e.g. 
http://lucene.apache.org/java/docs/gettingstarted.html to the API docs, but not 
just to the start page with the frame set but to a specific page, e.g. this:

http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/overview-summary.html#overview_description

To make this work a way to set a relative link is needed.





Re: linking the API docs

2007-04-07 Thread Daniel Naber
On Saturday 07 April 2007 00:42, Chris Hostetter wrote:

 : I think you can put in the link, just use relative link like in the
 : site.xml.

 using a relative link is *key* ... it ensures not only that the static
 files build by the nightly build work, but also that the docs
 distributed with each release contain good local pointers.

I'm not familiar with forrest, could you help me setting the link?

The pages to be linked are these:
http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/overview-summary.html#overview_description
http://lucene.apache.org/java/2_1_0/api/overview-summary.html#overview_description
(etc)

Note that this is not the API docs page (which contains the frameset) but a 
content page plus an anchor. So I cannot use <a href="ext:javadocs">, but
<a href="ext:javadocs/overview-summary.html#overview_description"> doesn't
work either.

Regards
 Daniel

-- 
http://www.danielnaber.de




linking the API docs

2007-04-06 Thread Daniel Naber
Hi,

we have a short but (I think) useful snippet of example code in our API 
docs:

http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/overview-summary.html#overview_description

We also have the Getting started section on the web site, which only 
refers to the demo and doesn't offer code examples:

http://lucene.apache.org/java/docs/gettingstarted.html

I'd like to link from the Getting Started to the API example. Is it okay 
to just put the above link (lucene.zones.apache.org) in the file or isn't 
that supposed to be stable? If that's not okay, the best thing might be to 
move the code example to the Getting Started.

Regards
 Daniel

-- 
http://www.danielnaber.de




[jira] Commented: (LUCENE-841) Replace UTF8 characters in stemmer code with integer values.

2007-03-21 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12482914
 ] 

Daniel Naber commented on LUCENE-841:
-

Which environments still don't handle UTF-8? Using anything that escapes the 
real characters will make the code difficult to read.


 Replace UTF8 characters in stemmer code with integer values.
 

 Key: LUCENE-841
 URL: https://issues.apache.org/jira/browse/LUCENE-841
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Analysis
Reporter: Karl Wettin
Priority: Critical

 BrazilianStemmer, GermanStemmer, FrenchStemmer and DutchStemmer all contain
 UTF characters in the Java code. Not all environments handle that. It
 really ought to be integer values instead.
 I'll come up with a patch sooner or later.
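
 The trade-off discussed in this issue can be seen in a small sketch. It
 assumes nothing about the actual stemmer code; the class name, constants and
 helper below are hypothetical. It shows that a Unicode escape and an integer
 value name the same character while the source file stays pure ASCII.

```java
// Hypothetical illustration, not taken from any Lucene stemmer.
final class UnicodeEscapeSketch {
    // LATIN SMALL LETTER A WITH DIAERESIS written as a Unicode escape.
    static final char A_UMLAUT_ESCAPED = '\u00e4';
    // The same character written as an integer value.
    static final char A_UMLAUT_INT = (char) 228;

    // Replaces the umlaut with a plain 'a'; a stand-in for the kind of
    // substitution a stemmer might perform.
    static String replaceUmlaut(String term) {
        return term.replace(A_UMLAUT_ESCAPED, 'a');
    }

    public static void main(String[] args) {
        System.out.println(A_UMLAUT_ESCAPED == A_UMLAUT_INT); // prints true
        System.out.println(replaceUmlaut("m\u00e4dchen"));    // prints madchen
    }
}
```

 Both constants compile regardless of the encoding the compiler assumes for
 the file, which is the portability argument; the readability cost is that
 neither form shows the character itself.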




[jira] Created: (LUCENE-795) deprecate Directory.renameFile()

2007-02-05 Thread Daniel Naber (JIRA)
deprecate Directory.renameFile()


 Key: LUCENE-795
 URL: https://issues.apache.org/jira/browse/LUCENE-795
 Project: Lucene - Java
  Issue Type: Bug
  Components: Store
Affects Versions: 2.0.0
Reporter: Daniel Naber
Priority: Minor
 Fix For: 2.1


Copied from my mailing list post so this issue can be tracked (if necessary). I 
will commit a patch.

I see that Directory.renameFile() isn't used anymore. I assume it has only 
been public for technical reasons, not because we expect this to be used 
from outside of Lucene? Should we deprecate this method? Its 
implementation e.g. in FSDirectory looks a bit scary anyway (the comment 
correctly says "This is not atomic" while the abstract class says "This
replacement should be atomic").
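
For comparison, here is a minimal sketch of what an atomic rename attempt can
look like with java.nio.file (Java 7 and later). It is not the
FSDirectory.renameFile() implementation; the class name, the demo() helper
and the file names are hypothetical. Whether an atomic move replaces an
existing target is implementation specific, so the demo renames to a name
that does not exist yet.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch; this is not FSDirectory.renameFile().
final class AtomicRenameSketch {

    // Tries an atomic move first. If the file system cannot do that, falls
    // back to a plain replace, which is NOT guaranteed to be atomic.
    static void rename(Path from, Path to) throws IOException {
        try {
            Files.move(from, to, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            Files.move(from, to, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    // Creates a temporary file, renames it, and reports whether only the
    // new name exists afterwards.
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("rename-sketch");
            Path from = dir.resolve("segments.new");
            Path to = dir.resolve("segments");
            Files.write(from, "data".getBytes(StandardCharsets.UTF_8));
            rename(from, to);
            boolean ok = Files.exists(to) && !Files.exists(from);
            Files.deleteIfExists(to);
            Files.deleteIfExists(dir);
            return ok;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```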





[jira] Closed: (LUCENE-795) deprecate Directory.renameFile()

2007-02-05 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber closed LUCENE-795.
---

Resolution: Fixed

Committed.

 deprecate Directory.renameFile()
 

 Key: LUCENE-795
 URL: https://issues.apache.org/jira/browse/LUCENE-795
 Project: Lucene - Java
  Issue Type: Bug
  Components: Store
Affects Versions: 2.0.0
Reporter: Daniel Naber
Priority: Minor
 Fix For: 2.1


 Copied from my mailing list post so this issue can be tracked (if necessary). 
 I will commit a patch.
 I see that Directory.renameFile() isn't used anymore. I assume it has only 
 been public for technical reasons, not because we expect this to be used 
 from outside of Lucene? Should we deprecate this method? Its 
 implementation e.g. in FSDirectory looks a bit scary anyway (the comment 
 correctly says "This is not atomic" while the abstract class says "This
 replacement should be atomic").




deprecate Directory.renameFile()?

2007-02-03 Thread Daniel Naber
Hi,

I see that Directory.renameFile() isn't used anymore. I assume it has only 
been public for technical reasons, not because we expect this to be used 
from outside of Lucene? Should we deprecate this method? Its 
implementation e.g. in FSDirectory looks a bit scary anyway (the comment 
correctly says "This is not atomic" while the abstract class says "This
replacement should be atomic").

Regards
 Daniel

-- 
http://www.danielnaber.de




[jira] Updated: (LUCENE-781) NPE in MultiReader.isCurrent() and getVersion()

2007-01-29 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber updated LUCENE-781:


Attachment: multireader.diff

updated patch

 NPE in MultiReader.isCurrent() and getVersion()
 ---

 Key: LUCENE-781
 URL: https://issues.apache.org/jira/browse/LUCENE-781
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Reporter: Daniel Naber
 Attachments: multireader.diff, multireader.diff, 
 multireader_test.diff, multireader_test.diff


 I'm attaching a fix for the NPE in MultiReader.isCurrent() plus a testcase. 
 For getVersion(), we should throw a better exception than NPE. I will commit
 unless someone objects or has a better idea.




[jira] Updated: (LUCENE-781) NPE in MultiReader.isCurrent() and getVersion()

2007-01-29 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber updated LUCENE-781:


Attachment: multireader_test.diff

updated patch

 NPE in MultiReader.isCurrent() and getVersion()
 ---

 Key: LUCENE-781
 URL: https://issues.apache.org/jira/browse/LUCENE-781
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Reporter: Daniel Naber
 Attachments: multireader.diff, multireader.diff, 
 multireader_test.diff, multireader_test.diff


 I'm attaching a fix for the NPE in MultiReader.isCurrent() plus a testcase. 
 For getVersion(), we should throw a better exception than NPE. I will commit
 unless someone objects or has a better idea.




[jira] Updated: (LUCENE-781) NPe in MultiReader.isCurrent() and getVersion()

2007-01-22 Thread Daniel Naber (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Naber updated LUCENE-781:


Attachment: multireader.diff

 NPe in MultiReader.isCurrent() and getVersion()
 ---

 Key: LUCENE-781
 URL: https://issues.apache.org/jira/browse/LUCENE-781
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Reporter: Daniel Naber
 Attachments: multireader.diff, multireader_test.diff


 I'm attaching a fix for the NPE in MultiReader.isCurrent() plus a testcase. 
 For getVersion(), we should throw a better exception than NPE. I will commit
 unless someone objects or has a better idea.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira






Re: Lucene Scalability Question

2007-01-08 Thread Daniel Naber
On Monday 08 January 2007 20:33, Ali Salehi wrote:

  1. The search time for simple queries such as precision:\+0002 is
 really high (4-10 seconds). I want to know if this search time is normal

  2. The search gives TooManyClauses exception when I'm searching for a
  data item with the queries similar to the one below :

Please see the FAQ at http://wiki.apache.org/jakarta-lucene/LuceneFAQ:
Why am I getting a TooManyClauses exception?
How do I speed up searching?

If that doesn't help, please re-post your question on the user list.

Regards
 Daniel

-- 
http://www.danielnaber.de




[jira] Commented: (LUCENE-765) Index package level javadocs needs content

2007-01-04 Thread Daniel Naber (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12462203
 ] 

Daniel Naber commented on LUCENE-765:
-

Some of this is already here:
http://lucene.apache.org/java/docs/api/overview-summary.html#overview_description

 Index package level javadocs needs content
 --

 Key: LUCENE-765
 URL: https://issues.apache.org/jira/browse/LUCENE-765
 Project: Lucene - Java
  Issue Type: Task
  Components: Javadocs
Reporter: Grant Ingersoll
Priority: Minor

 The org.apache.lucene.index package level javadocs are sorely lacking.  They 
 should be updated to give a summary of the important classes, how indexing 
 works, etc.  Maybe give an overview of how the different writers coordinate.  
 Links to file formats, information on the posting algorithm, etc. would be 
 helpful.
 See the search package javadocs as a sample of the kind of info that could go 
 here.




Re: access policy for Java Open Review Project

2006-12-27 Thread Daniel Naber
On Wednesday 27 December 2006 01:38, Erik Hatcher wrote:

 I'd be surprised if anyone uses Lucli, given the limited utility it  
 has versus using Luke.

It's actually very useful if you only have ssh access to a machine that has 
no X11 running. I just fixed the small bug found by this review.

Regards
 Daniel

-- 
http://www.danielnaber.de




[jira] Reopened: (LUCENE-707) Lucene Java Site docs

2006-11-29 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-707?page=all ]

Daniel Naber reopened LUCENE-707:
-

 
The link to the image (asf-logo.gif) in the upper left corner is broken (mhh, 
same problem at Nutch site).


 Lucene Java Site docs
 -

 Key: LUCENE-707
 URL: http://issues.apache.org/jira/browse/LUCENE-707
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Website
 Environment: N/A
Reporter: Grant Ingersoll
 Assigned To: Grant Ingersoll
Priority: Minor

 It would be really nice if the Java site docs were consistent with the rest
 of the Lucene family (namely, with navigation tabs, etc.) so that one can 
 easily go between Nutch, Hadoop, etc.




[jira] Commented: (LUCENE-732) Use DateTools instead of deprecated DateField in QueryParser

2006-11-29 Thread Daniel Naber (JIRA)
[ 
http://issues.apache.org/jira/browse/LUCENE-732?page=comments#action_12454449 ] 

Daniel Naber commented on LUCENE-732:
-

I'm not sure if most people use DateTools already, as it has just been added in 
Lucene 1.9. Maybe you could consider an option (yes, yet another option isn't 
nice, I know)? Otherwise we need to properly document how to continue using 
DateField, i.e. by extending QueryParser and overwriting this method I guess.


 Use DateTools instead of deprecated DateField in QueryParser
 

 Key: LUCENE-732
 URL: http://issues.apache.org/jira/browse/LUCENE-732
 Project: Lucene - Java
  Issue Type: Improvement
  Components: QueryParser
Reporter: Michael Busch
 Assigned To: Michael Busch
Priority: Minor
 Attachments: queryparser_datetools.patch


 The QueryParser currently uses the deprecated class DateField to create 
 RangeQueries with date values. However, most users probably use DateTools to 
 store date values in their indexes, because this is the recommended way since 
 DateField has been deprecated. In that case RangeQueries with date values 
 produced by the QueryParser won't work with those indexes.
 This patch replaces the use of DateField in QueryParser by DateTools. Because 
 DateTools can produce date values with different resolutions, this patch adds 
 the following methods to QueryParser:
   /**
    * Sets the default date resolution used by RangeQueries for fields for
    * which no specific date resolution has been set. Field-specific
    * resolutions can be set with
    * {@link #setDateResolution(String, DateTools.Resolution)}.
    *
    * @param dateResolution the default date resolution to set
    */
   public void setDateResolution(DateTools.Resolution dateResolution);

   /**
    * Sets the date resolution used by RangeQueries for a specific field.
    *
    * @param fieldName field for which the date resolution is to be set
    * @param dateResolution date resolution to set
    */
   public void setDateResolution(String fieldName,
       DateTools.Resolution dateResolution);
 (I also added the corresponding getter methods).
 Now the user can set a default date resolution used for all fields or, with 
 the second method, field specific date resolutions.
 The initial default resolution, which is used if the user does not set a 
 different resolution, is DateTools.Resolution.DAY. 
 Please let me know if you think we should use a different resolution as 
 default.
 I extended TestQueryParser to test this new feature.
 All unit tests pass.
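
 Why the resolution matters can be shown with a small sketch in plain Java.
 It does not use Lucene; the class and method names are hypothetical, and the
 "yyyyMMdd" in GMT format is an assumption about what a day-resolution date
 term looks like. The point is that fixed-width date strings sort
 chronologically, so a range check is a plain string comparison, but only if
 the indexed terms and the query bounds use the same resolution.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical sketch; it does not call DateTools or QueryParser.
final class DateResolutionSketch {

    // Formats a date at day resolution as "yyyyMMdd" in GMT (assumed format).
    static String toDayString(Date date) {
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd", Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        return format.format(date);
    }

    // Inclusive range check on formatted terms. This works because the
    // fixed-width strings sort in chronological order.
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        String term = toDayString(new Date(0L)); // the epoch, 1970-01-01 GMT
        System.out.println(term);                                    // prints 19700101
        System.out.println(inRange(term, "19691231", "19700102"));   // prints true
    }
}
```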




[jira] Resolved: (LUCENE-722) DEFAULT spelled DEFALT in MoreLikeThis.java

2006-11-22 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-722?page=all ]

Daniel Naber resolved LUCENE-722.
-

Resolution: Fixed

Okay, unless there's a third version of that file it's fixed now :-)

 DEFAULT spelled DEFALT in MoreLikeThis.java
 ---

 Key: LUCENE-722
 URL: http://issues.apache.org/jira/browse/LUCENE-722
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Affects Versions: 2.0.0
 Environment: all
Reporter: Andi Vajda
Priority: Minor
 Fix For: 2.1


 DEFAULT is spelled DEFALT in 
 contrib/queries/src/java/org/apache/lucene/search/similar/MoreLikeThis.java




[jira] Closed: (LUCENE-656) FieldsInfo uses deprecated API

2006-08-16 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-656?page=all ]

Daniel Naber closed LUCENE-656.
---

Resolution: Fixed

Thanks, patch is committed.


 FieldsInfo uses deprecated API
 --

 Key: LUCENE-656
 URL: http://issues.apache.org/jira/browse/LUCENE-656
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 2.0.1
Reporter: Simon Willnauer
Priority: Minor
 Attachments: FieldsInfo.diff


 The class FieldsInfo.java uses deprecated API in method public void 
 add(Document doc)
 I used the replacement and created the patch - see attachment




[jira] Closed: (LUCENE-649) Fixed Spelling mailinglist.xml

2006-08-16 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-649?page=all ]

Daniel Naber closed LUCENE-649.
---

Resolution: Fixed

Thanks, committed.

 Fixed Spelling mailinglist.xml
 --

 Key: LUCENE-649
 URL: http://issues.apache.org/jira/browse/LUCENE-649
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Website
Affects Versions: 2.0.1
Reporter: Simon Willnauer
Priority: Trivial
 Attachments: mailinglist_xml.diff


 Just fixed some spelling in the mailinglist.xml in /java/trunk/xdocs




[jira] Commented: (LUCENE-388) [PATCH] IndexWriter.maybeMergeSegments() takes lots of CPU resources

2006-08-14 Thread Daniel Naber (JIRA)
[ 
http://issues.apache.org/jira/browse/LUCENE-388?page=comments#action_12427967 ] 

Daniel Naber commented on LUCENE-388:
-

Hi Yonik, I just tested the patch: sorry, but the problem is the same as 
before: I get an OutOfMemoryError using settings that work without the patch. That
doesn't mean that the patch is wrong of course, but as we're after performance 
improvements it wouldn't make sense to compare it to the old version which uses 
less memory.


 [PATCH] IndexWriter.maybeMergeSegments() takes lots of CPU resources
 

 Key: LUCENE-388
 URL: http://issues.apache.org/jira/browse/LUCENE-388
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: CVS Nightly - Specify date in submission
 Environment: Operating System: Mac OS X 10.3
 Platform: Macintosh
Reporter: Paul Smith
 Assigned To: Yonik Seeley
 Attachments: IndexWriter.patch, log-compound.txt, 
 log.optimized.deep.txt, log.optimized.txt, Lucene Performance Test - with  
 without hack.xls, lucene.34930.patch, yonik_indexwriter.diff


 Note: I believe this to be the same situation with 1.4.3 as with SVN HEAD.
 Analysis using the hprof utility during index creation with many
 documents shows that the CPU spends a large portion of its time in
 IndexWriter.maybeMergeSegments(), which seems to be a 'waste' compared with
 other valuable CPU-intensive operations such as tokenization etc.
 I used the following test snippet to retrieve some rows from the db and
 create an index:
     Analyzer a = new StandardAnalyzer();
     writer = new IndexWriter(indexDir, a, true);
     writer.setMergeFactor(1000);
     writer.setMaxBufferedDocs(1);
     writer.setUseCompoundFile(false);
     connection = DriverManager.getConnection(
         "jdbc:inetdae7:tower.aconex.com?database=somedb", "secret",
         "squirrel");
     String sql = "select userid, userfirstname, userlastname, email from userx";
     LOG.info("sql=" + sql);
     Statement statement = connection.createStatement();
     statement.setFetchSize(5000);
     LOG.info("Executing sql");
     ResultSet rs = statement.executeQuery(sql);
     LOG.info("ResultSet retrieved");
     int row = 0;
     LOG.info("Indexing users");
     long begin = System.currentTimeMillis();
     while (rs.next()) {
         int userid = rs.getInt(1);
         String firstname = rs.getString(2);
         String lastname = rs.getString(3);
         String email = rs.getString(4);
         String fullName = firstname + " " + lastname;
         Document doc = new Document();
         doc.add(Field.Keyword("userid", userid + ""));
         doc.add(Field.Keyword("firstname", firstname.toLowerCase()));
         doc.add(Field.Keyword("lastname", lastname.toLowerCase()));
         doc.add(Field.Text("name", fullName.toLowerCase()));
         doc.add(Field.Keyword("email", email.toLowerCase()));
         writer.addDocument(doc);
         row++;
         if ((row % 100) == 0) {
             LOG.info(row + " indexed");
         }
     }
     double end = System.currentTimeMillis();
     double diff = (end - begin) / 1000;
     double rate = row / diff;
     LOG.info("rate:" + rate);
 On my 1.5GHz PowerBook with 1.5 GB RAM and a 5400 RPM drive, my CPU is maxed
 out, and I get an indexing rate of between 490 and 515 documents/second over
 10 runs in succession.
 By applying a simple patch to IndexWriter (to be attached shortly), which defers
 the calling of maybeMergeSegments() so that it is only called every 2000
 times (an arbitrary figure), I appear to get a new rate of between 945 and 970
 documents/second.  Using Luke to look inside the indexes created by these two
 variants, there does not appear to be any difference: same number of
 Documents, same number of Terms.
 I'm not suggesting one should apply this patch; I'm just highlighting the
 difference in performance that this sort of change gives you.
 We are about to use Lucene to index 4 million construction document records,
 so speeding up the indexing process is in our best interest! :)  If one
 considers the amount of CPU time spent in maybeMergeSegments over the initial
 index creation of 4 million documents, I think one could see how it would be
 ideal to try to speed this area up (or at least move the bottleneck to IO).
 I would appreciate anyone taking a moment to comment on this.
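
 The deferral idea described above can be sketched without Lucene at all. The
 class below is a toy model, not the attached patch and not Lucene's
 IndexWriter: the class name, the checkInterval parameter and the close()
 behaviour are assumptions made for illustration. It only counts how often the
 merge check would run.

```java
/** Toy model of a writer that defers its merge check; not Lucene's IndexWriter. */
class DeferredMergeSketch {
    private final int checkInterval; // hypothetical knob; the report used 2000
    private int docsSinceCheck = 0;
    private int mergeChecks = 0;

    DeferredMergeSketch(int checkInterval) {
        this.checkInterval = checkInterval;
    }

    /** Accepts one document and runs the merge check only every checkInterval adds. */
    void addDocument() {
        docsSinceCheck++;
        if (docsSinceCheck >= checkInterval) {
            maybeMergeSegments();
            docsSinceCheck = 0;
        }
    }

    /** Runs one last check so documents added since the previous check are not skipped. */
    void close() {
        if (docsSinceCheck > 0) {
            maybeMergeSegments();
            docsSinceCheck = 0;
        }
    }

    /** Stand-in for the expensive check; here it only counts invocations. */
    private void maybeMergeSegments() {
        mergeChecks++;
    }

    int mergeChecks() {
        return mergeChecks;
    }
}
```

 With an interval of 1 the check runs on every add, which corresponds to the
 behaviour being profiled; with a larger interval the number of checks drops
 proportionally, at the cost of more work left pending between checks.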

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Reopened: (LUCENE-388) [PATCH] IndexWriter.maybeMergeSegments() takes lots of CPU resources

2006-08-13 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-388?page=all ]

Daniel Naber reopened LUCENE-388:
-

 
Something is wrong with this patch (as it has been applied), as it increases 
memory usage. Before, indexing files with the IndexFiles demo worked with 
writer.setMaxBufferedDocs(50) and a tight JVM memory setting (-Xmx2M); now it 
fails with an OutOfMemoryError.


 [PATCH] IndexWriter.maybeMergeSegments() takes lots of CPU resources
 

 Key: LUCENE-388
 URL: http://issues.apache.org/jira/browse/LUCENE-388
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: CVS Nightly - Specify date in submission
 Environment: Operating System: Mac OS X 10.3
 Platform: Macintosh
Reporter: Paul Smith
 Attachments: IndexWriter.patch, log-compound.txt, 
 log.optimized.deep.txt, log.optimized.txt, Lucene Performance Test - with  
 without hack.xls, lucene.34930.patch


 Note: I believe this to be the same situation with 1.4.3 as with SVN HEAD.
 Analysis with the hprof utility shows that, during index creation with many
 documents, the CPU spends a large portion of its time in
 IndexWriter.maybeMergeSegments(), which seems to be a 'waste' compared with
 other valuable CPU-intensive operations such as tokenization.
 I used the following test snippet to retrieve some rows from the db and create
 an index:
 Analyzer a = new StandardAnalyzer();
 writer = new IndexWriter(indexDir, a, true);
 writer.setMergeFactor(1000);
 writer.setMaxBufferedDocs(1);
 writer.setUseCompoundFile(false);
 connection = DriverManager.getConnection(
     "jdbc:inetdae7:tower.aconex.com?database=somedb", "secret",
     "squirrel");
 String sql = "select userid, userfirstname, userlastname, email from userx";
 LOG.info("sql=" + sql);
 Statement statement = connection.createStatement();
 statement.setFetchSize(5000);
 LOG.info("Executing sql");
 ResultSet rs = statement.executeQuery(sql);
 LOG.info("ResultSet retrieved");
 int row = 0;
 LOG.info("Indexing users");
 long begin = System.currentTimeMillis();
 while (rs.next()) {
     int userid = rs.getInt(1);
     String firstname = rs.getString(2);
     String lastname = rs.getString(3);
     String email = rs.getString(4);
     String fullName = firstname + " " + lastname;
     Document doc = new Document();
     doc.add(Field.Keyword("userid", userid + ""));
     doc.add(Field.Keyword("firstname", firstname.toLowerCase()));
     doc.add(Field.Keyword("lastname", lastname.toLowerCase()));
     doc.add(Field.Text("name", fullName.toLowerCase()));
     doc.add(Field.Keyword("email", email.toLowerCase()));
     writer.addDocument(doc);
     row++;
     if ((row % 100) == 0) {
         LOG.info(row + " indexed");
     }
 }
 double end = System.currentTimeMillis();
 double diff = (end - begin) / 1000;
 double rate = row / diff;
 LOG.info("rate:" + rate);
 On my 1.5GHz PowerBook with 1.5 GB RAM and a 5400 RPM drive, my CPU is maxed
 out, and I get an indexing rate of between 490 and 515 documents/second over
 10 runs in succession.
 By applying a simple patch to IndexWriter (to be attached shortly), which defers
 the calling of maybeMergeSegments() so that it is only called every 2000
 times (an arbitrary figure), I appear to get a new rate of between 945 and 970
 documents/second.  Using Luke to look inside the indexes created by these two
 variants, there does not appear to be any difference: same number of
 Documents, same number of Terms.
 I'm not suggesting one should apply this patch; I'm just highlighting the
 difference in performance that this sort of change gives you.
 We are about to use Lucene to index 4 million construction document records,
 so speeding up the indexing process is in our best interest! :)  If one
 considers the amount of CPU time spent in maybeMergeSegments over the initial
 index creation of 4 million documents, I think one could see how it would be
 ideal to try to speed this area up (or at least move the bottleneck to IO).
 I would appreciate anyone taking a moment to comment on this.




-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: svn commit: r428998 - in /lucene/java/trunk: CHANGES.txt src/java/org/apache/lucene/analysis/StopAnalyzer.java src/test/org/apache/lucene/analysis/TestStandardAnalyzer.java

2006-08-05 Thread Daniel Naber
On Samstag 05 August 2006 22:31, Yonik Seeley wrote:

 Stop words and stemming always make literal searching less precise,
 with the general benefit of greater matching power (more general) and
 smaller index size.

That's why I gave the "t-online" example: it makes the search result look 
incorrect but hardly helps reduce index size. "t" and "s" were probably 
added so that "don't" doesn't get indexed as "don", "t", but this doesn't 
happen anyway, as the StandardTokenizer keeps "don't" as a single token. 
's is cut off in StandardFilter.
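
To illustrate the "t-online" point, here is a minimal sketch with a toy 
tokenizer. It is not StandardTokenizer or StopAnalyzer: it simply assumes that 
the text is lower-cased and split on the hyphen, and the class and method names 
are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy tokenizer plus stop filter; an illustration only, not Lucene's analysis chain. */
class StopWordSketch {
    /** Lower-cases the text, splits on anything that is not a letter or digit, and drops stop words. */
    static List<String> tokens(String text, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty() && !stopWords.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }
}
```

With "t" in the stop set, "T-Online" is reduced to the single token "online", 
so a search for it also matches every other document containing "online". 
Without "t" in the set, both tokens survive.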

In general, this is only a default list and people will need to adapt it 
anyway. So we should only add the words which are probably stopwords for 
most users.

Regards
 Daniel

-- 
http://www.danielnaber.de




Re: StopAnalyzer in results.jsp?

2006-08-03 Thread Daniel Naber
On Donnerstag 03 August 2006 19:31, Michael McCandless wrote:

 But, in the process, I came across this inconsistency: for the Web
 application demo, the indexing (done by IndexHTML.java) uses the
 StandardAnalyzer but the searcher (in results.jsp) uses the
 StopAnalyzer.  Shouldn't they be the same?  Shouldn't we change
 results.jsp to use StandardAnalyzer?

Yes, you're right. Thanks for spotting it and for your (upcoming) fixes.

Regards
 Daniel

-- 
http://www.danielnaber.de




[jira] Resolved: (LUCENE-646) [PATCH] fix various small issues with the getting started demo pages

2006-08-03 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-646?page=all ]

Daniel Naber resolved LUCENE-646.
-

Resolution: Fixed

Thanks, the patch has been committed and the changes should soon be visible on 
the web pages.


 [PATCH] fix various small issues with the getting started demo pages
 --

 Key: LUCENE-646
 URL: http://issues.apache.org/jira/browse/LUCENE-646
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Website
Affects Versions: 2.0.0
Reporter: Michael McCandless
Priority: Minor
 Attachments: gettingstarted.Aug3.patch


 This patch contains numerous small fixes for the getting started
 pages on the Lucene Java web site.  Here are the rough fixes:
   * To results.jsp:
     - changed StopAnalyzer -> StandardAnalyzer
     - changed references of "url" to "path" (field "url" is never set
       and was therefore always null)
     - removed prefix of "../webapps" from path so clicking through works
   * Fixed typos, grammar and other cosmetic things.
   * Modernized some things that have changed with time (names of JAR
     files, which languages have analyzers, etc.)
   * Added outbound links to Javadocs, Wiki, Lucene static web site,
     external sites, when appropriate.
   * Removed exact version of Tomcat for the demo web app (I think all
     recent versions of Tomcat will work as described)
   * Other small changes...
 Net/net I think this is an improved version of what's available on the
 site today.




[jira] Resolved: (LUCENE-641) maxFieldLength actual limit is 1 greater than expected value.

2006-07-30 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-641?page=all ]

Daniel Naber resolved LUCENE-641.
-

Resolution: Fixed

Thanks for the report, this has now been fixed in SVN trunk.


 maxFieldLength actual limit is 1 greater than expected value.
 -

 Key: LUCENE-641
 URL: http://issues.apache.org/jira/browse/LUCENE-641
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: 2.0.0
 Environment: JSE 5.0
Reporter: Topbit Du
Priority: Minor

 // Prepare document.
 Document document = new Document();
 document.add(new Field("name",
     "pattern oriented software architecture", Store.NO,
     Index.TOKENIZED, TermVector.WITH_POSITIONS_OFFSETS));
 // Set max field length to 2.
 indexWriter.setMaxFieldLength(2);
 // Add document into index.
 indexWriter.addDocument(document, new StandardAnalyzer());
 // Create a query.
 QueryParser queryParser = new QueryParser("name", new StandardAnalyzer());
 Query query = queryParser.parse("software");
 // Search the 3rd term.
 Hits hits = indexSearcher.search(query);
 Assert.assertEquals(0, hits.length());
 // Fails: actual hits.length() == 1, but expected 0.
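
 The symptom above is what a misplaced limit check would produce. The sketch
 below shows one plausible shape of such an off-by-one and a corrected loop. It
 is an assumption for illustration, not the actual Lucene source or the fix that
 was committed, and all names are made up.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of truncating a token stream at a maximum field length. */
class MaxFieldLengthSketch {
    /** Checks the limit only after adding, with '>', so max + 1 tokens get through. */
    static List<String> indexBuggy(List<String> tokens, int maxFieldLength) {
        List<String> indexed = new ArrayList<>();
        int length = 0;
        for (String t : tokens) {
            indexed.add(t);
            if (++length > maxFieldLength) {
                break; // too late: the extra token is already indexed
            }
        }
        return indexed;
    }

    /** Checks the limit before adding, so at most max tokens are indexed. */
    static List<String> indexFixed(List<String> tokens, int maxFieldLength) {
        List<String> indexed = new ArrayList<>();
        for (String t : tokens) {
            if (indexed.size() >= maxFieldLength) {
                break;
            }
            indexed.add(t);
        }
        return indexed;
    }
}
```

 For the four tokens of the test document and a limit of 2, the first variant
 indexes "software" as a third token, matching the report; the second stops
 after "oriented".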




[jira] Commented: (LUCENE-603) index optimize problem

2006-07-30 Thread Daniel Naber (JIRA)
[ 
http://issues.apache.org/jira/browse/LUCENE-603?page=comments#action_12424388 ] 

Daniel Naber commented on LUCENE-603:
-

Is there any chance you could provide a test case that demonstrates this 
problem?

 index optimize problem
 --

 Key: LUCENE-603
 URL: http://issues.apache.org/jira/browse/LUCENE-603
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: 1.9
 Environment: CentOS 4.0 , Lucene 1.9, Eclipse 3.1
Reporter: Dedian Guo

 I have a function that loops to index batches of documents; after each 
 batch is indexed, IndexWriter.optimize() is applied. After several 
 iterations (not sure how many, but it should be many), the following 
 exception was thrown:
 Exception in thread "Thread-0" java.lang.IllegalStateException: docs out of order
   at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:335)
   at org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:298)
   at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:272)
   at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:236)
   at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:89)
   at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:681)
   at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:658)
   at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:517)




[jira] Closed: (LUCENE-638) Can't put non-index files (e.g. CVS, SVN directories) in a Lucene index directory

2006-07-29 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-638?page=all ]

Daniel Naber closed LUCENE-638.
---

Resolution: Fixed

Thanks, this has now been fixed in trunk.


 Can't put non-index files (e.g. CVS, SVN directories) in a Lucene index 
 directory
 -

 Key: LUCENE-638
 URL: http://issues.apache.org/jira/browse/LUCENE-638
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Eleanor Joslin
Priority: Minor
 Attachments: LuceneTest.java


 Lucene won't tolerate foreign files in its index directories.  This makes it 
 impossible to keep an index in a CVS or Subversion repository.
 For instance, this exception appears when creating a RAMDirectory from a 
 java.io.File that contains a subdirectory called .svn.
 java.io.FileNotFoundException: /home/local/ejj/ic/.caches/.search/.index/.svn (Is a directory)
   at java.io.RandomAccessFile.open(Native Method)
   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
   at org.apache.lucene.store.FSIndexInput$Descriptor.<init>(FSDirectory.java:425)
   at org.apache.lucene.store.FSIndexInput.<init>(FSDirectory.java:434)
   at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:324)
   at org.apache.lucene.store.RAMDirectory.<init>(RAMDirectory.java:61)
   at org.apache.lucene.store.RAMDirectory.<init>(RAMDirectory.java:86)
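
 One way to tolerate such entries is to skip anything that is not a regular
 file when listing the directory. The sketch below uses plain java.io rather
 than Lucene's own classes; the class and method names are made up, and this is
 not necessarily how the fix in trunk works.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Lists only the regular files of a directory; an illustration, not Lucene code. */
class IndexDirListingSketch {
    /** Returns the sorted names of regular files in dir; subdirectories such as .svn or CVS are skipped. */
    static List<String> regularFileNames(File dir) {
        List<String> names = new ArrayList<>();
        File[] entries = dir.listFiles();
        if (entries == null) {
            return names; // dir is not a directory, or an I/O error occurred
        }
        for (File f : entries) {
            if (f.isFile()) {
                names.add(f.getName());
            }
        }
        Collections.sort(names);
        return names;
    }
}
```

 Code that copies an on-disk index into memory could iterate over this list
 instead of the raw directory listing, so that a version-control subdirectory
 is never opened as if it were an index file.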




[jira] Commented: (LUCENE-638) Can't put non-index files (e.g. CVS, SVN directories) in a Lucene index directory

2006-07-27 Thread Daniel Naber (JIRA)
[ 
http://issues.apache.org/jira/browse/LUCENE-638?page=comments#action_12423893 ] 

Daniel Naber commented on LUCENE-638:
-

What exactly does your code look like? Something else must be wrong because I 
use an index that's committed to CVS without problems (using Lucene 2.0).


 Can't put non-index files (e.g. CVS, SVN directories) in a Lucene index 
 directory
 -

 Key: LUCENE-638
 URL: http://issues.apache.org/jira/browse/LUCENE-638
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Eleanor Joslin
Priority: Minor

 Lucene won't tolerate foreign files in its index directories.  This makes it 
 impossible to keep an index in a CVS or Subversion repository.
 For instance, this exception appears when creating a RAMDirectory from a 
 java.io.File that contains a subdirectory called .svn.
 java.io.FileNotFoundException: /home/local/ejj/ic/.caches/.search/.index/.svn (Is a directory)
   at java.io.RandomAccessFile.open(Native Method)
   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
   at org.apache.lucene.store.FSIndexInput$Descriptor.<init>(FSDirectory.java:425)
   at org.apache.lucene.store.FSIndexInput.<init>(FSDirectory.java:434)
   at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:324)
   at org.apache.lucene.store.RAMDirectory.<init>(RAMDirectory.java:61)
   at org.apache.lucene.store.RAMDirectory.<init>(RAMDirectory.java:86)




[jira] Closed: (LUCENE-634) QueryParser is not applicable for the arguments (String, String, Analyzer) error in results.jsp when executing search in the browser (demo from Lucene 2.0)

2006-07-25 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-634?page=all ]

Daniel Naber closed LUCENE-634.
---

Fix Version/s: 2.0.1
   Resolution: Fixed

This has been fixed after 2.0.


 QueryParser is not applicable for the arguments (String, String, Analyzer) 
 error in results.jsp when executing search in the browser (demo from Lucene 
 2.0)
 ---

 Key: LUCENE-634
 URL: http://issues.apache.org/jira/browse/LUCENE-634
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.0.0
 Environment: Windows XP
 Tomcat 5.5 
Reporter: Aliaksandr Birukou
 Fix For: 2.0.1


 When executing a search in the browser (as described in the demo3.html Lucene 
 demo) I get an error, because the demo uses a method (QueryParser.parse with 
 three arguments) which has been deleted (it was deprecated).
 I checked the demo from Lucene 1.4-final with Lucene 1.4-final and it works, 
 because at that time the method was still there.
 But the demo from Lucene 2.0 does not work with Lucene 2.0.
 The error stack is here:
 HTTP Status 500 -
 type Exception report
 message
 description The server encountered an internal error () that prevented it 
 from fulfilling this request.
 exception
 org.apache.jasper.JasperException: Unable to compile class for JSP
 An error occurred at line: 60 in the jsp file: /results.jsp
 Generated servlet error:
 The method parse(String) in the type QueryParser is not applicable for the 
 arguments (String, String, Analyzer)
 org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:510)
 org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:375)
 org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
 org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
 javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
 root cause
 org.apache.jasper.JasperException: Unable to compile class for JSP
 An error occurred at line: 60 in the jsp file: /results.jsp
 Generated servlet error:
 The method parse(String) in the type QueryParser is not applicable for the 
 arguments (String, String, Analyzer)
 org.apache.jasper.compiler.DefaultErrorHandler.javacError(DefaultErrorHandler.java:84)
 org.apache.jasper.compiler.ErrorDispatcher.javacError(ErrorDispatcher.java:328)
 org.apache.jasper.compiler.JDTCompiler.generateClass(JDTCompiler.java:409)
 org.apache.jasper.compiler.Compiler.compile(Compiler.java:297)
 org.apache.jasper.compiler.Compiler.compile(Compiler.java:276)
 org.apache.jasper.compiler.Compiler.compile(Compiler.java:264)
 org.apache.jasper.JspCompilationContext.compile(JspCompilationContext.java:563)
 org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:303)
 org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
 org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
 javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
 note The full stack trace of the root cause is available in the Apache 
 Tomcat/5.5.15 logs.




ant javacc-QueryParser

2006-07-23 Thread Daniel Naber
Hi,

As I cannot get "ant javacc-QueryParser" working, I manually applied the 
changes from my latest commit to QueryParser.java. The change was very 
simple, so I think this should be okay. Maybe someone can run "ant 
javacc-QueryParser" just to be sure.

Regards
 Daniel

-- 
http://www.danielnaber.de




Re: ant javacc-QueryParser

2006-07-23 Thread Daniel Naber
On Sonntag 23 Juli 2006 15:02, Simon Willnauer wrote:

 Did you set the property in your common-build.xml?

Yes, but I always get "Could not create task or type of type: javacc". I 
use a javacc that I downloaded and installed (i.e. unpacked) manually.

Regards
 Daniel

-- 
http://www.danielnaber.de




Re: svn commit: r424449 - /lucene/java/trunk/src/java/org/apache/lucene/document/DateTools.java

2006-07-22 Thread Daniel Naber
On Samstag 22 Juli 2006 07:58, Chris Hostetter wrote:

 however i'm not sure if the
 performance benefits of the static instances Daniel mentioned in his
 commit will exist in a multithreaded app (the synchronization costs may
 outweigh the instantiation costs)

I created a micro benchmark with 2 to 4 threads and the new version was 
faster by a factor of at least 2.
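
The two approaches being compared can be sketched as follows: one shared 
static formatter guarded by synchronization, and a fresh formatter per call. 
This is an illustration only; the class name, the "yyyyMMdd" pattern and the 
GMT time zone are assumptions, not taken from the DateTools commit.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Shared, synchronized formatter versus a new formatter per call. */
class DateFormatSketch {
    private static final TimeZone GMT = TimeZone.getTimeZone("GMT");

    // SimpleDateFormat is not thread-safe, so the shared instance needs a lock.
    private static final SimpleDateFormat SHARED = newFormat();

    private static SimpleDateFormat newFormat() {
        SimpleDateFormat f = new SimpleDateFormat("yyyyMMdd", Locale.US);
        f.setTimeZone(GMT);
        return f;
    }

    /** Reuses one instance; pays for synchronization instead of construction. */
    static String formatShared(Date date) {
        synchronized (SHARED) {
            return SHARED.format(date);
        }
    }

    /** Builds a new formatter on every call; no locking, but more allocation. */
    static String formatFresh(Date date) {
        return newFormat().format(date);
    }
}
```

Both methods return the same string for the same date, so a benchmark can call 
either from several threads and compare only the elapsed time.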

Regards
 Daniel

-- 
http://www.danielnaber.de




[jira] Closed: (LUCENE-630) results.jsp in luceneweb.war uses unknown parse-Method

2006-07-19 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-630?page=all ]

Daniel Naber closed LUCENE-630.
---

Fix Version/s: 2.0.1
   Resolution: Fixed

This has been fixed some time ago (after the 2.0 release).


 results.jsp in luceneweb.war uses unknown parse-Method
 --

 Key: LUCENE-630
 URL: http://issues.apache.org/jira/browse/LUCENE-630
 Project: Lucene - Java
  Issue Type: Bug
  Components: Examples
Affects Versions: 2.0.0
 Environment: Windows XP Pro and Linux (Ubuntu 6.06 TLS)
 Tomcat 5.5
 Sun Java 1.5_07
Reporter: Philip Reimer
Priority: Trivial
 Fix For: 2.0.1


 results.jsp in luceneweb.war demo throws JasperException:
 org.apache.jasper.JasperException: Unable to compile class for JSP
 An error occurred at line: 60 in the jsp file: /results.jsp
 Generated servlet error:
 The method parse(String) in the type QueryParser is not applicable for the 
 arguments (String, String, Analyzer)
 I think the code in line 81 of results.jsp should maybe look like the 
 following:
 QueryParser qp = new QueryParser("contents", analyzer);
 query = qp.parse(queryString);




[jira] Closed: (LUCENE-101) Selecting a language-specific analyzer according to a locale.

2006-07-18 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-101?page=all ]

Daniel Naber closed LUCENE-101.
---

Resolution: Fixed

Closing, the code changes the original report talks about don't seem to be 
needed anymore today.


 Selecting a language-specific analyzer according to a locale.
 -

 Key: LUCENE-101
 URL: http://issues.apache.org/jira/browse/LUCENE-101
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Analysis
Affects Versions: unspecified
 Environment: Operating System: other
 Platform: Other
Reporter: Eric Isakson
Priority: Minor

 Moved from todo.xml:
 Currently we have to rewrite parts of the Lucene code in order to use another 
 analyzer. It would be useful to select an analyzer without touching code.
 This was originally requested by Kazuhiro Kazama ([EMAIL PROTECTED]) in
 http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-
 [EMAIL PROTECTED]msgId=338928
 Not sure if this was completed to Kazuhiro Kazama's satisfaction in the 
 current CVS. We can certainly choose which analyzer to use for a given 
 IndexWriter and QueryParser. It sounded like he was asking for something like 
 a factory that would create an analyzer based on a locale, but unless I 
 misunderstand things, searching an index with any analyzer other than the one 
 you created the index with is bound to cause false hits in your results.
 Perhaps this is fixed or no action should be taken. Can someone with a better 
 understanding of the request comment on this one or close it out?




[jira] Resolved: (LUCENE-608) deprecate Document.fields(), add getFields()

2006-06-22 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-608?page=all ]
 
Daniel Naber resolved LUCENE-608:
-

Resolution: Fixed

The patch has been committed.


 deprecate Document.fields(), add getFields()
 

  Key: LUCENE-608
  URL: http://issues.apache.org/jira/browse/LUCENE-608
  Project: Lucene - Java
 Type: Improvement

   Components: Other
 Versions: 2.0.0
 Reporter: Daniel Naber
  Fix For: 2.1
  Attachments: document.diff

 A simple API improvement that I'm going to commit if nobody objects.
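
 The shape of such a change can be sketched with a stand-in class. DocSketch
 below is hypothetical and stores plain strings instead of Field objects;
 returning an unmodifiable list is a design choice of this sketch, not
 something stated in the issue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

/** Hypothetical stand-in for a document class; field values are plain strings here. */
class DocSketch {
    private final List<String> fields = new ArrayList<>();

    void add(String field) {
        fields.add(field);
    }

    /**
     * Old-style accessor, kept so existing callers still compile.
     * @deprecated use {@link #getFields()} instead
     */
    @Deprecated
    Enumeration<String> fields() {
        return Collections.enumeration(fields);
    }

    /** New accessor: a read-only List view that works with for-each loops. */
    List<String> getFields() {
        return Collections.unmodifiableList(fields);
    }
}
```

 Keeping the old method as a deprecated wrapper over the same data lets callers
 migrate at their own pace before it is removed in a later release.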




[jira] Created: (LUCENE-608) deprecate Document.fields(), add getFields()

2006-06-20 Thread Daniel Naber (JIRA)
deprecate Document.fields(), add getFields()


 Key: LUCENE-608
 URL: http://issues.apache.org/jira/browse/LUCENE-608
 Project: Lucene - Java
Type: Improvement

  Components: Other  
Versions: 2.0.0
Reporter: Daniel Naber
 Fix For: 2.1
 Attachments: document.diff

A simple API improvement that I'm going to commit if nobody objects.




[jira] Updated: (LUCENE-608) deprecate Document.fields(), add getFields()

2006-06-20 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-608?page=all ]

Daniel Naber updated LUCENE-608:


Attachment: document.diff

 deprecate Document.fields(), add getFields()
 

  Key: LUCENE-608
  URL: http://issues.apache.org/jira/browse/LUCENE-608
  Project: Lucene - Java
 Type: Improvement

   Components: Other
 Versions: 2.0.0
 Reporter: Daniel Naber
  Fix For: 2.1
  Attachments: document.diff

 A simple API improvement that I'm going to commit if nobody objects.




[jira] Updated: (LUCENE-590) Demo HTML parser gives incorrect summaries when title is repeated as a heading

2006-06-15 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-590?page=all ]

Daniel Naber updated LUCENE-590:


Description: 
If you have an html document where the title is repeated as a heading at the 
top of the document, the HTMLParser will return the title as the summary, 
ignoring everything else that was added to the summary. Instead, it should keep 
the rest of the summary and chop off the title part at the beginning 
(essentially the opposite). I don't see any benefit to repeating the title in 
the summary for any case.

In HTMLParser.jj's getSummary():

String sum = summary.toString().trim();
String tit = getTitle();
if (sum.startsWith(tit) || sum.equals(""))
  return tit;
else
  return sum;

change it to: (* denotes a line that has changed)

String sum = summary.toString().trim();
String tit = getTitle();
*if (sum.startsWith(tit)) // don't repeat title in summary
*  return sum.substring(tit.length()).trim();
else
  return sum;


  was:

If you have an html document where the title is repeated as a heading at the 
top of the document, the HTMLParser will return the title as the summary, 
ignoring everything else that was added to the summary. Instead, it should keep 
the rest of the summary and chop off the title part at the beginning 
(essentially the opposite). I don't see any benefit to repeating the title in 
the summary for any case.

In HTMLParser.jj's getSummary():

String sum = summary.toString().trim();
String tit = getTitle();
if (sum.startsWith(tit) || sum.equals(""))
  return tit;
else
  return sum;

change it to: (* denotes a line that has changed)

String sum = summary.toString().trim();
String tit = getTitle();
*if (sum.startsWith(tit)) // don't repeat title in summary
*  return sum.substring(tit.length()).trim();
else
  return sum;


   Priority: Minor  (was: Major)

decrease priority (affects demo only)

 Demo HTML parser gives incorrect summaries when title is repeated as a heading
 --

  Key: LUCENE-590
  URL: http://issues.apache.org/jira/browse/LUCENE-590
  Project: Lucene - Java
 Type: Bug

   Components: Examples
 Versions: 2.0.0
 Reporter: Curtis d'Entremont
 Priority: Minor


 If you have an html document where the title is repeated as a heading at the 
 top of the document, the HTMLParser will return the title as the summary, 
 ignoring everything else that was added to the summary. Instead, it should 
 keep the rest of the summary and chop off the title part at the beginning 
 (essentially the opposite). I don't see any benefit to repeating the title in 
 the summary for any case.
 In HTMLParser.jj's getSummary():
 String sum = summary.toString().trim();
 String tit = getTitle();
 if (sum.startsWith(tit) || sum.equals(""))
   return tit;
 else
   return sum;
 change it to: (* denotes a line that has changed)
 String sum = summary.toString().trim();
 String tit = getTitle();
 *if (sum.startsWith(tit)) // don't repeat title in summary
 *  return sum.substring(tit.length()).trim();
 else
   return sum;
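
 The proposed logic can be tried in isolation as a plain static method. The
 class and parameter names are made up for this sketch; in the demo the values
 come from the parser's summary buffer and getTitle().

```java
/** Stand-alone version of the proposed getSummary() logic. */
class SummarySketch {
    /** Returns the summary with a leading copy of the title removed, if present. */
    static String summary(String rawSummary, String title) {
        String sum = rawSummary.trim();
        if (sum.startsWith(title)) {
            // Don't repeat the title in the summary: keep only what follows it.
            return sum.substring(title.length()).trim();
        }
        return sum;
    }
}
```

 Note one behavioural difference from the original code: when the summary is
 empty, or consists of the title only, this version returns an empty string
 rather than falling back to the title.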




[jira] Updated: (LUCENE-525) A standard Lucene install that works for simple web sites

2006-06-15 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-525?page=all ]

Daniel Naber updated LUCENE-525:


Priority: Minor  (was: Major)

decrease priority

 A standard Lucene install that works for simple web sites
 -

  Key: LUCENE-525
  URL: http://issues.apache.org/jira/browse/LUCENE-525
  Project: Lucene - Java
 Type: New Feature

  Environment: web site
 Reporter: Dave Yost
 Priority: Minor


 I'm new to Lucene.  I would like to be able to download a blob, install it, 
 set a few settings, preferably in a GUI, and be on the air with search 
 enabled on my static web site.
 What I find on the Examples page is nothing like this.  It is a collection of 
 stuff that leads me to believe that I'll have to become expert in all sorts 
 of Lucene arcana before I can get to my goal.




[jira] Commented: (LUCENE-562) Allow Unstored AND Unindexed Fields as in 1.4

2006-06-15 Thread Daniel Naber (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-562?page=comments#action_12416415 ]

Daniel Naber commented on LUCENE-562:
-

I think this should be closed as won't fix. You could either write your own 
wrapper class or just use an indexed or stored field that later gets removed. 
The stored/indexed value should only have an effect once the document is added 
to the index.
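
A minimal sketch of the wrapper-class option, assuming the extra information 
is simple name/value strings. The class AnnotatedDocument and its methods are 
hypothetical and not part of Lucene; the type parameter D stands in for 
org.apache.lucene.document.Document so that the sketch compiles without Lucene 
on the classpath.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical wrapper: carries a document plus side-channel values that the
 * index never sees. D is a placeholder for Lucene's Document class.
 */
public class AnnotatedDocument<D> {
    private final D document;
    private final Map<String, String> extras = new HashMap<>();

    public AnnotatedDocument(D document) {
        this.document = document;
    }

    /** The wrapped document; this is what would be handed to the indexer. */
    public D getDocument() {
        return document;
    }

    /** Stores a value for upper layers only; it is neither indexed nor stored. */
    public void putExtra(String name, String value) {
        extras.put(name, value);
    }

    /** Returns the extra value for the given name, or null if none was set. */
    public String getExtra(String name) {
        return extras.get(name);
    }

    /** Read-only view of all extra values. */
    public Map<String, String> getExtras() {
        return Collections.unmodifiableMap(extras);
    }
}
```

Layers on top of Lucene would pass the AnnotatedDocument around and unwrap it 
with getDocument() only at the point where the document is added to the index.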


 Allow Unstored AND Unindexed Fields as in 1.4
 -

  Key: LUCENE-562
  URL: http://issues.apache.org/jira/browse/LUCENE-562
  Project: Lucene - Java
 Type: Bug

 Versions: 1.9
 Reporter: Sam Hough
 Priority: Minor


 In 1.4 it was possible to have a field that was not to be indexed or stored. 
 This was useful in passing information that Lucene should ignore but that 
 layers on top of it should pick up. This saves the need for an extra class to 
 wrap a Lucene Document.
 Sorry it has taken me two years to spot the change:
 http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/src/java/org/apache/lucene/document/Field.java?rev=150206&r1=149967&r2=150206&diff_format=h
 I have to admit that this really isn't a Lucene bug, but the 1.4 behaviour was 
 really handy, like XML processing instructions.




[jira] Commented: (LUCENE-559) Turkish Analyzer for Lucene

2006-06-15 Thread Daniel Naber (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-559?page=comments#action_12416407 ]

Daniel Naber commented on LUCENE-559:
-

Thanks for your contribution. Could you write some unit tests for your classes, 
similar to the existing tests for other languages?


 Turkish Analyzer for Lucene
 ---

  Key: LUCENE-559
  URL: http://issues.apache.org/jira/browse/LUCENE-559
  Project: Lucene - Java
 Type: Improvement

   Components: Analysis
 Reporter: Emre Bayram
  Attachments: TurkishAnalyzer.java, TurkishAnalyzer.java, 
 TurkishStemFilter.java, TurkishStemFilter.java, TurkishStemmer.java, 
 TurkishStemmer.java

 I have developed an Analyzer for Turkish, based on the German and Brazilian 
 language analyzers.
 This Turkish Analyzer supports the iso-8859-9 (Turkish) character set and has 
 a nice set of stop words. I hope it can help Turkish developers who use 
 Lucene. (I searched for many hours for a Turkish analyzer for Lucene but 
 couldn't find one, so I wrote this and am sending it here.)




[jira] Commented: (LUCENE-101) Selecting a language-specific analyzer according to a locale.

2006-06-15 Thread Daniel Naber (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-101?page=comments#action_12416410 ]

Daniel Naber commented on LUCENE-101:
-

The URL from the original report doesn't work anymore. I think it refers to 
this post:

http://mail-archives.apache.org/mod_mbox/lucene-java-dev/200205.mbox/20020522.153124.14421363.kazama%40ingrid.org

I guess this report can be closed?


 Selecting a language-specific analyzer according to a locale.
 -

  Key: LUCENE-101
  URL: http://issues.apache.org/jira/browse/LUCENE-101
  Project: Lucene - Java
 Type: Improvement

   Components: Analysis
 Versions: unspecified
  Environment: Operating System: other
 Platform: Other
 Reporter: Eric Isakson
 Priority: Minor


 Moved from todo.xml:
 Currently we have to rewrite parts of the Lucene code in order to use another 
 analyzer. It would be useful to select an analyzer without touching code.
 This was originally requested by Kazuhiro Kazama ([EMAIL PROTECTED]) in
 http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-
 [EMAIL PROTECTED]msgId=338928
 Not sure if this was completed to Kazuhiro Kazama's satisfaction in the 
 current CVS. We can certainly choose which analyzer to use for a given 
 IndexWriter and QueryParser. It sounded like he was asking for something like 
 a factory that would create an analyzer based on a locale, but unless I 
 misunderstand things, searching an index with an analyzer other than the one 
 you created the index with is bound to cause false hits in your results.
 Perhaps this is fixed or no action should be taken. Can someone with a better 
 understanding of the request comment on this one or close it out?
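
 For discussion, the kind of factory described above might look roughly like 
 the following sketch. LocaleAnalyzerRegistry is a hypothetical class, not an 
 existing Lucene API; the type parameter A stands in for Lucene's Analyzer so 
 the sketch is self-contained, and lookups use only the language part of the 
 locale, which is an assumption about what the request needs.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Supplier;

/**
 * Hypothetical registry that picks an analyzer by locale.
 * A is a placeholder for Lucene's Analyzer class.
 */
public class LocaleAnalyzerRegistry<A> {
    private final Map<String, Supplier<A>> byLanguage = new HashMap<>();
    private final Supplier<A> fallback;

    /** The fallback is used for locales without a registered analyzer. */
    public LocaleAnalyzerRegistry(Supplier<A> fallback) {
        this.fallback = fallback;
    }

    /** Registers an analyzer factory for the language of the given locale. */
    public void register(Locale locale, Supplier<A> factory) {
        byLanguage.put(locale.getLanguage(), factory);
    }

    /** Creates an analyzer for the locale's language, or the fallback. */
    public A forLocale(Locale locale) {
        return byLanguage.getOrDefault(locale.getLanguage(), fallback).get();
    }
}
```

 Such a factory does not remove the concern raised above: the locale used at 
 search time still has to be the one that was used when the index was built.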




[jira] Updated: (LUCENE-259) HTML Parser doesn't decode character references in attributes

2006-06-15 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-259?page=all ]

Daniel Naber updated LUCENE-259:


Bugzilla Id:   (was: 30621)
  Assign To: (was: Lucene Developers)
   Priority: Minor  (was: Major)

Decrease priority because this affects the demo only.


 HTML Parser doesn't decode character references in attributes
 -

  Key: LUCENE-259
  URL: http://issues.apache.org/jira/browse/LUCENE-259
  Project: Lucene - Java
 Type: Bug

   Components: Examples
 Versions: 1.4
  Environment: Operating System: All
 Platform: All
 Reporter: Dave Sparks
 Priority: Minor


 The HTML Parser includes the values of certain attributes in the summary, the
 metaTags and the output stream.  Character references in the attribute values
 are not decoded.  Specifically:
 1. The value of the alt= attribute of an <img ...> tag is included in the
 summary and the output stream.  This value is case-significant, and may
 include character references.  The character references are not decoded.
 2. The value of the content= attribute of a <meta ...> tag is included in the
 metaTags if the tag also has a name= or http-equiv= attribute.  This value is
 case-significant, and may include character references.  The character
 references are not decoded, and the value is downcased (since the fix to bug
 #27423).
 I've patched our version of the parser to decode the character references, by
 adding a decodeAll method to Entities to parse a String for character
 references and return a String where the references have been replaced by the
 corresponding characters (or the original String, if no change is needed).
 This method is called to decode alt= attributes and content= attributes.  I've
 removed the .toLowerCase() on the content= value.  I'm not really happy with
 this fix, as it seems to me to be wrong to parse a value which was previously
 parsed as a single token; there ought to be a way to get it right the first
 time.
 I've left the name= and http-equiv= values alone.  It's not entirely clear (to
 me) whether character references are allowed, and it would be perverse to use
 them here.  I also appreciate the convenience of having a single combined
 namespace, with downcased names, even though this is technically wrong.
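
 The patch itself is not attached here, so the following is only a sketch of 
 what a decodeAll-style method could look like, not the reporter's code. The 
 class name EntityDecoderSketch and the small table of named references are 
 assumptions; the real Entities class in the demo has its own, much larger 
 table.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EntityDecoderSketch {
    // Assumption: a handful of named references, for illustration only.
    private static final Map<String, String> NAMED = new HashMap<>();
    static {
        NAMED.put("amp", "&");
        NAMED.put("lt", "<");
        NAMED.put("gt", ">");
        NAMED.put("quot", "\"");
    }

    // Matches &#123; (decimal), &#x1F; (hex) and &name; references.
    private static final Pattern REF =
        Pattern.compile("&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);");

    /** Returns s with known references decoded, or s itself if nothing changed. */
    public static String decodeAll(String s) {
        if (s.indexOf('&') < 0) {
            return s; // fast path: no reference can be present
        }
        Matcher m = REF.matcher(s);
        StringBuffer out = new StringBuffer();
        boolean changed = false;
        while (m.find()) {
            String decoded = decodeOne(m.group(1));
            if (decoded == null) {
                decoded = m.group(); // unknown reference: keep it verbatim
            } else {
                changed = true;
            }
            m.appendReplacement(out, Matcher.quoteReplacement(decoded));
        }
        m.appendTail(out);
        return changed ? out.toString() : s;
    }

    /** Decodes the part between '&' and ';', or returns null if unknown. */
    private static String decodeOne(String body) {
        if (body.charAt(0) != '#') {
            return NAMED.get(body);
        }
        try {
            boolean hex = body.charAt(1) == 'x' || body.charAt(1) == 'X';
            int cp = hex ? Integer.parseInt(body.substring(2), 16)
                         : Integer.parseInt(body.substring(1));
            return Character.isValidCodePoint(cp)
                ? new String(Character.toChars(cp)) : null;
        } catch (NumberFormatException e) {
            return null; // number too large to be a code point
        }
    }
}
```

 As in the description above, the method would be applied to the alt= and 
 content= values after they have been extracted, and it hands back the 
 original String when there is nothing to decode.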




[jira] Closed: (LUCENE-587) Explanation.toHtml outputs invalid HTML

2006-06-04 Thread Daniel Naber (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-587?page=all ]
 
Daniel Naber closed LUCENE-587:
---

Resolution: Fixed

Sorry, I must have looked at the wrong output. You're right, it seems to be 
okay now.


 Explanation.toHtml outputs invalid HTML
 ---

  Key: LUCENE-587
  URL: http://issues.apache.org/jira/browse/LUCENE-587
  Project: Lucene - Java
 Type: Bug

   Components: Search
 Versions: 2.0.0
 Reporter: Trejkaz
 Assignee: Hoss Man


 If you want an HTML representation of an Explanation, you might call the 
 toHtml() method.  However, the output of this method looks like the following:
 <ul>
   <li>some value = some description</li>
   <ul>
     <li>some nested value = some description</li>
   </ul>
 </ul>
 As it is illegal in HTML to nest a UL directly inside a UL, this method will 
 always output unparseable HTML if there are nested explanations.
 What Lucene probably means to output is the following, which is valid HTML:
 <ul>
   <li>some value = some description
     <ul>
       <li>some nested value = some description</li>
     </ul>
   </li>
 </ul>
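
 To make the intended nesting concrete, here is a small standalone sketch that 
 produces the second form. ExplanationHtmlSketch is a hypothetical stand-in 
 for Explanation and is not Lucene's implementation; it only shows where the 
 nested list has to be emitted relative to the enclosing list item.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for Explanation: a description plus nested details. */
public class ExplanationHtmlSketch {
    private final String text;
    private final List<ExplanationHtmlSketch> details = new ArrayList<>();

    public ExplanationHtmlSketch(String text) {
        this.text = text;
    }

    /** Adds a nested explanation and returns this node for chaining. */
    public ExplanationHtmlSketch addDetail(ExplanationHtmlSketch detail) {
        details.add(detail);
        return this;
    }

    /** Renders this node and its details as a nested list. */
    public String toHtml() {
        StringBuilder sb = new StringBuilder();
        sb.append("<ul>\n");
        appendItem(sb);
        sb.append("</ul>\n");
        return sb.toString();
    }

    // The nested <ul> is written before the parent's </li>, so it is a child
    // of the list item rather than of the outer list. The text is not
    // HTML-escaped in this sketch.
    private void appendItem(StringBuilder sb) {
        sb.append("<li>").append(text);
        if (!details.isEmpty()) {
            sb.append("\n<ul>\n");
            for (ExplanationHtmlSketch detail : details) {
                detail.appendItem(sb);
            }
            sb.append("</ul>\n");
        }
        sb.append("</li>\n");
    }
}
```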



