date:20130109

[jira] [Created] (LUCENE-4668) Fix classpaths in classification module

2013-01-09 Thread Tommaso Teofili (JIRA)

Tommaso Teofili created LUCENE-4668:
---

 Summary: Fix classpaths in classification module
 Key: LUCENE-4668
 URL: https://issues.apache.org/jira/browse/LUCENE-4668
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/classification
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
Priority: Minor
 Fix For: 5.0


Classpaths in lucene/classification/build.xml do not correctly use / extend 
the default base classpaths.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (LUCENE-4668) Fix classpaths in classification module

2013-01-09 Thread Tommaso Teofili (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili resolved LUCENE-4668.
-

Resolution: Fixed

 Fix classpaths in classification module
 ---

 Key: LUCENE-4668
 URL: https://issues.apache.org/jira/browse/LUCENE-4668
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/classification
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
Priority: Minor
 Fix For: 5.0


 Classpaths in lucene/classification/build.xml do not correctly use / extend 
 the default base classpaths.

[jira] [Commented] (LUCENE-4668) Fix classpaths in classification module

2013-01-09 Thread Commit Tag Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547757#comment-13547757
 ] 

Commit Tag Bot commented on LUCENE-4668:


[trunk commit] Tommaso Teofili
http://svn.apache.org/viewvc?view=revision&revision=1430725

[LUCENE-4668] - fixed classification classpaths


 Fix classpaths in classification module
 ---

 Key: LUCENE-4668
 URL: https://issues.apache.org/jira/browse/LUCENE-4668
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/classification
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
Priority: Minor
 Fix For: 5.0


 Classpaths in lucene/classification/build.xml do not correctly use / extend 
 the default base classpaths.

[jira] [Commented] (LUCENE-1227) NGramTokenizer to handle more than 1024 chars

2013-01-09 Thread Harald Wellmann (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547774#comment-13547774
 ] 

Harald Wellmann commented on LUCENE-1227:
-

As long as this issue is not fixed, please mention the 1024 character 
truncation in the Javadoc.

The combination of KeywordTokenizer and NGramTokenFilter does not scale well 
for large inputs, as KeywordTokenizer reads the entire input stream into a 
character buffer.
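
To illustrate the scaling concern, here is a minimal sketch of character 
n-gram generation that keeps only a sliding window of n characters instead 
of reading the whole stream into a buffer. It does not use Lucene, it is not 
how NGramTokenizer is implemented, and the class name StreamingNGrams and 
its ngrams method are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
final class StreamingNGrams {

    // Reads the input one character at a time and keeps only the last n
    // characters in memory, emitting an n-gram whenever the window is full.
    static List<String> ngrams(Reader in, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        List<String> out = new ArrayList<>();
        StringBuilder window = new StringBuilder(n);
        try {
            int c;
            while ((c = in.read()) != -1) {
                if (window.length() == n) {
                    window.deleteCharAt(0); // slide the window by one character
                }
                window.append((char) c);
                if (window.length() == n) {
                    out.add(window.toString());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(ngrams(new StringReader("lucene"), 3));
    }
}
```

The result list is collected here only to keep the example short; a real 
token stream would hand out each n-gram as it is produced, so the memory 
needed for the input stays bounded by n regardless of the input length.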

 NGramTokenizer to handle more than 1024 chars
 -

 Key: LUCENE-1227
 URL: https://issues.apache.org/jira/browse/LUCENE-1227
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Reporter: Hiroaki Kawai
Priority: Minor
 Attachments: LUCENE-1227.patch, NGramTokenizer.patch, 
 NGramTokenizer.patch


 The current NGramTokenizer can't handle a character stream longer than 
 1024 characters. This is too short for non-whitespace-separated languages.
 I created a patch for this issue.

[jira] [Commented] (LUCENE-4666) Simplify CompressingStoredFieldsFormat merging

2013-01-09 Thread Commit Tag Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547821#comment-13547821
 ] 

Commit Tag Bot commented on LUCENE-4666:


[trunk commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1430755

LUCENE-4666: Simplify CompressingStoredFieldsFormat merging.



 Simplify CompressingStoredFieldsFormat merging
 --

 Key: LUCENE-4666
 URL: https://issues.apache.org/jira/browse/LUCENE-4666
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Fix For: 4.1, 5.0

 Attachments: LUCENE-4666.patch


 Merging is currently unnecessarily complex: it tries to compute the size of 
 the compressed block by analyzing the compressed stream although it could use 
 the fields index instead.

[jira] [Resolved] (LUCENE-4666) Simplify CompressingStoredFieldsFormat merging

2013-01-09 Thread Adrien Grand (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-4666.
--

Resolution: Fixed

 Simplify CompressingStoredFieldsFormat merging
 --

 Key: LUCENE-4666
 URL: https://issues.apache.org/jira/browse/LUCENE-4666
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Fix For: 4.1, 5.0

 Attachments: LUCENE-4666.patch


 Merging is currently unnecessarily complex: it tries to compute the size of 
 the compressed block by analyzing the compressed stream although it could use 
 the fields index instead.

[jira] [Commented] (LUCENE-4666) Simplify CompressingStoredFieldsFormat merging

2013-01-09 Thread Commit Tag Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547827#comment-13547827
 ] 

Commit Tag Bot commented on LUCENE-4666:


[branch_4x commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1430757

LUCENE-4666: Simplify CompressingStoredFieldsFormat merging (merged from 
r1430755).



 Simplify CompressingStoredFieldsFormat merging
 --

 Key: LUCENE-4666
 URL: https://issues.apache.org/jira/browse/LUCENE-4666
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Fix For: 4.1, 5.0

 Attachments: LUCENE-4666.patch


 Merging is currently unnecessarily complex: it tries to compute the size of 
 the compressed block by analyzing the compressed stream although it could use 
 the fields index instead.

[jira] [Updated] (SOLR-3669) Create a ScriptSearchComponent

2013-01-09 Thread Erik Hatcher (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher updated SOLR-3669:
---

Fix Version/s: (was: 4.1)
   5.0

 Create a ScriptSearchComponent
 --

 Key: SOLR-3669
 URL: https://issues.apache.org/jira/browse/SOLR-3669
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Reporter: Erik Hatcher
Assignee: Erik Hatcher
 Fix For: 5.0


 Building on the infrastructure created from SOLR-1725, a 
 ScriptSearchComponent would be a valuable addition to Solr flexibility.
 Performance impact will be a very important factor and needs to be measured.

[jira] [Updated] (SOLR-3735) Relocate the example mime-to-extension mapping

2013-01-09 Thread Erik Hatcher (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher updated SOLR-3735:
---

Fix Version/s: (was: 4.1)

Decided not to bother with this for 4.x, just trunk for now.

 Relocate the example mime-to-extension mapping
 --

 Key: SOLR-3735
 URL: https://issues.apache.org/jira/browse/SOLR-3735
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Affects Versions: 4.0-BETA, 4.0
Reporter: Erik Hatcher
Assignee: Erik Hatcher
Priority: Minor
 Fix For: 5.0

 Attachments: SOLR-3735.patch


 A mime-to-extension mapping was added to VelocityResponseWriter recently.  
 This really belongs in the templates themselves, not in VrW, as it is 
 specific to the example search results and not meant for all VrW templates.

[jira] [Resolved] (SOLR-3735) Relocate the example mime-to-extension mapping

2013-01-09 Thread Erik Hatcher (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher resolved SOLR-3735.


Resolution: Fixed

 Relocate the example mime-to-extension mapping
 --

 Key: SOLR-3735
 URL: https://issues.apache.org/jira/browse/SOLR-3735
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Affects Versions: 4.0-BETA, 4.0
Reporter: Erik Hatcher
Assignee: Erik Hatcher
Priority: Minor
 Fix For: 5.0

 Attachments: SOLR-3735.patch


 A mime-to-extension mapping was added to VelocityResponseWriter recently.  
 This really belongs in the templates themselves, not in VrW, as it is 
 specific to the example search results and not meant for all VrW templates.

[jira] [Updated] (SOLR-3551) View of analysis output using all field types at once

2013-01-09 Thread Erik Hatcher (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher updated SOLR-3551:
---

Fix Version/s: (was: 4.1)
   5.0

 View of analysis output using all field types at once
 -

 Key: SOLR-3551
 URL: https://issues.apache.org/jira/browse/SOLR-3551
 Project: Solr
  Issue Type: Improvement
Reporter: Erik Hatcher
Assignee: Erik Hatcher
Priority: Trivial
 Fix For: 5.0

 Attachments: allyzer.html, allyzer.vm, analysis.vm


 To demonstrate all field types analyzing the same text for a presentation, I 
 developed a Velocity view that leverages /analysis/field.  Perhaps we could 
 incorporate this into Solr's example or admin somehow.

[jira] [Updated] (SOLR-3719) Add instant search capability to /browse

2013-01-09 Thread Erik Hatcher (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher updated SOLR-3719:
---

Fix Version/s: (was: 4.1)
   5.0

 Add instant search capability to /browse
 --

 Key: SOLR-3719
 URL: https://issues.apache.org/jira/browse/SOLR-3719
 Project: Solr
  Issue Type: New Feature
Reporter: Erik Hatcher
Assignee: Erik Hatcher
Priority: Minor
 Fix For: 5.0


 Once upon a time I tinkered with this in a personal github fork 
 https://github.com/erikhatcher/lucene-solr/commits/instant_search/

[jira] [Updated] (SOLR-839) XML Query Parser support

2013-01-09 Thread Erik Hatcher (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher updated SOLR-839:
--

Fix Version/s: (was: 4.1)

 XML Query Parser support
 

 Key: SOLR-839
 URL: https://issues.apache.org/jira/browse/SOLR-839
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Affects Versions: 1.3
Reporter: Erik Hatcher
Assignee: Erik Hatcher
 Fix For: 5.0

 Attachments: lucene-xml-query-parser-2.4-dev.jar, SOLR-839.patch


 Lucene contrib includes a query parser that is able to create the 
 full spectrum of Lucene queries, using an XML data structure.
 This patch adds XML query parser support to Solr.

[jira] [Updated] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors

2013-01-09 Thread Adrien Grand (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-4667:
-

Attachment: LUCENE-4667.patch

Patch.

 Change TestRandomChains to replace the list of broken classes by a list of 
 broken constructors
 --

 Key: LUCENE-4667
 URL: https://issues.apache.org/jira/browse/LUCENE-4667
 Project: Lucene - Core
  Issue Type: Task
Reporter: Adrien Grand
Priority: Minor
 Attachments: LUCENE-4667.patch


 Some classes are currently in the list of bad apples although only one 
 constructor is broken. For example, LimitTokenCountFilter has an option to 
 consume the whole stream.

[jira] [Updated] (SOLR-874) Dismax parser exceptions on trailing OPERATOR

2013-01-09 Thread Erik Hatcher (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher updated SOLR-874:
--

Fix Version/s: (was: 4.1)
   5.0
 Assignee: (was: Erik Hatcher)

I started to dig into this for 4.1, but it's hairier than I thought with edge 
cases that need to be accounted for.  Moving this to 5.0 since I won't have 
time to make deal with this for 4.1, sorry.

 Dismax parser exceptions on trailing OPERATOR
 -

 Key: SOLR-874
 URL: https://issues.apache.org/jira/browse/SOLR-874
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 1.3
Reporter: Erik Hatcher
 Fix For: 5.0

 Attachments: SOLR-874-1.3.patch, SOLR-874-1.4.1.patch, SOLR-874.patch


 Dismax is supposed to be immune to parse exceptions, but alas it's not:
 http://localhost:8983/solr/select?defType=dismax&qf=name&q=ipod+AND
 kaboom!
 Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse 'ipod 
 AND': Encountered EOF at line 1, column 8.
 Was expecting one of:
 NOT ...
 + ...
 - ...
 ( ...
 * ...
 QUOTED ...
 TERM ...
 PREFIXTERM ...
 WILDTERM ...
 [ ...
 { ...
 NUMBER ...
 TERM ...
 * ...
 
   at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:175)
   at 
 org.apache.solr.search.DismaxQParser.parse(DisMaxQParserPlugin.java:138)
   at org.apache.solr.search.QParser.getQuery(QParser.java:88)
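
One conceivable workaround is to strip dangling operators from the end of 
the user input before it reaches the Lucene query parser. The sketch below 
only illustrates that idea: it is not the attached patch, the class and 
method names are hypothetical, and the operator set (AND, OR, NOT, +, -) is 
an assumption. It covers the trailing-operator case only, not the other 
edge cases of the dismax syntax.

```java
import java.util.regex.Pattern;

// Hypothetical helper; not taken from Solr or from the SOLR-874 patches.
final class TrailingOperatorSketch {

    // One or more operators at the very end of the input, each preceded by
    // whitespace or by the start of the string.
    private static final Pattern TRAILING_OPERATORS =
        Pattern.compile("(?:(?:^|\\s+)(?:AND|OR|NOT|\\+|-))+\\s*$");

    // Returns the query with any dangling trailing operators removed.
    static String stripTrailingOperators(String query) {
        return TRAILING_OPERATORS.matcher(query).replaceFirst("").trim();
    }

    public static void main(String[] args) {
        System.out.println(stripTrailingOperators("ipod AND"));
    }
}
```

An operator has to be preceded by whitespace to be stripped, so terms that 
merely end in an operator-like suffix (for example "foo-") are left alone.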

[jira] [Commented] (SOLR-2440) Schema Browser more user friendly

2013-01-09 Thread Joan Codina (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548411#comment-13548411
 ] 

Joan Codina commented on SOLR-2440:
---

Yes, you are right, the query part was just a way to find some relationships 
between words/facets... at the preliminary stages of indexing and checking 
the data.


 

 Schema Browser more user friendly
 -

 Key: SOLR-2440
 URL: https://issues.apache.org/jira/browse/SOLR-2440
 Project: Solr
  Issue Type: New Feature
  Components: web gui
Affects Versions: 1.4.1
 Environment: The schema browser of the admin web application
Reporter: Joan Codina
Priority: Minor
  Labels: browser, schema
 Fix For: 4.2, 5.0

 Attachments: LUCENE_4_schema_jsp.patch, LUCENE_4_screen_css.patch, 
 schema_jsp.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 The schema browser has some drawbacks:
 * Does not sort the fields (the actual sorting seems arbitrary)
 * Capitalises all field names, which makes matching difficult
 * Does not allow a drill-down
 This small patch solves the three issues: 
 #  Changes the CSS so that the links are not capitalised
 #  Sorts the field names
 #  Replaces the tokens with links to a search query for that token
 That's all.

[jira] [Updated] (SOLR-2470) velocity response writer needs test

2013-01-09 Thread Erik Hatcher (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher updated SOLR-2470:
---

Issue Type: Task  (was: Test)

 velocity response writer needs test
 ---

 Key: SOLR-2470
 URL: https://issues.apache.org/jira/browse/SOLR-2470
 Project: Solr
  Issue Type: Task
Reporter: Yonik Seeley
Assignee: Erik Hatcher

 /browse was broken w/o anyone realizing... we should have a basic test for it

[jira] [Comment Edited] (SOLR-874) Dismax parser exceptions on trailing OPERATOR

2013-01-09 Thread Erik Hatcher (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548410#comment-13548410
 ] 

Erik Hatcher edited comment on SOLR-874 at 1/9/13 12:07 PM:


I started to dig into this for 4.1, but it's hairier than I thought with edge 
cases that need to be accounted for.  Moving this to 5.0 since I won't have 
time to deal with this for 4.1, sorry.

  was (Author: ehatcher):
I started to dig into this for 4.1, but it's hairier than I thought with 
edge cases that need to be accounted for.  Moving this to 5.0 since I won't 
have time to make deal with this for 4.1, sorry.
  
 Dismax parser exceptions on trailing OPERATOR
 -

 Key: SOLR-874
 URL: https://issues.apache.org/jira/browse/SOLR-874
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 1.3
Reporter: Erik Hatcher
 Fix For: 5.0

 Attachments: SOLR-874-1.3.patch, SOLR-874-1.4.1.patch, SOLR-874.patch


 Dismax is supposed to be immune to parse exceptions, but alas it's not:
 http://localhost:8983/solr/select?defType=dismax&qf=name&q=ipod+AND
 kaboom!
 Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse 'ipod 
 AND': Encountered EOF at line 1, column 8.
 Was expecting one of:
 NOT ...
 + ...
 - ...
 ( ...
 * ...
 QUOTED ...
 TERM ...
 PREFIXTERM ...
 WILDTERM ...
 [ ...
 { ...
 NUMBER ...
 TERM ...
 * ...
 
   at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:175)
   at 
 org.apache.solr.search.DismaxQParser.parse(DisMaxQParserPlugin.java:138)
   at org.apache.solr.search.QParser.getQuery(QParser.java:88)

[jira] [Commented] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors

2013-01-09 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548457#comment-13548457
 ] 

Uwe Schindler commented on LUCENE-4667:
---

Looks fine. I would prefer to use IdentityHashMap instead of HashMap, so it is 
consistent with the remaining logic. Classes and Constructors should be 
compared by identity. I would also skip constructors that are mapped to the 
ALWAYS predicate, so they are never added to the array lists in the first place.

 Change TestRandomChains to replace the list of broken classes by a list of 
 broken constructors
 --

 Key: LUCENE-4667
 URL: https://issues.apache.org/jira/browse/LUCENE-4667
 Project: Lucene - Core
  Issue Type: Task
Reporter: Adrien Grand
Priority: Minor
 Attachments: LUCENE-4667.patch


 Some classes are currently in the list of bad apples although only one 
 constructor is broken. For example, LimitTokenCountFilter has an option to 
 consume the whole stream.

[jira] [Updated] (LUCENE-4624) Compare Lucene memory estimator with terracota's

2013-01-09 Thread Dawid Weiss (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Dawid Weiss updated LUCENE-4624:

Description:
Alex Snaps informed me that there's a sizeof estimator in terracota --

http://svn.terracotta.org/svn/ehcache/trunk/ehcache/ehcache-core/src/main/java/net/sf/ehcache/pool/sizeof/

looks interesting, they have some VM-specific methods. Didn't look too deeply
though; if somebody has the time to check out the differences and maybe compare
the estimation differences it'd be nice.

There is also another tool by Aleksey Shipilev. It looks very good to me
(Aleksey has deep knowledge of JVM internals).

was:
Alex Snaps informed me that there's a sizeof estimator in terracota --

http://svn.terracotta.org/svn/ehcache/trunk/ehcache/ehcache-core/src/main/java/net/sf/ehcache/pool/sizeof/

looks interesting, they have some VM-specific methods. Didn't look too deeply
though; if somebody has the time to check out the differences and maybe compare
the estimation differences it'd be nice.

Compare Lucene memory estimator with terracota's

Key: LUCENE-4624
URL: https://issues.apache.org/jira/browse/LUCENE-4624
Project: Lucene - Core
Issue Type: Task
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor

Alex Snaps informed me that there's a sizeof estimator in terracota --
http://svn.terracotta.org/svn/ehcache/trunk/ehcache/ehcache-core/src/main/java/net/sf/ehcache/pool/sizeof/
looks interesting, they have some VM-specific methods. Didn't look too deeply
though; if somebody has the time to check out the differences and maybe
compare the estimation differences it'd be nice.
There is also another tool by Aleksey Shipilev. It looks very good to me
(Aleksey has deep knowledge of JVM internals).

[jira] [Updated] (LUCENE-4624) Compare Lucene memory estimator with terracota's

2013-01-09 Thread Dawid Weiss (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Dawid Weiss updated LUCENE-4624:

Description:
Alex Snaps informed me that there's a sizeof estimator in terracota --

http://svn.terracotta.org/svn/ehcache/trunk/ehcache/ehcache-core/src/main/java/net/sf/ehcache/pool/sizeof/

looks interesting, they have some VM-specific methods. Didn't look too deeply
though; if somebody has the time to check out the differences and maybe compare
the estimation differences it'd be nice.

There is also another tool by Aleksey Shipilev. It looks very good to me
(Aleksey has deep knowledge of JVM internals).
https://github.com/shipilev/java-object-layout/

was:
Alex Snaps informed me that there's a sizeof estimator in terracota --

http://svn.terracotta.org/svn/ehcache/trunk/ehcache/ehcache-core/src/main/java/net/sf/ehcache/pool/sizeof/

looks interesting, they have some VM-specific methods. Didn't look too deeply
though; if somebody has the time to check out the differences and maybe compare
the estimation differences it'd be nice.

There is also another tool by Aleksey Shipilev. It looks very good to me
(Aleksey has deep knowledge of JVM internals).

Compare Lucene memory estimator with terracota's

Key: LUCENE-4624
URL: https://issues.apache.org/jira/browse/LUCENE-4624
Project: Lucene - Core
Issue Type: Task
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor

[jira] [Commented] (LUCENE-3178) Native MMapDir

2013-01-09 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548466#comment-13548466
 ] 

Michael McCandless commented on LUCENE-3178:


I haven't looked closely at the patch, but I ran an initial perf test:
{noformat}
                Task  QPS base   StdDev  QPS comp   StdDev  Pct diff
          AndHighLow   1024.41   (3.1%)    856.52   (2.0%)  -16.4%  (-20% - -11%)
           LowPhrase     69.04   (1.7%)     58.90   (0.9%)  -14.7%  (-16% - -12%)
          AndHighMed    193.16   (1.0%)    169.24   (1.4%)  -12.4%  (-14% - -10%)
             Respell     55.65   (3.0%)     50.01   (3.3%)  -10.1%  (-15% -  -3%)
              Fuzzy2     67.18   (3.3%)     60.52   (3.6%)   -9.9%  (-16% -  -3%)
              Fuzzy1     68.83   (3.4%)     62.65   (3.4%)   -9.0%  (-15% -  -2%)
     LowSloppyPhrase     85.35   (1.8%)     78.64   (1.6%)   -7.9%  (-11% -  -4%)
         LowSpanNear     38.05   (2.9%)     35.14   (3.1%)   -7.6%  (-13% -  -1%)
            Wildcard     99.78   (3.0%)     93.39   (2.9%)   -6.4%  (-12% -   0%)
         MedSpanNear     77.91   (2.2%)     74.26   (2.3%)   -4.7%  ( -9% -   0%)
        HighSpanNear      9.24   (2.7%)      8.86   (2.5%)   -4.1%  ( -9% -   1%)
    HighSloppyPhrase      2.25   (4.0%)      2.16   (3.8%)   -4.0%  (-11% -   3%)
     MedSloppyPhrase     78.44   (2.2%)     75.35   (2.4%)   -3.9%  ( -8% -   0%)
          HighPhrase     30.39   (8.1%)     29.27   (7.9%)   -3.7%  (-18% -  13%)
             LowTerm    808.93   (5.0%)    779.29   (5.4%)   -3.7%  (-13% -   7%)
           MedPhrase    176.20   (5.9%)    169.98   (5.5%)   -3.5%  (-14% -   8%)
             Prefix3     51.16   (6.0%)     49.53   (4.9%)   -3.2%  (-13% -   8%)
         AndHighHigh     69.32   (2.3%)     67.21   (2.4%)   -3.0%  ( -7% -   1%)
              IntNRQ     10.99  (10.0%)     10.86   (9.0%)   -1.2%  (-18% -  19%)
             MedTerm    329.36  (10.0%)    325.83  (11.9%)   -1.1%  (-20% -  23%)
           OrHighMed     67.18   (2.2%)     66.64   (4.5%)   -0.8%  ( -7% -   6%)
          OrHighHigh     42.91   (2.5%)     42.59   (4.8%)   -0.7%  ( -7% -   6%)
           OrHighLow     62.96   (2.3%)     62.58   (4.9%)   -0.6%  ( -7% -   6%)
            HighTerm    120.76  (11.6%)    121.21  (14.9%)    0.4%  (-23% -  30%)
{noformat}

This is a hot test, with 10M no-stopwords English Wikipedia.  Baseline is 
normal MMapDir and comp is NativePosixMMapDirectory.  Not sure why some queries 
are slower ...
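
For readers of the table, the "Pct diff" column appears to be the relative 
change of the comp QPS against the baseline QPS. The sketch below recomputes 
it for three rows under that assumption; the formula is inferred from the 
numbers rather than taken from the benchmark tool, the class name is 
hypothetical, and the range in parentheses is not reproduced here.

```java
import java.util.Locale;

// Hypothetical helper for reading the benchmark table; the formula is an
// assumption inferred from the reported numbers.
final class PctDiffSketch {

    // Relative change of comp versus base, in percent.
    static double pctDiff(double baseQps, double compQps) {
        return (compQps - baseQps) / baseQps * 100.0;
    }

    public static void main(String[] args) {
        System.out.printf(Locale.ROOT, "AndHighLow: %.1f%%%n", pctDiff(1024.41, 856.52));
        System.out.printf(Locale.ROOT, "LowPhrase:  %.1f%%%n", pctDiff(69.04, 58.90));
        System.out.printf(Locale.ROOT, "HighTerm:   %.1f%%%n", pctDiff(120.76, 121.21));
    }
}
```

A negative value means NativePosixMMapDirectory (comp) served fewer queries 
per second than the regular MMapDir baseline for that task.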

 Native MMapDir
 --

 Key: LUCENE-3178
 URL: https://issues.apache.org/jira/browse/LUCENE-3178
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/store
Reporter: Michael McCandless
  Labels: gsoc2012, lucene-gsoc-12
 Attachments: LUCENE-3178-Native-MMap-implementation.patch, 
 LUCENE-3178-Native-MMap-implementation.patch, 
 LUCENE-3178-Native-MMap-implementation.patch


 Spinoff from LUCENE-2793.
 Just like we will create native Dir impl (UnixDirectory) to pass the right OS 
 level IO flags depending on the IOContext, we could in theory do something 
 similar with MMapDir.
 The problem is MMap is apparently quite hairy... and to pass the flags the 
 native code would need to invoke mmap (I think?), unlike UnixDir where the 
 code only has to open the file handle.

[jira] [Commented] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors

2013-01-09 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548505#comment-13548505
 ] 

Adrien Grand commented on LUCENE-4667:
--

The test failed when I used an IdentityHashMap. Did I miss something or can't 
constructors be compared using ==?
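
A likely explanation, sketched below: on OpenJDK-based JVMs each reflective 
lookup hands back a fresh copy of the Constructor object, so two lookups of 
the same constructor are equal according to equals() but are not the same 
instance. The copying is an implementation detail and not something the 
Javadoc promises, and this is not the code from the patch. The class name 
and the choice of StringBuilder(String) are hypothetical, picked only for 
the demonstration.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical demonstration; unrelated to the TestRandomChains sources.
final class ConstructorIdentitySketch {

    // Looks up the same public constructor on every call.
    static Constructor<StringBuilder> lookup() {
        try {
            return StringBuilder.class.getConstructor(String.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Constructor<StringBuilder> first = lookup();
        Constructor<StringBuilder> second = lookup();
        System.out.println("same instance: " + (first == second));
        System.out.println("equals:        " + first.equals(second));

        Map<Constructor<?>, String> byIdentity = new IdentityHashMap<>();
        byIdentity.put(first, "broken");
        Map<Constructor<?>, String> byEquals = new HashMap<>();
        byEquals.put(first, "broken");

        // The identity-based map misses a key obtained by a second lookup,
        // while the equals-based map finds it.
        System.out.println("IdentityHashMap lookup: " + byIdentity.get(second));
        System.out.println("HashMap lookup:         " + byEquals.get(second));
    }
}
```

Class objects, by contrast, are unique per class loader, which is why 
identity comparison works for them.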

 Change TestRandomChains to replace the list of broken classes by a list of 
 broken constructors
 --

 Key: LUCENE-4667
 URL: https://issues.apache.org/jira/browse/LUCENE-4667
 Project: Lucene - Core
  Issue Type: Task
Reporter: Adrien Grand
Priority: Minor
 Attachments: LUCENE-4667.patch


 Some classes are currently in the list of bad apples although only one 
 constructor is broken. For example, LimitTokenCountFilter has an option to 
 consume the whole stream.

[jira] [Updated] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors

2013-01-09 Thread Adrien Grand (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-4667:
-

Attachment: LUCENE-4667.patch

New patch that adds exceptions to TrimFilter and TypeTokenFilter as well and 
uses a constructor map for all components, following Uwe's advice.

 Change TestRandomChains to replace the list of broken classes by a list of 
 broken constructors
 --

 Key: LUCENE-4667
 URL: https://issues.apache.org/jira/browse/LUCENE-4667
 Project: Lucene - Core
  Issue Type: Task
Reporter: Adrien Grand
Priority: Minor
 Attachments: LUCENE-4667.patch, LUCENE-4667.patch


 Some classes are currently in the list of bad apples although only one 
 constructor is broken. For example, LimitTokenCountFilter has an option to 
 consume the whole stream.

[jira] [Commented] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors

2013-01-09 Thread Uwe Schindler (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548518#comment-13548518
]

Uwe Schindler commented on LUCENE-4667:
---

Maybe that's the case! Sorry. I was expecting constructors to be singletons
like classes. HashMap is fine then.

I think the whole Predicate approach may be too detailed. I would just match
on the constructor itself and disallow it completely (without looking at the
actual parameters). Just exclude the constructor in the beforeClass() method
when populating the lists.

If you want to keep the predicate approach, I would exclude all broken
constructors with the ALWAYS predicate in beforeClass(), so the test never
tries to use such a constructor at all (because it's no longer in the list).

Change TestRandomChains to replace the list of broken classes by a list of
broken constructors
--

Key: LUCENE-4667
URL: https://issues.apache.org/jira/browse/LUCENE-4667
Project: Lucene - Core
Issue Type: Task
Reporter: Adrien Grand
Priority: Minor
Attachments: LUCENE-4667.patch, LUCENE-4667.patch

Some classes are currently in the list of bad apples although only one
constructor is broken. For example, LimitTokenCountFilter has an option to
consume the whole stream.

[jira] [Commented] (SOLR-3705) hl.alternateField does not support glob

2013-01-09 Thread Jan Høydahl (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548520#comment-13548520
 ] 

Jan Høydahl commented on SOLR-3705:
---

Hi, do you have a patch for this (see today's discussion on solr-user)?

Supporting a comma-separated list of alternateField values would touch roughly
the same lines of code as supporting globs, so maybe we can bake both into the
same patch?
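
To illustrate why the two features overlap, here is a rough sketch of resolving a comma-separated parameter value whose entries may contain * wildcards against a list of known field names. It is not the Solr implementation and not the attached patch: the class name, the method names and the sample field names are all hypothetical, and only * is treated as a wildcard.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: names and behaviour are assumptions.
final class AlternateFieldGlobSketch {

    /** Turns a glob such as "title_*" into a regex; only '*' acts as a wildcard. */
    static Pattern globToPattern(String glob) {
        String[] literals = glob.split("\\*", -1); // -1 keeps trailing empty parts
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append(".*"); // each '*' between literals matches anything
            }
            regex.append(Pattern.quote(literals[i]));
        }
        return Pattern.compile(regex.toString());
    }

    /** Expands a comma-separated list of names or globs, keeping first-match order. */
    static List<String> resolve(String paramValue, List<String> knownFields) {
        Set<String> matched = new LinkedHashSet<>();
        for (String entry : paramValue.split(",")) {
            String glob = entry.trim();
            if (glob.isEmpty()) {
                continue;
            }
            Pattern pattern = globToPattern(glob);
            for (String field : knownFields) {
                if (pattern.matcher(field).matches()) {
                    matched.add(field);
                }
            }
        }
        return new ArrayList<>(matched);
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("id", "title_en", "title_de", "body");
        System.out.println(resolve("title_*, body", fields));
    }
}
```

A plain field name is just a glob without wildcards, so the list handling and the glob handling share one loop.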

 hl.alternateField does not support glob
 ---

 Key: SOLR-3705
 URL: https://issues.apache.org/jira/browse/SOLR-3705
 Project: Solr
  Issue Type: Improvement
  Components: highlighter
Affects Versions: 4.0-ALPHA
Reporter: Markus Jelsma
Priority: Minor
 Fix For: 5.0


 Unlike hl.fl, the hl.alternateField does not support * to match field globs.

[jira] [Assigned] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors

2013-01-09 Thread Adrien Grand (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand reassigned LUCENE-4667:


Assignee: Adrien Grand

 Change TestRandomChains to replace the list of broken classes by a list of 
 broken constructors
 --

 Key: LUCENE-4667
 URL: https://issues.apache.org/jira/browse/LUCENE-4667
 Project: Lucene - Core
  Issue Type: Task
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Attachments: LUCENE-4667.patch, LUCENE-4667.patch


 Some classes are currently in the list of bad apples although only one 
 constructor is broken. For example, LimitTokenCountFilter has an option to 
 consume the whole stream.

[jira] [Updated] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors

2013-01-09 Thread Adrien Grand (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Adrien Grand updated LUCENE-4667:
-

Attachment: LUCENE-4667.patch

bq. Maybe that's the case! Sorry. I was expecting that constructors are
singletons like classes.

No problem, I had the same expectation and was a little disappointed to see
that it didn't work!

bq. I think maybe the whole Predicate approach is too much detailed?

I think it's worth excluding with a predicate: for example, this allows testing
random chains with LimitTokenCountFilter(consumeAllTokens=true) (this filter
is only broken when consumeAllTokens=false).

bq. I would exclude all broken constructors with the ALWAYS predicate in
beforeClass()

Sounds good, I updated the patch.
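
The idea discussed in this thread, a map from constructor to a predicate over the constructor arguments, can be sketched roughly as follows. This is not the code from the attached patch: ToyLimitFilter is a hypothetical stand-in for a filter with a consumeAllTokens flag, and the names BROKEN, ALWAYS, ctor and isBroken are chosen here for illustration only.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Illustrative sketch only; none of these names come from TestRandomChains.
final class BrokenConstructorsSketch {

    /** Hypothetical filter whose second constructor takes a consumeAllTokens flag. */
    static final class ToyLimitFilter {
        final int maxTokens;
        final boolean consumeAllTokens;

        public ToyLimitFilter(int maxTokens) {
            this(maxTokens, false);
        }

        public ToyLimitFilter(int maxTokens, boolean consumeAllTokens) {
            this.maxTokens = maxTokens;
            this.consumeAllTokens = consumeAllTokens;
        }
    }

    /** Marks a constructor as broken regardless of its arguments. */
    static final Predicate<Object[]> ALWAYS = args -> true;

    /** Keyed by equals(), so a plain HashMap is used rather than an identity map. */
    static final Map<Constructor<?>, Predicate<Object[]>> BROKEN = new HashMap<>();
    static {
        BROKEN.put(ctor(int.class), ALWAYS);
        // Only broken when consumeAllTokens (second argument) is false.
        BROKEN.put(ctor(int.class, boolean.class), args -> !((Boolean) args[1]));
    }

    /** Looks up a public ToyLimitFilter constructor, wrapping the checked exception. */
    static Constructor<?> ctor(Class<?>... parameterTypes) {
        try {
            return ToyLimitFilter.class.getConstructor(parameterTypes);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True if this constructor must not be used with the given arguments. */
    static boolean isBroken(Constructor<?> constructor, Object... args) {
        Predicate<Object[]> predicate = BROKEN.get(constructor);
        return predicate != null && predicate.test(args);
    }

    public static void main(String[] args) {
        Constructor<?> withFlag = ctor(int.class, boolean.class);
        System.out.println("flag=true broken?  " + isBroken(withFlag, 5, true));
        System.out.println("flag=false broken? " + isBroken(withFlag, 5, false));
        System.out.println("one-arg broken?    " + isBroken(ctor(int.class), 5));
    }
}
```

Constructors mapped to ALWAYS could equally be dropped from the candidate list up front, as suggested for beforeClass(), leaving the predicate check only for argument-dependent cases.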

Change TestRandomChains to replace the list of broken classes by a list of
broken constructors
--

Key: LUCENE-4667
URL: https://issues.apache.org/jira/browse/LUCENE-4667
Project: Lucene - Core
Issue Type: Task
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
Attachments: LUCENE-4667.patch, LUCENE-4667.patch, LUCENE-4667.patch

Some classes are currently in the list of bad apples although only one
constructor is broken. For example, LimitTokenCountFilter has an option to
consume the whole stream.

[jira] [Created] (SOLR-4288) FileDataSource with an empty basePath and a relative resource is broken.

2013-01-09 Thread Dawid Weiss (JIRA)

Dawid Weiss created SOLR-4288:
-

 Summary: FileDataSource with an empty basePath and a relative 
resource is broken.
 Key: SOLR-4288
 URL: https://issues.apache.org/jira/browse/SOLR-4288
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Dawid Weiss
Priority: Minor
 Fix For: 4.1, 5.0


In fact, the logic is broken:
{code}
  if (!file.isAbsolute())
    file = new File(basePath + query);
{code}
When basePath is null, the string 'null' is concatenated with the query string
(path), resulting in an invalid path.

basePath should be checked for null and, if it is null, perhaps default to "."?
Then resolve the relative location as:

{code}
new File(basePathFile, query);
{code}

I'd also change the log so that the absolute path is included in the warning
message; otherwise it's really hard to figure out what's going on.
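
The proposal above can be sketched as a small standalone program. This is not the FileDataSource code: the class name, the resolve helper and the sample file name are hypothetical, and defaulting to "." is the assumption raised as a question in the report.

```java
import java.io.File;

// Standalone illustration; BasePathResolutionSketch and resolve() are hypothetical names.
final class BasePathResolutionSketch {

    /** Resolves query against basePath, treating a null basePath as ".". */
    static File resolve(String basePath, String query) {
        File file = new File(query);
        if (file.isAbsolute()) {
            return file; // absolute locations are used as given
        }
        File basePathFile = new File(basePath == null ? "." : basePath);
        // The two-argument constructor inserts the separator itself.
        return new File(basePathFile, query);
    }

    public static void main(String[] args) {
        String basePath = null;
        String query = "data.xml";

        // String concatenation turns the null reference into the text "null".
        System.out.println("concatenated: " + new File(basePath + query).getPath());

        File resolved = resolve(basePath, query);
        System.out.println("resolved:     " + resolved.getPath());
        // Logging the absolute form makes a wrong working directory easy to spot.
        System.out.println("absolute:     " + resolved.getAbsolutePath());
    }
}
```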




[jira] [Updated] (SOLR-3705) hl.alternateField does not support glob

2013-01-09 Thread Markus Jelsma (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3705:


Attachment: SOLR-3705-trunk-1.patch

Patch adding glob support to the hl.alternateField parameter.

This patch also contains the fix for: SOLR-4089.

 hl.alternateField does not support glob
 ---

 Key: SOLR-3705
 URL: https://issues.apache.org/jira/browse/SOLR-3705
 Project: Solr
  Issue Type: Improvement
  Components: highlighter
Affects Versions: 4.0-ALPHA
Reporter: Markus Jelsma
Priority: Minor
 Fix For: 5.0

 Attachments: SOLR-3705-trunk-1.patch


 Unlike hl.fl, the hl.alternateField does not support * to match field globs.

[jira] [Commented] (SOLR-3705) hl.alternateField does not support glob

2013-01-09 Thread Jan Høydahl (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548540#comment-13548540
 ] 

Jan Høydahl commented on SOLR-3705:
---

Great, this is something to continue working on.

 hl.alternateField does not support glob
 ---

 Key: SOLR-3705
 URL: https://issues.apache.org/jira/browse/SOLR-3705
 Project: Solr
  Issue Type: Improvement
  Components: highlighter
Affects Versions: 4.0-ALPHA
Reporter: Markus Jelsma
Priority: Minor
 Fix For: 5.0

 Attachments: SOLR-3705-trunk-1.patch


 Unlike hl.fl, the hl.alternateField does not support * to match field globs.

[jira] [Commented] (SOLR-3705) hl.alternateField does not support glob

2013-01-09 Thread Markus Jelsma (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548546#comment-13548546
 ] 

Markus Jelsma commented on SOLR-3705:
-

Thanks Jan!

 hl.alternateField does not support glob
 ---

 Key: SOLR-3705
 URL: https://issues.apache.org/jira/browse/SOLR-3705
 Project: Solr
  Issue Type: Improvement
  Components: highlighter
Affects Versions: 4.0-ALPHA
Reporter: Markus Jelsma
Priority: Minor
 Fix For: 5.0

 Attachments: SOLR-3705-trunk-1.patch


 Unlike hl.fl, the hl.alternateField does not support * to match field globs.

[jira] [Created] (LUCENE-4669) Document wrongly deleted from index

2013-01-09 Thread Miguel Ferreira (JIRA)

Miguel Ferreira created LUCENE-4669:
---

 Summary: Document wrongly deleted from index
 Key: LUCENE-4669
 URL: https://issues.apache.org/jira/browse/LUCENE-4669
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.0
 Environment: OS = Mac OS X 10.7.5
Java = JVM 1.6
Reporter: Miguel Ferreira


I'm trying to implement document deletion from an index.
If I create an index with three documents (A, B and C) and then try to delete 
A, A gets marked as deleted but C is removed from the index. I've tried this 
with different numbers of documents and saw that it is always the last document 
that is removed.

Example unit test:
{code:title=ExampleUnitTest.java}
@Test
public void delete() throws Exception {
    File indexDir = FileUtils.createTempDir();

    IndexWriter writer = new IndexWriter(new NIOFSDirectory(indexDir),
            new IndexWriterConfig(Version.LUCENE_40, new StandardAnalyzer(Version.LUCENE_40)));
    Document doc = new Document();
    String fieldName = "path";
    doc.add(new StringField(fieldName, "a", Store.YES));
    writer.addDocument(doc);
    doc = new Document();
    doc.add(new StringField(fieldName, "b", Store.YES));
    writer.addDocument(doc);
    doc = new Document();
    doc.add(new StringField(fieldName, "c", Store.YES));
    writer.addDocument(doc);
    writer.commit();

    System.out.println("Before delete");
    print(indexDir);

    writer.deleteDocuments(new Term(fieldName, "a"));
    writer.commit();

    System.out.println("After delete");
    print(indexDir);
}

public static void print(File indexDirectory) throws IOException {
    DirectoryReader reader = DirectoryReader.open(new NIOFSDirectory(indexDirectory));
    Bits liveDocs = MultiFields.getLiveDocs(reader);
    int numDocs = reader.numDocs();
    System.out.println("Found " + numDocs + " documents");
    for (int i = 0; i < numDocs; i++) {
        Document document = reader.document(i);
        StringBuffer sb = new StringBuffer();
        sb.append("Document at = ").append(i);
        sb.append("; isDeleted = ").append(liveDocs != null ? !liveDocs.get(i) : false).append("; ");
        for (IndexableField field : document.getFields()) {
            String fieldName = field.name();
            for (String value : document.getValues(fieldName)) {
                sb.append(fieldName).append(" = ").append(value).append("; ");
            }
        }
        System.out.println(sb.toString());
    }
}
{code}

[jira] [Updated] (LUCENE-4669) Document wrongly deleted from index

2013-01-09 Thread Miguel Ferreira (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miguel Ferreira updated LUCENE-4669:


Description: 
I'm trying to implement document deletion from an index.
If I create an index with three documents (A, B and C) and then try to delete 
A, A gets marked as deleted but C is removed from the index. I've tried this 
with different numbers of documents and saw that it is always the last document 
that is removed.

When I run the example unit test code below I get this output:
{code}
Before delete
Found 3 documents
Document at = 0; isDeleted = false; path = a; 
Document at = 1; isDeleted = false; path = b; 
Document at = 2; isDeleted = false; path = c; 
After delete
Found 2 documents
Document at = 0; isDeleted = true; path = a; 
Document at = 1; isDeleted = false; path = b; 
{code}

Example unit test:
{code:title=ExampleUnitTest.java}
@Test
public void delete() throws Exception {
    File indexDir = FileUtils.createTempDir();

    IndexWriter writer = new IndexWriter(new NIOFSDirectory(indexDir),
            new IndexWriterConfig(Version.LUCENE_40, new StandardAnalyzer(Version.LUCENE_40)));
    Document doc = new Document();
    String fieldName = "path";
    doc.add(new StringField(fieldName, "a", Store.YES));
    writer.addDocument(doc);
    doc = new Document();
    doc.add(new StringField(fieldName, "b", Store.YES));
    writer.addDocument(doc);
    doc = new Document();
    doc.add(new StringField(fieldName, "c", Store.YES));
    writer.addDocument(doc);
    writer.commit();

    System.out.println("Before delete");
    print(indexDir);

    writer.deleteDocuments(new Term(fieldName, "a"));
    writer.commit();

    System.out.println("After delete");
    print(indexDir);
}

public static void print(File indexDirectory) throws IOException {
    DirectoryReader reader = DirectoryReader.open(new NIOFSDirectory(indexDirectory));
    Bits liveDocs = MultiFields.getLiveDocs(reader);
    int numDocs = reader.numDocs();
    System.out.println("Found " + numDocs + " documents");
    for (int i = 0; i < numDocs; i++) {
        Document document = reader.document(i);
        StringBuffer sb = new StringBuffer();
        sb.append("Document at = ").append(i);
        sb.append("; isDeleted = ").append(liveDocs != null ? !liveDocs.get(i) : false).append("; ");
        for (IndexableField field : document.getFields()) {
            String fieldName = field.name();
            for (String value : document.getValues(fieldName)) {
                sb.append(fieldName).append(" = ").append(value).append("; ");
            }
        }
        System.out.println(sb.toString());
    }
}
{code}
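
One possible reading of the output above, offered as an assumption rather than a confirmed diagnosis: numDocs() counts only live documents, while document ids keep running up to maxDoc() until deleted slots are merged away, so a loop bounded by numDocs() would stop before reaching the last slot. The sketch below models this without Lucene, using a BitSet as a stand-in for liveDocs and a String array as a stand-in for stored fields. All names in it are hypothetical.

```java
import java.util.BitSet;

// Lucene-free model of the loop in print(); every name here is illustrative.
final class LiveDocsLoopSketch {

    /** Lists slots [0, bound) with their live/deleted state and stored value. */
    static String listing(String[] stored, BitSet liveDocs, int bound) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bound; i++) {
            sb.append(i)
              .append(liveDocs.get(i) ? ":live:" : ":deleted:")
              .append(stored[i])
              .append(' ');
        }
        return sb.toString().trim();
    }

    /** Models three documents with "a" marked deleted, then lists them up to the chosen bound. */
    static String demo(boolean useMaxDoc) {
        String[] stored = {"a", "b", "c"};
        BitSet liveDocs = new BitSet(stored.length);
        liveDocs.set(0, stored.length); // all three documents start out live
        liveDocs.clear(0);              // "a" is marked deleted but keeps its slot

        int maxDoc = stored.length;            // one past the highest document id
        int numDocs = liveDocs.cardinality();  // live documents only
        return listing(stored, liveDocs, useMaxDoc ? maxDoc : numDocs);
    }

    public static void main(String[] args) {
        System.out.println("bounded by numDocs: " + demo(false));
        System.out.println("bounded by maxDoc:  " + demo(true));
    }
}
```

Under this reading, document C would still be in the index; it is simply never visited by the loop.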
