[jira] [Created] (LUCENE-4668) Fix classpaths in classification module
Tommaso Teofili created LUCENE-4668: --- Summary: Fix classpaths in classification module Key: LUCENE-4668 URL: https://issues.apache.org/jira/browse/LUCENE-4668 Project: Lucene - Core Issue Type: Improvement Components: modules/classification Reporter: Tommaso Teofili Assignee: Tommaso Teofili Priority: Minor Fix For: 5.0 Classpaths in lucene/classification/build.xml are not using / extending correctly the default base classpaths. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-4668) Fix classpaths in classification module
[ https://issues.apache.org/jira/browse/LUCENE-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tommaso Teofili resolved LUCENE-4668. - Resolution: Fixed Fix classpaths in classification module --- Key: LUCENE-4668 URL: https://issues.apache.org/jira/browse/LUCENE-4668 Project: Lucene - Core Issue Type: Improvement Components: modules/classification Reporter: Tommaso Teofili Assignee: Tommaso Teofili Priority: Minor Fix For: 5.0 Classpaths in lucene/classification/build.xml are not using / extending correctly the default base classpaths.
[jira] [Commented] (LUCENE-4668) Fix classpaths in classification module
[ https://issues.apache.org/jira/browse/LUCENE-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547757#comment-13547757 ] Commit Tag Bot commented on LUCENE-4668: [trunk commit] Tommaso Teofili http://svn.apache.org/viewvc?view=revision&revision=1430725 [LUCENE-4668] - fixed classification classpaths Fix classpaths in classification module --- Key: LUCENE-4668 URL: https://issues.apache.org/jira/browse/LUCENE-4668 Project: Lucene - Core Issue Type: Improvement Components: modules/classification Reporter: Tommaso Teofili Assignee: Tommaso Teofili Priority: Minor Fix For: 5.0 Classpaths in lucene/classification/build.xml are not using / extending correctly the default base classpaths.
[jira] [Commented] (LUCENE-1227) NGramTokenizer to handle more than 1024 chars
[ https://issues.apache.org/jira/browse/LUCENE-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547774#comment-13547774 ] Harald Wellmann commented on LUCENE-1227: - As long as this issue is not fixed, please mention the 1024 character truncation in the Javadoc. The combination of KeywordTokenizer and NGramTokenFilter does not scale well for large inputs, as KeywordTokenizer reads the entire input stream into a character buffer. NGramTokenizer to handle more than 1024 chars - Key: LUCENE-1227 URL: https://issues.apache.org/jira/browse/LUCENE-1227 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Reporter: Hiroaki Kawai Priority: Minor Attachments: LUCENE-1227.patch, NGramTokenizer.patch, NGramTokenizer.patch The current NGramTokenizer can't handle a character stream that is longer than 1024 characters. This is too short for non-whitespace-separated languages. I created a patch for this issue.
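To illustrate the scaling concern raised in this comment, here is a minimal sketch of producing n-grams from a Reader with a sliding window, so that memory use is bounded by the maximum gram size rather than by the input length. It is plain Java and is not Lucene's NGramTokenizer or the attached patch: the class name StreamingNGrams, its methods, and the order in which grams are emitted are all assumptions made for illustration. It also works on Java chars, so supplementary code points are not handled specially.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only; not part of Lucene.
public class StreamingNGrams {

    /**
     * Reads the input one char at a time and passes every n-gram with
     * minGram <= n <= maxGram to the sink. At most maxGram chars are buffered.
     */
    static void ngrams(Reader in, int minGram, int maxGram, Consumer<String> sink) {
        if (minGram < 1 || minGram > maxGram) {
            throw new IllegalArgumentException("need 1 <= minGram <= maxGram");
        }
        StringBuilder window = new StringBuilder(maxGram);
        try {
            int c;
            while ((c = in.read()) != -1) {
                window.append((char) c);
                if (window.length() == maxGram) {
                    emitGramsAtWindowStart(window, minGram, sink);
                    window.deleteCharAt(0); // slide the window by one char
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Drain the tail: shorter grams that start near the end of the input.
        while (window.length() >= minGram) {
            emitGramsAtWindowStart(window, minGram, sink);
            window.deleteCharAt(0);
        }
    }

    /** Emits the grams of length minGram..window.length() that start at window position 0. */
    private static void emitGramsAtWindowStart(CharSequence window, int minGram, Consumer<String> sink) {
        for (int n = minGram; n <= window.length(); n++) {
            sink.accept(window.subSequence(0, n).toString());
        }
    }

    /** Convenience wrapper for small inputs: collects the grams into a list. */
    static List<String> collect(String text, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        ngrams(new StringReader(text), minGram, maxGram, out::add);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collect("abcd", 1, 2));
    }
}
```

Because the grams go to a sink as they are produced, an input well beyond 1024 characters can be processed without holding it in memory; only the collect helper accumulates results, and it is there for the demonstration.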
[jira] [Commented] (LUCENE-4666) Simplify CompressingStoredFieldsFormat merging
[ https://issues.apache.org/jira/browse/LUCENE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547821#comment-13547821 ] Commit Tag Bot commented on LUCENE-4666: [trunk commit] Adrien Grand http://svn.apache.org/viewvc?view=revision&revision=1430755 LUCENE-4666: Simplify CompressingStoredFieldsFormat merging. Simplify CompressingStoredFieldsFormat merging -- Key: LUCENE-4666 URL: https://issues.apache.org/jira/browse/LUCENE-4666 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 4.1, 5.0 Attachments: LUCENE-4666.patch Merging is currently unnecessarily complex: it tries to compute the size of the compressed block by analyzing the compressed stream although it could use the fields index instead.
[jira] [Resolved] (LUCENE-4666) Simplify CompressingStoredFieldsFormat merging
[ https://issues.apache.org/jira/browse/LUCENE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-4666. -- Resolution: Fixed Simplify CompressingStoredFieldsFormat merging -- Key: LUCENE-4666 URL: https://issues.apache.org/jira/browse/LUCENE-4666 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 4.1, 5.0 Attachments: LUCENE-4666.patch Merging is currently unnecessarily complex: it tries to compute the size of the compressed block by analyzing the compressed stream although it could use the fields index instead.
[jira] [Commented] (LUCENE-4666) Simplify CompressingStoredFieldsFormat merging
[ https://issues.apache.org/jira/browse/LUCENE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547827#comment-13547827 ] Commit Tag Bot commented on LUCENE-4666: [branch_4x commit] Adrien Grand http://svn.apache.org/viewvc?view=revision&revision=1430757 LUCENE-4666: Simplify CompressingStoredFieldsFormat merging (merged from r1430755). Simplify CompressingStoredFieldsFormat merging -- Key: LUCENE-4666 URL: https://issues.apache.org/jira/browse/LUCENE-4666 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 4.1, 5.0 Attachments: LUCENE-4666.patch Merging is currently unnecessarily complex: it tries to compute the size of the compressed block by analyzing the compressed stream although it could use the fields index instead.
[jira] [Updated] (SOLR-3669) Create a ScriptSearchComponent
[ https://issues.apache.org/jira/browse/SOLR-3669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher updated SOLR-3669: --- Fix Version/s: (was: 4.1) 5.0 Create a ScriptSearchComponent -- Key: SOLR-3669 URL: https://issues.apache.org/jira/browse/SOLR-3669 Project: Solr Issue Type: Improvement Components: SearchComponents - other Reporter: Erik Hatcher Assignee: Erik Hatcher Fix For: 5.0 Building on the infrastructure created from SOLR-1725, a ScriptSearchComponent would be a valuable addition to Solr flexibility. Performance impact will be a very important factor and needs to be measured.
[jira] [Updated] (SOLR-3735) Relocate the example mime-to-extension mapping
[ https://issues.apache.org/jira/browse/SOLR-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher updated SOLR-3735: --- Fix Version/s: (was: 4.1) decided not to bother with this for 4.x, just trunk for now. Relocate the example mime-to-extension mapping -- Key: SOLR-3735 URL: https://issues.apache.org/jira/browse/SOLR-3735 Project: Solr Issue Type: Improvement Components: web gui Affects Versions: 4.0-BETA, 4.0 Reporter: Erik Hatcher Assignee: Erik Hatcher Priority: Minor Fix For: 5.0 Attachments: SOLR-3735.patch A mime-to-extension mapping was added to VelocityResponseWriter recently. This really belongs in the templates themselves, not in VrW, as it is specific to the example search results not meant for all VrW templates.
[jira] [Resolved] (SOLR-3735) Relocate the example mime-to-extension mapping
[ https://issues.apache.org/jira/browse/SOLR-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher resolved SOLR-3735. Resolution: Fixed Relocate the example mime-to-extension mapping -- Key: SOLR-3735 URL: https://issues.apache.org/jira/browse/SOLR-3735 Project: Solr Issue Type: Improvement Components: web gui Affects Versions: 4.0-BETA, 4.0 Reporter: Erik Hatcher Assignee: Erik Hatcher Priority: Minor Fix For: 5.0 Attachments: SOLR-3735.patch A mime-to-extension mapping was added to VelocityResponseWriter recently. This really belongs in the templates themselves, not in VrW, as it is specific to the example search results not meant for all VrW templates.
[jira] [Updated] (SOLR-3551) View of analysis output using all field types at once
[ https://issues.apache.org/jira/browse/SOLR-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher updated SOLR-3551: --- Fix Version/s: (was: 4.1) 5.0 View of analysis output using all field types at once - Key: SOLR-3551 URL: https://issues.apache.org/jira/browse/SOLR-3551 Project: Solr Issue Type: Improvement Reporter: Erik Hatcher Assignee: Erik Hatcher Priority: Trivial Fix For: 5.0 Attachments: allyzer.html, allyzer.vm, analysis.vm To demonstrate all field types analyzing the same text for a presentation, I developed a Velocity view that leverages /analysis/field. Perhaps we could incorporate this into Solr's example or admin somehow.
[jira] [Updated] (SOLR-3719) Add instant search capability to /browse
[ https://issues.apache.org/jira/browse/SOLR-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher updated SOLR-3719: --- Fix Version/s: (was: 4.1) 5.0 Add instant search capability to /browse -- Key: SOLR-3719 URL: https://issues.apache.org/jira/browse/SOLR-3719 Project: Solr Issue Type: New Feature Reporter: Erik Hatcher Assignee: Erik Hatcher Priority: Minor Fix For: 5.0 Once upon a time I tinkered with this in a personal github fork https://github.com/erikhatcher/lucene-solr/commits/instant_search/
[jira] [Updated] (SOLR-839) XML Query Parser support
[ https://issues.apache.org/jira/browse/SOLR-839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher updated SOLR-839: -- Fix Version/s: (was: 4.1) XML Query Parser support Key: SOLR-839 URL: https://issues.apache.org/jira/browse/SOLR-839 Project: Solr Issue Type: New Feature Components: query parsers Affects Versions: 1.3 Reporter: Erik Hatcher Assignee: Erik Hatcher Fix For: 5.0 Attachments: lucene-xml-query-parser-2.4-dev.jar, SOLR-839.patch Lucene contrib includes a query parser that is able to create the full-spectrum of Lucene queries, using an XML data structure. This patch adds xml query parser support to Solr.
[jira] [Updated] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors
[ https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-4667: - Attachment: LUCENE-4667.patch Patch. Change TestRandomChains to replace the list of broken classes by a list of broken constructors -- Key: LUCENE-4667 URL: https://issues.apache.org/jira/browse/LUCENE-4667 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Priority: Minor Attachments: LUCENE-4667.patch Some classes are currently in the list of bad apples although only one constructor is broken. For example, LimitTokenCountFilter has an option to consume the whole stream.
[jira] [Updated] (SOLR-874) Dismax parser exceptions on trailing OPERATOR
[ https://issues.apache.org/jira/browse/SOLR-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher updated SOLR-874: -- Fix Version/s: (was: 4.1) 5.0 Assignee: (was: Erik Hatcher) I started to dig into this for 4.1, but it's hairier than I thought with edge cases that need to be accounted for. Moving this to 5.0 since I won't have time to make deal with this for 4.1, sorry. Dismax parser exceptions on trailing OPERATOR - Key: SOLR-874 URL: https://issues.apache.org/jira/browse/SOLR-874 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 1.3 Reporter: Erik Hatcher Fix For: 5.0 Attachments: SOLR-874-1.3.patch, SOLR-874-1.4.1.patch, SOLR-874.patch Dismax is supposed to be immune to parse exceptions, but alas it's not: http://localhost:8983/solr/select?defType=dismax&qf=name&q=ipod+AND kaboom! Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse 'ipod AND': Encountered EOF at line 1, column 8. Was expecting one of: NOT ... + ... - ... ( ... * ... QUOTED ... TERM ... PREFIXTERM ... WILDTERM ... [ ... { ... NUMBER ... TERM ... * ... at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:175) at org.apache.solr.search.DismaxQParser.parse(DisMaxQParserPlugin.java:138) at org.apache.solr.search.QParser.getQuery(QParser.java:88)
[jira] [Commented] (SOLR-2440) Schema Browser more user friendly
[ https://issues.apache.org/jira/browse/SOLR-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548411#comment-13548411 ] Joan Codina commented on SOLR-2440: --- Yes, you are right, the query part was just a way to find some relationships between words / facets at the preliminary stages of indexing and checking the data. Schema Browser more user friendly - Key: SOLR-2440 URL: https://issues.apache.org/jira/browse/SOLR-2440 Project: Solr Issue Type: New Feature Components: web gui Affects Versions: 1.4.1 Environment: The schema browser of the admin web application Reporter: Joan Codina Priority: Minor Labels: browser, schema Fix For: 4.2, 5.0 Attachments: LUCENE_4_schema_jsp.patch, LUCENE_4_screen_css.patch, schema_jsp.patch Original Estimate: 1h Remaining Estimate: 1h The schema browser has some drawbacks: * Does not sort the fields (the actual sorting seems arbitrary) * Capitalises all field names, making matching difficult * Does not allow a drill down This small patch solves the three issues: # Changes the CSS so that links are not capitalised # Sorts the field names # Replaces the tokens by links to a search query with that token That's all.
[jira] [Updated] (SOLR-2470) velocity response writer needs test
[ https://issues.apache.org/jira/browse/SOLR-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher updated SOLR-2470: --- Issue Type: Task (was: Test) velocity response writer needs test --- Key: SOLR-2470 URL: https://issues.apache.org/jira/browse/SOLR-2470 Project: Solr Issue Type: Task Reporter: Yonik Seeley Assignee: Erik Hatcher /browse was broken w/o anyone realizing... we should have a basic test for it
[jira] [Comment Edited] (SOLR-874) Dismax parser exceptions on trailing OPERATOR
[ https://issues.apache.org/jira/browse/SOLR-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548410#comment-13548410 ] Erik Hatcher edited comment on SOLR-874 at 1/9/13 12:07 PM: I started to dig into this for 4.1, but it's hairier than I thought with edge cases that need to be accounted for. Moving this to 5.0 since I won't have time to deal with this for 4.1, sorry. was (Author: ehatcher): I started to dig into this for 4.1, but it's hairier than I thought with edge cases that need to be accounted for. Moving this to 5.0 since I won't have time to make deal with this for 4.1, sorry. Dismax parser exceptions on trailing OPERATOR - Key: SOLR-874 URL: https://issues.apache.org/jira/browse/SOLR-874 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 1.3 Reporter: Erik Hatcher Fix For: 5.0 Attachments: SOLR-874-1.3.patch, SOLR-874-1.4.1.patch, SOLR-874.patch Dismax is supposed to be immune to parse exceptions, but alas it's not: http://localhost:8983/solr/select?defType=dismax&qf=name&q=ipod+AND kaboom! Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse 'ipod AND': Encountered EOF at line 1, column 8. Was expecting one of: NOT ... + ... - ... ( ... * ... QUOTED ... TERM ... PREFIXTERM ... WILDTERM ... [ ... { ... NUMBER ... TERM ... * ... at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:175) at org.apache.solr.search.DismaxQParser.parse(DisMaxQParserPlugin.java:138) at org.apache.solr.search.QParser.getQuery(QParser.java:88)
[jira] [Commented] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors
[ https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548457#comment-13548457 ] Uwe Schindler commented on LUCENE-4667: --- Looks fine. I would prefer to use IdentityHashMap instead of HashMap, so it is consistent with the remaining logic. Classes and Constructors should be compared with identity. I would also keep all constructors that the Map associates with the ALWAYS predicate out of the array lists from the beginning. Change TestRandomChains to replace the list of broken classes by a list of broken constructors -- Key: LUCENE-4667 URL: https://issues.apache.org/jira/browse/LUCENE-4667 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Priority: Minor Attachments: LUCENE-4667.patch Some classes are currently in the list of bad apples although only one constructor is broken. For example, LimitTokenCountFilter has an option to consume the whole stream.
[jira] [Updated] (LUCENE-4624) Compare Lucene memory estimator with terracota's
[ https://issues.apache.org/jira/browse/LUCENE-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated LUCENE-4624: Description: Alex Snaps informed me that there's a sizeof estimator in terracota -- http://svn.terracotta.org/svn/ehcache/trunk/ehcache/ehcache-core/src/main/java/net/sf/ehcache/pool/sizeof/ looks interesting, they have some VM-specific methods. Didn't look too deeply though; if somebody has the time to check out the differences and maybe compare the estimation differences it'd be nice. There is also another tool by Aleksey Shipilev. It looks very good to me (Aleksey has deep knowledge of JVM internals). was: Alex Snaps informed me that there's a sizeof estimator in terracota -- http://svn.terracotta.org/svn/ehcache/trunk/ehcache/ehcache-core/src/main/java/net/sf/ehcache/pool/sizeof/ looks interesting, they have some VM-specific methods. Didn't look too deeply though; if somebody has the time to check out the differences and maybe compare the estimation differences it'd be nice. Compare Lucene memory estimator with terracota's Key: LUCENE-4624 URL: https://issues.apache.org/jira/browse/LUCENE-4624 Project: Lucene - Core Issue Type: Task Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Alex Snaps informed me that there's a sizeof estimator in terracota -- http://svn.terracotta.org/svn/ehcache/trunk/ehcache/ehcache-core/src/main/java/net/sf/ehcache/pool/sizeof/ looks interesting, they have some VM-specific methods. Didn't look too deeply though; if somebody has the time to check out the differences and maybe compare the estimation differences it'd be nice. There is also another tool by Aleksey Shipilev. It looks very good to me (Aleksey has deep knowledge of JVM internals).
[jira] [Updated] (LUCENE-4624) Compare Lucene memory estimator with terracota's
[ https://issues.apache.org/jira/browse/LUCENE-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated LUCENE-4624: Description: Alex Snaps informed me that there's a sizeof estimator in terracota -- http://svn.terracotta.org/svn/ehcache/trunk/ehcache/ehcache-core/src/main/java/net/sf/ehcache/pool/sizeof/ looks interesting, they have some VM-specific methods. Didn't look too deeply though; if somebody has the time to check out the differences and maybe compare the estimation differences it'd be nice. There is also another tool by Aleksey Shipilev. It looks very good to me (Aleksey has deep knowledge of JVM internals). https://github.com/shipilev/java-object-layout/ was: Alex Snaps informed me that there's a sizeof estimator in terracota -- http://svn.terracotta.org/svn/ehcache/trunk/ehcache/ehcache-core/src/main/java/net/sf/ehcache/pool/sizeof/ looks interesting, they have some VM-specific methods. Didn't look too deeply though; if somebody has the time to check out the differences and maybe compare the estimation differences it'd be nice. There is also another tool by Aleksey Shipilev. It looks very good to me (Aleksey has deep knowledge of JVM internals). Compare Lucene memory estimator with terracota's Key: LUCENE-4624 URL: https://issues.apache.org/jira/browse/LUCENE-4624 Project: Lucene - Core Issue Type: Task Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Alex Snaps informed me that there's a sizeof estimator in terracota -- http://svn.terracotta.org/svn/ehcache/trunk/ehcache/ehcache-core/src/main/java/net/sf/ehcache/pool/sizeof/ looks interesting, they have some VM-specific methods. Didn't look too deeply though; if somebody has the time to check out the differences and maybe compare the estimation differences it'd be nice. There is also another tool by Aleksey Shipilev. It looks very good to me (Aleksey has deep knowledge of JVM internals). 
https://github.com/shipilev/java-object-layout/
[jira] [Commented] (LUCENE-3178) Native MMapDir
[ https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548466#comment-13548466 ] Michael McCandless commented on LUCENE-3178: I haven't looked closely at the patch, but I ran an initial perf test:
{noformat}
Task              QPS base   StdDev  QPS comp   StdDev  Pct diff
AndHighLow         1024.41   (3.1%)    856.52   (2.0%)  -16.4% (-20% - -11%)
LowPhrase            69.04   (1.7%)     58.90   (0.9%)  -14.7% (-16% - -12%)
AndHighMed          193.16   (1.0%)    169.24   (1.4%)  -12.4% (-14% - -10%)
Respell              55.65   (3.0%)     50.01   (3.3%)  -10.1% (-15% -  -3%)
Fuzzy2               67.18   (3.3%)     60.52   (3.6%)   -9.9% (-16% -  -3%)
Fuzzy1               68.83   (3.4%)     62.65   (3.4%)   -9.0% (-15% -  -2%)
LowSloppyPhrase      85.35   (1.8%)     78.64   (1.6%)   -7.9% (-11% -  -4%)
LowSpanNear          38.05   (2.9%)     35.14   (3.1%)   -7.6% (-13% -  -1%)
Wildcard             99.78   (3.0%)     93.39   (2.9%)   -6.4% (-12% -   0%)
MedSpanNear          77.91   (2.2%)     74.26   (2.3%)   -4.7% ( -9% -   0%)
HighSpanNear          9.24   (2.7%)      8.86   (2.5%)   -4.1% ( -9% -   1%)
HighSloppyPhrase      2.25   (4.0%)      2.16   (3.8%)   -4.0% (-11% -   3%)
MedSloppyPhrase      78.44   (2.2%)     75.35   (2.4%)   -3.9% ( -8% -   0%)
HighPhrase           30.39   (8.1%)     29.27   (7.9%)   -3.7% (-18% -  13%)
LowTerm             808.93   (5.0%)    779.29   (5.4%)   -3.7% (-13% -   7%)
MedPhrase           176.20   (5.9%)    169.98   (5.5%)   -3.5% (-14% -   8%)
Prefix3              51.16   (6.0%)     49.53   (4.9%)   -3.2% (-13% -   8%)
AndHighHigh          69.32   (2.3%)     67.21   (2.4%)   -3.0% ( -7% -   1%)
IntNRQ               10.99  (10.0%)     10.86   (9.0%)   -1.2% (-18% -  19%)
MedTerm             329.36  (10.0%)    325.83  (11.9%)   -1.1% (-20% -  23%)
OrHighMed            67.18   (2.2%)     66.64   (4.5%)   -0.8% ( -7% -   6%)
OrHighHigh           42.91   (2.5%)     42.59   (4.8%)   -0.7% ( -7% -   6%)
OrHighLow            62.96   (2.3%)     62.58   (4.9%)   -0.6% ( -7% -   6%)
HighTerm            120.76  (11.6%)    121.21  (14.9%)    0.4% (-23% -  30%)
{noformat}
This is a hot test, with 10M no-stopwords English Wikipedia. Baseline is normal MMapDir and comp is NativePosixMMapDirectory. Not sure why some queries are slower ...
Native MMapDir -- Key: LUCENE-3178 URL: https://issues.apache.org/jira/browse/LUCENE-3178 Project: Lucene - Core Issue Type: Improvement Components: core/store Reporter: Michael McCandless Labels: gsoc2012, lucene-gsoc-12 Attachments: LUCENE-3178-Native-MMap-implementation.patch, LUCENE-3178-Native-MMap-implementation.patch, LUCENE-3178-Native-MMap-implementation.patch Spinoff from LUCENE-2793. Just like we will create native Dir impl (UnixDirectory) to pass the right OS level IO flags depending on the IOContext, we could in theory do something similar with MMapDir. The problem is MMap is apparently quite hairy... and to pass the flags the native code would need to invoke mmap (I think?), unlike UnixDir where the code only has to open the file handle.
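As a reading aid for the benchmark figures in this comment, the "Pct diff" column appears consistent with the relative change of the comp QPS against the base QPS. The sketch below recomputes it for two rows of the table. The class and method names are hypothetical, and the formula is an assumption inferred from the numbers, not taken from the benchmark tool itself.

```java
import java.util.Locale;

// Hypothetical helper; the formula is inferred from the reported numbers.
public class PctDiff {

    /** Relative change of comp against base, in percent. Negative means comp is slower. */
    static double pctDiff(double baseQps, double compQps) {
        return 100.0 * (compQps - baseQps) / baseQps;
    }

    public static void main(String[] args) {
        // AndHighLow row: base 1024.41 QPS, comp 856.52 QPS
        System.out.println(String.format(Locale.ROOT, "AndHighLow %.1f%%", pctDiff(1024.41, 856.52)));
        // HighTerm row: base 120.76 QPS, comp 121.21 QPS
        System.out.println(String.format(Locale.ROOT, "HighTerm   %.1f%%", pctDiff(120.76, 121.21)));
    }
}
```

Rows whose reported range spans zero, such as HighTerm, are within the run-to-run noise, so only the rows at the top of the table suggest a real slowdown.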
[jira] [Commented] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors
[ https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548505#comment-13548505 ] Adrien Grand commented on LUCENE-4667: -- The test failed when I used an IdentityHashMap. Did I miss something or can't constructors be compared using ==? Change TestRandomChains to replace the list of broken classes by a list of broken constructors -- Key: LUCENE-4667 URL: https://issues.apache.org/jira/browse/LUCENE-4667 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Priority: Minor Attachments: LUCENE-4667.patch Some classes are currently in the list of bad apples although only one constructor is broken. For example, LimitTokenCountFilter has an option to consume the whole stream.
[jira] [Updated] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors
[ https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-4667: - Attachment: LUCENE-4667.patch New patch that adds exceptions to TrimFilter and TypeTokenFilter as well and uses a constructor map for all components, following Uwe's advice. Change TestRandomChains to replace the list of broken classes by a list of broken constructors -- Key: LUCENE-4667 URL: https://issues.apache.org/jira/browse/LUCENE-4667 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Priority: Minor Attachments: LUCENE-4667.patch, LUCENE-4667.patch Some classes are currently in the list of bad apples although only one constructor is broken. For example, LimitTokenCountFilter has an option to consume the whole stream.
[jira] [Commented] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors
[ https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548518#comment-13548518 ]
Uwe Schindler commented on LUCENE-4667:
Maybe that's the case! Sorry. I was expecting that constructors are singletons like classes. HashMap is fine then.
In my opinion, the whole Predicate approach may be too detailed. I would just match on the constructor itself and disallow it completely (without looking into the actual parameters). Just exclude the constructor in the beforeClass() method when populating the lists.
If you want to keep the predicate approach, I would exclude all broken constructors with the ALWAYS predicate in beforeClass(), so it never tries to use the constructor at all (because it's no longer in the list).
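For readers following the IdentityHashMap question, here is a minimal sketch of the behaviour being discussed. It is not taken from the patch: the class name ConstructorIdentityDemo and the choice of StringBuilder(String) as the looked-up constructor are illustrative assumptions. On common JVMs each reflective lookup hands back a fresh Constructor object, so == (and therefore IdentityHashMap) does not match across lookups, while equals()/hashCode() (and therefore HashMap) does.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Illustrative only; not part of TestRandomChains.
final class ConstructorIdentityDemo {

    /** Reflectively looks up StringBuilder(String); the checked exception is wrapped for brevity. */
    static Constructor<?> lookup() {
        try {
            return StringBuilder.class.getConstructor(String.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Registers one lookup result as a key, then probes the map with a second, separate lookup. */
    static boolean foundOnSecondLookup(Map<Constructor<?>, String> map) {
        map.put(lookup(), "broken");
        return map.containsKey(lookup());
    }

    public static void main(String[] args) {
        Constructor<?> first = lookup();
        Constructor<?> second = lookup();
        // Reference comparison versus value comparison of the two lookups.
        System.out.println("same reference: " + (first == second));
        System.out.println("equals():       " + first.equals(second));
        // The same distinction, seen through the two map implementations.
        System.out.println("IdentityHashMap hit: "
            + foundOnSecondLookup(new IdentityHashMap<Constructor<?>, String>()));
        System.out.println("HashMap hit:         "
            + foundOnSecondLookup(new HashMap<Constructor<?>, String>()));
    }
}
```

The equals()-based match is part of the documented contract of Constructor; the reference inequality is an implementation detail, which is one more reason to key such a map by equals() rather than by identity.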
[jira] [Commented] (SOLR-3705) hl.alternateField does not support glob
[ https://issues.apache.org/jira/browse/SOLR-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548520#comment-13548520 ]
Jan Høydahl commented on SOLR-3705:
Hi, do you have a patch for this (see the discussion on solr-user today)? Supporting a comma-separated list of alternateField values would touch roughly the same lines of code as supporting globs, so maybe we can bake both into the same patch?
hl.alternateField does not support glob
Key: SOLR-3705
URL: https://issues.apache.org/jira/browse/SOLR-3705
Project: Solr
Issue Type: Improvement
Components: highlighter
Affects Versions: 4.0-ALPHA
Reporter: Markus Jelsma
Priority: Minor
Fix For: 5.0
Unlike hl.fl, the hl.alternateField does not support * to match field globs.
[jira] [Assigned] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors
[ https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adrien Grand reassigned LUCENE-4667:
Assignee: Adrien Grand
[jira] [Updated] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors
[ https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adrien Grand updated LUCENE-4667:
Attachment: LUCENE-4667.patch
bq. Maybe that's the case! Sorry. I was expecting that constructors are singletons like classes.
No problem, I had the same expectation and was a little disappointed to see that it didn't work!
bq. I think maybe the whole Predicate approach is too detailed?
I think it's worth excluding with a predicate: for example, this allows testing random chains with LimitTokenCountFilter(consumeAllTokens=true) (when consumeAllTokens=false, this filter is broken).
bq. I would exclude all broken constructors with the ALWAYS predicate in beforeClass()
Sounds good, I updated the patch.
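The two-level exclusion discussed in this thread could look roughly like the sketch below. This is not the code from LUCENE-4667.patch: the classes ToyLimitFilter and ToyBrokenFilter, the helper names, and the use of java.util.function.Predicate (Java 8) are all assumptions made for illustration. Constructors mapped to ALWAYS are dropped once, at setup time; the others stay in the candidate list and are checked against the actual arguments.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Illustrative only; names and types are hypothetical, not the patch's.
final class BrokenConstructorsSketch {

    /** Hypothetical stand-in for a filter that misbehaves only for some arguments. */
    static final class ToyLimitFilter {
        public ToyLimitFilter(int maxTokenCount, boolean consumeAllTokens) { }
    }

    /** Hypothetical stand-in for a component whose constructor is always broken. */
    static final class ToyBrokenFilter {
        public ToyBrokenFilter(int size) { }
    }

    /** Marks a constructor as broken for every argument list. */
    static final Predicate<Object[]> ALWAYS = args -> true;

    /** Broken constructors, each with a predicate over the actual arguments. */
    static final Map<Constructor<?>, Predicate<Object[]>> BROKEN = new HashMap<>();

    /** Reflective lookup with the checked exception wrapped for brevity. */
    static Constructor<?> ctor(Class<?> clazz, Class<?>... parameterTypes) {
        try {
            return clazz.getConstructor(parameterTypes);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    static {
        // Broken only when consumeAllTokens (the second argument) is false.
        BROKEN.put(ctor(ToyLimitFilter.class, int.class, boolean.class),
                   args -> !((Boolean) args[1]));
        // Broken whatever the arguments are.
        BROKEN.put(ctor(ToyBrokenFilter.class, int.class), ALWAYS);
    }

    /** Setup step: drop constructors that are broken for all arguments. */
    static List<Constructor<?>> candidates(List<Constructor<?>> all) {
        List<Constructor<?>> kept = new ArrayList<>();
        for (Constructor<?> c : all) {
            if (BROKEN.get(c) != ALWAYS) {
                kept.add(c);
            }
        }
        return kept;
    }

    /** Per-invocation check against the randomly generated arguments. */
    static boolean isBroken(Constructor<?> c, Object... args) {
        Predicate<Object[]> p = BROKEN.get(c);
        return p != null && p.test(args);
    }
}
```

Keeping the predicate means a constructor with one bad argument combination is still exercised with the good ones, which is the point made above about consumeAllTokens=true.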
[jira] [Created] (SOLR-4288) FileDataSource with an empty basePath and a relative resource is broken.
Dawid Weiss created SOLR-4288:
Summary: FileDataSource with an empty basePath and a relative resource is broken.
Key: SOLR-4288
URL: https://issues.apache.org/jira/browse/SOLR-4288
Project: Solr
Issue Type: Bug
Affects Versions: 4.0
Reporter: Dawid Weiss
Priority: Minor
Fix For: 4.1, 5.0
In fact, the logic is broken:
{code}
if (!file.isAbsolute())
  file = new File(basePath + query);
{code}
because basePath is null, so 'null' is concatenated with the query string (path), resulting in an invalid path. basePath should be checked for null and, if it is null, perhaps default to . instead. The relative location should then be resolved as:
{code}
new File(basePathFile, query);
{code}
I'd also change the log so that the absolute path is included in the warning message; otherwise it's really hard to figure out what's going on.
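The failure mode and the proposed resolution can be reproduced outside Solr with plain java.io.File. The sketch below is not the FileDataSource code: the class and method names are hypothetical, and treating an empty basePath like a null one is an added assumption (an empty parent File would otherwise resolve against a system-dependent default directory).

```java
import java.io.File;

// Illustrative only; not the Solr FileDataSource implementation.
final class BasePathResolutionSketch {

    /** Mirrors the pattern quoted in the report: string concatenation with a possibly-null basePath. */
    static File resolveBuggy(String basePath, String query) {
        File file = new File(query);
        if (!file.isAbsolute()) {
            file = new File(basePath + query); // a null basePath becomes the literal text "null"
        }
        return file;
    }

    /** Resolves a relative query against a base directory, defaulting to "." when none is set. */
    static File resolveFixed(String basePath, String query) {
        File file = new File(query);
        if (file.isAbsolute()) {
            return file;
        }
        // Assumption: null and empty are both treated as "no base path configured".
        File basePathFile = new File(basePath == null || basePath.isEmpty() ? "." : basePath);
        return new File(basePathFile, query);
    }

    public static void main(String[] args) {
        String query = "data/input.xml";
        System.out.println("buggy: " + resolveBuggy(null, query).getPath());
        System.out.println("fixed: " + resolveFixed(null, query).getPath());
        // Logging the absolute path, as suggested in the report, shows where the file is really sought.
        System.out.println("absolute: " + resolveFixed(null, query).getAbsolutePath());
    }
}
```

Using the two-argument File constructor also avoids a second, quieter problem with concatenation: a basePath without a trailing separator would be glued directly onto the file name.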
[jira] [Updated] (SOLR-3705) hl.alternateField does not support glob
[ https://issues.apache.org/jira/browse/SOLR-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Markus Jelsma updated SOLR-3705:
Attachment: SOLR-3705-trunk-1.patch
Patch adding glob support to the hl.alternateField parameter. This patch also contains the fix for SOLR-4089.
[jira] [Commented] (SOLR-3705) hl.alternateField does not support glob
[ https://issues.apache.org/jira/browse/SOLR-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548540#comment-13548540 ]
Jan Høydahl commented on SOLR-3705:
Great, this is something to continue working on.
[jira] [Commented] (SOLR-3705) hl.alternateField does not support glob
[ https://issues.apache.org/jira/browse/SOLR-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548546#comment-13548546 ]
Markus Jelsma commented on SOLR-3705:
Thanks Jan!
[jira] [Created] (LUCENE-4669) Document wrongly deleted from index
Miguel Ferreira created LUCENE-4669:
Summary: Document wrongly deleted from index
Key: LUCENE-4669
URL: https://issues.apache.org/jira/browse/LUCENE-4669
Project: Lucene - Core
Issue Type: Bug
Components: core/index
Affects Versions: 4.0
Environment: OS = Mac OS X 10.7.5 Java = JVM 1.6
Reporter: Miguel Ferreira
I'm trying to implement document deletion from an index. If I create an index with three documents (A, B and C) and then try to delete A, A gets marked as deleted but C is removed from the index. I've tried this with different numbers of documents and saw that it is always the last document that is removed.
Example unit test:
{code:title=ExampleUnitTest.java}
@Test
public void delete() throws Exception {
    File indexDir = FileUtils.createTempDir();
    IndexWriter writer = new IndexWriter(new NIOFSDirectory(indexDir),
        new IndexWriterConfig(Version.LUCENE_40, new StandardAnalyzer(Version.LUCENE_40)));
    Document doc = new Document();
    String fieldName = "path";
    doc.add(new StringField(fieldName, "a", Store.YES));
    writer.addDocument(doc);
    doc = new Document();
    doc.add(new StringField(fieldName, "b", Store.YES));
    writer.addDocument(doc);
    doc = new Document();
    doc.add(new StringField(fieldName, "c", Store.YES));
    writer.addDocument(doc);
    writer.commit();
    System.out.println("Before delete");
    print(indexDir);
    writer.deleteDocuments(new Term(fieldName, "a"));
    writer.commit();
    System.out.println("After delete");
    print(indexDir);
}

public static void print(File indexDirectory) throws IOException {
    DirectoryReader reader = DirectoryReader.open(new NIOFSDirectory(indexDirectory));
    Bits liveDocs = MultiFields.getLiveDocs(reader);
    int numDocs = reader.numDocs();
    System.out.println("Found " + numDocs + " documents");
    for (int i = 0; i < numDocs; i++) {
        Document document = reader.document(i);
        StringBuffer sb = new StringBuffer();
        sb.append("Document at = ").append(i);
        sb.append("; isDeleted = ").append(liveDocs != null ? !liveDocs.get(i) : false).append("; ");
        for (IndexableField field : document.getFields()) {
            String fieldName = field.name();
            for (String value : document.getValues(fieldName)) {
                sb.append(fieldName).append(" = ").append(value).append("; ");
            }
        }
        System.out.println(sb.toString());
    }
}
{code}
[jira] [Updated] (LUCENE-4669) Document wrongly deleted from index
[ https://issues.apache.org/jira/browse/LUCENE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Miguel Ferreira updated LUCENE-4669:
Description: I'm trying to implement document deletion from an index. If I create an index with three documents (A, B and C) and then try to delete A, A gets marked as deleted but C is removed from the index. I've tried this with different numbers of documents and saw that it is always the last document that is removed.
When I run the example unit test I get this output:
{code}
Before delete
Found 3 documents
Document at = 0; isDeleted = false; path = a;
Document at = 1; isDeleted = false; path = b;
Document at = 2; isDeleted = false; path = c;
After delete
Found 2 documents
Document at = 0; isDeleted = true; path = a;
Document at = 1; isDeleted = false; path = b;
{code}
Example unit test: ExampleUnitTest.java, unchanged from the original description.
[jira] [Resolved] (LUCENE-4669) Document wrongly deleted from index
[ https://issues.apache.org/jira/browse/LUCENE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir resolved LUCENE-4669.
Resolution: Not A Problem
See the javadocs for numDocs.
[jira] [Commented] (LUCENE-4669) Document wrongly deleted from index
[ https://issues.apache.org/jira/browse/LUCENE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548607#comment-13548607 ]
Miguel Ferreira commented on LUCENE-4669:
I've seen the javadocs for numDocs and I still don't get why, when I delete the document with field path=a, the document with field path=c is removed from the index. Do you care to explain?
[jira] [Comment Edited] (LUCENE-4669) Document wrongly deleted from index
[ https://issues.apache.org/jira/browse/LUCENE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548607#comment-13548607 ]
Miguel Ferreira edited comment on LUCENE-4669 at 1/9/13 4:05 PM:
I've seen the javadocs for numDocs and I still don't get why, when I delete the document with field path=a, the document with field path=c is removed from the index. Do you care to explain?
[jira] [Commented] (LUCENE-4669) Document wrongly deleted from index
[ https://issues.apache.org/jira/browse/LUCENE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548622#comment-13548622 ]
Adrien Grand commented on LUCENE-4669:
Hi Miguel, c has not been deleted. The problem is that your loop is bounded by IndexReader.numDocs instead of IndexReader.maxDoc. Because you deleted a document, IndexReader.numDocs decreased from 3 to 2, but c still has docId==2, so your print(File) method doesn't display it.
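The difference between the two bounds can be seen without Lucene at all. The following is a toy model, not Lucene code: the class LiveDocsToyModel and its methods are invented for illustration, with a java.util.BitSet standing in for the live-docs bits. Deleting a document only clears its bit; doc ids are not renumbered, so a loop bounded by the live count stops before reaching the last id.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model only; it imitates the numDocs/maxDoc distinction and uses no Lucene classes.
final class LiveDocsToyModel {
    private final String[] stored;
    private final BitSet live = new BitSet();

    LiveDocsToyModel(String... docs) {
        stored = docs.clone();
        live.set(0, stored.length); // every document starts out live
    }

    /** Marks a document as deleted; its slot and doc id stay where they are. */
    void delete(int docId) {
        live.clear(docId);
    }

    /** One greater than the largest doc id, deleted documents included. */
    int maxDoc() {
        return stored.length;
    }

    /** Number of live (non-deleted) documents. */
    int numDocs() {
        return live.cardinality();
    }

    /** Visits doc ids 0..bound-1 and reports each one, flagging deleted documents. */
    List<String> visit(int bound) {
        List<String> seen = new ArrayList<>();
        for (int i = 0; i < bound; i++) {
            seen.add(i + "=" + stored[i] + (live.get(i) ? "" : " (deleted)"));
        }
        return seen;
    }

    public static void main(String[] args) {
        LiveDocsToyModel index = new LiveDocsToyModel("a", "b", "c");
        index.delete(0);
        System.out.println("bounded by numDocs: " + index.visit(index.numDocs()));
        System.out.println("bounded by maxDoc:  " + index.visit(index.maxDoc()));
    }
}
```

In the reporter's print(File) method the corresponding change would be to loop up to reader.maxDoc() and consult liveDocs for each id, which is what the explanation above implies.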
[jira] [Commented] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors
[ https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548630#comment-13548630 ]
Commit Tag Bot commented on LUCENE-4667:
[trunk commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1430931
LUCENE-4667: Change the broken components list from class-based to constructor-based. TestRandomChains now tests LimitTokenCountFilter and checks that offsets generated with TrimFilter and TypeTokenFilter are correct.
[jira] [Resolved] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors
[ https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adrien Grand resolved LUCENE-4667.
Resolution: Fixed
[jira] [Commented] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors
[ https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548640#comment-13548640 ]

Robert Muir commented on LUCENE-4667:

We could also use this to stop some false fails from all the subclasses of FilteringTokenFilter (LengthFilter, TypeFilter, etc.) that currently cause failures due to https://issues.apache.org/jira/browse/LUCENE-4065 when enablePositionIncrements=false.
[jira] [Commented] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors
[ https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548643#comment-13548643 ]

Commit Tag Bot commented on LUCENE-4667:

[branch_4x commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1430934

LUCENE-4667: Change the broken components list from class-based to constructor-based (merged from r1430931). TestRandomChains now tests LimitTokenCountFilter and checks that offsets generated with TrimFilter and TypeTokenFilter are correct.
[jira] [Commented] (LUCENE-4065) FilteringTokenFilter should never corrupt the tokenstream graph
[ https://issues.apache.org/jira/browse/LUCENE-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548651#comment-13548651 ]

Commit Tag Bot commented on LUCENE-4065:

[trunk commit] Robert Muir
http://svn.apache.org/viewvc?view=revision&revision=1430939

LUCENE-4065: shitlist these broken ctors so they dont cause false fails

FilteringTokenFilter should never corrupt the tokenstream graph
Key: LUCENE-4065
URL: https://issues.apache.org/jira/browse/LUCENE-4065
Project: Lucene - Core
Issue Type: Bug
Components: modules/analysis
Reporter: Robert Muir
Attachments: LUCENE-4065_test.patch

Currently removers like stopfilter have an option (true/false) to enable position increments.

If it's true: it both inserts gaps where necessary AND propagates gaps down the stream.
If it's false: it does neither, which can totally mess up the tokenstream graph (e.g. move synonyms to another word).

There are totally valid natural use cases for false, where you don't want gaps because you want phrase queries to act as if the word was never actually there. But 'not inserting gaps' is separate from proper propagation of existing gaps.

So I think we should provide an option (either fix 'false' or make it an enum), where you still get a legit tokenstream and don't totally screw it up, but you simply omit gaps.

See LUCENE-3848 for more information (where we at least fixed this case to not begin the tokenstream with posinc=0).
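The corruption described in the issue can be illustrated with a small standalone simulation. This is only a sketch: `PosIncDemo`, `Tok`, `removeStopWords` and `positions` are hypothetical names invented for the illustration and are not Lucene classes. It models only the "propagate the removed token's increment" part of the behaviour, and the three-token input is made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Standalone simulation; none of these types are Lucene classes.
class PosIncDemo {
    // A token is a term plus its position increment relative to the previous token.
    static final class Tok {
        final String term;
        final int posInc;
        Tok(String term, int posInc) { this.term = term; this.posInc = posInc; }
    }

    // Removes stop words. When propagateGaps is true, the increments of removed
    // tokens are added to the next surviving token; when false they are dropped.
    static List<Tok> removeStopWords(List<Tok> in, Set<String> stop, boolean propagateGaps) {
        List<Tok> out = new ArrayList<>();
        int skipped = 0;
        for (Tok t : in) {
            if (stop.contains(t.term)) {
                skipped += t.posInc;
                continue;
            }
            out.add(new Tok(t.term, propagateGaps ? t.posInc + skipped : t.posInc));
            skipped = 0;
        }
        return out;
    }

    // Absolute positions, counting from -1 before the first token.
    static List<Integer> positions(List<Tok> toks) {
        List<Integer> pos = new ArrayList<>();
        int p = -1;
        for (Tok t : toks) {
            p += t.posInc;
            pos.add(p);
        }
        return pos;
    }

    public static void main(String[] args) {
        // "big" at position 0, "the" at 1, "thee" stacked on "the" as a synonym (posInc=0).
        List<Tok> in = List.of(new Tok("big", 1), new Tok("the", 1), new Tok("thee", 0));
        Set<String> stop = Set.of("the");
        System.out.println(positions(removeStopWords(in, stop, true)));  // [0, 1]
        System.out.println(positions(removeStopWords(in, stop, false))); // [0, 0]
    }
}
```

With propagation, "thee" keeps position 1. Without it, "thee" lands on position 0 and now looks like a synonym of "big", which is the "move synonyms to another word" effect mentioned above.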
[jira] [Resolved] (SOLR-834) AnalysisRequestHandler UI
[ https://issues.apache.org/jira/browse/SOLR-834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Hatcher resolved SOLR-834.
Resolution: Won't Fix

There's currently no itch to scratch here, so Won't Fix this issue.

AnalysisRequestHandler UI
Key: SOLR-834
URL: https://issues.apache.org/jira/browse/SOLR-834
Project: Solr
Issue Type: Improvement
Components: web gui
Affects Versions: 1.3
Reporter: Erik Hatcher
Assignee: Erik Hatcher
Priority: Trivial
Attachments: analysis.html

The AnalysisRequestHandler currently requires some command-line pain to use, or lots of XML typing in a stream.body URL parameter. This patch is a simple HTML POST front-end to using stream.body with this request handler. Maybe this should be added to Solr admin?
[jira] [Commented] (LUCENE-4065) FilteringTokenFilter should never corrupt the tokenstream graph
[ https://issues.apache.org/jira/browse/LUCENE-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548662#comment-13548662 ]

Commit Tag Bot commented on LUCENE-4065:

[branch_4x commit] Robert Muir
http://svn.apache.org/viewvc?view=revision&revision=1430944

LUCENE-4065: shitlist these broken ctors so they dont cause false fails
[jira] [Commented] (LUCENE-4667) Change TestRandomChains to replace the list of broken classes by a list of broken constructors
[ https://issues.apache.org/jira/browse/LUCENE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548680#comment-13548680 ]

Uwe Schindler commented on LUCENE-4667:

Thanks Adrien!
Javadoc bug: FuzzyQuery - minimumSimilarity = maxEdits
There’s a bug/typo in the following Lucene Javadoc:

FuzzyQuery
public FuzzyQuery(Term term, int maxEdits, int prefixLength)
Calls FuzzyQuery(term, minimumSimilarity, prefixLength, defaultMaxExpansions, defaultTranspositions).

The reference to “minimumSimilarity” should be “maxEdits”.

-- Jack Krupansky
[jira] [Created] (LUCENE-4670) Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
Adrien Grand created LUCENE-4670:

Summary: Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
Key: LUCENE-4670
URL: https://issues.apache.org/jira/browse/LUCENE-4670
Project: Lucene - Core
Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
Fix For: 4.1

This is especially useful to LUCENE-4599 where actions have to be taken after a doc/field/term has been added.
[jira] [Updated] (LUCENE-4670) Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
[ https://issues.apache.org/jira/browse/LUCENE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand updated LUCENE-4670:
Attachment: LUCENE-4670.patch

Patch, with new assertions in AssertingCodec to make sure that there is a finish call for every start call. Tests passed with -Dtest.codec=Asserting.
[jira] [Commented] (LUCENE-4670) Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
[ https://issues.apache.org/jira/browse/LUCENE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548703#comment-13548703 ]

Robert Muir commented on LUCENE-4670:

Wouldn't this be redundant?
* startDocument(int numVectorFields) -- this tells you how many times startField will be called
* startField(FieldInfo info, int numTerms) -- this tells you how many times startTerm will be called
* startTerm(BytesRef term, int freq) -- this tells you how many times addPosition will be called
[jira] [Commented] (LUCENE-4670) Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
[ https://issues.apache.org/jira/browse/LUCENE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548709#comment-13548709 ]

Adrien Grand commented on LUCENE-4670:

bq. Wouldnt this be redundant?

Yes it is, but I think it can make the format easier to implement and, later, to understand. PostingsConsumer already does it (startDoc(docID, freq) / finishDoc).
[jira] [Commented] (LUCENE-4670) Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
[ https://issues.apache.org/jira/browse/LUCENE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548719#comment-13548719 ]

Adrien Grand commented on LUCENE-4670:

For example, if you want to flush data after every field has been added, today you need to do it both in the finish and startField methods, and in both cases you need to check whether startField had already been called earlier on. By having a finishField method, the modification is in one place and doesn't need an extra condition.
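A minimal sketch of the point made in this comment, under loud assumptions: `FlushPerFieldDemo` and its two nested writers are hypothetical stand-ins, not Lucene's TermVectorsWriter, and a list of strings stands in for whatever a real format would write to disk.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; not Lucene's TermVectorsWriter API.
class FlushPerFieldDemo {
    // Without finishField: the flush for the previous field has to happen in
    // startField and again in finish, each guarded by a "was a field started?" check.
    static class WithoutFinishField {
        final List<String> log = new ArrayList<>();
        private String current; // null until startField has been called

        void startField(String name) {
            if (current != null) {
                log.add("flush " + current);
            }
            current = name;
        }

        void finish() {
            if (current != null) {
                log.add("flush " + current);
                current = null;
            }
        }
    }

    // With finishField: the flush lives in one place and needs no guard.
    static class WithFinishField {
        final List<String> log = new ArrayList<>();
        private String current;

        void startField(String name) { current = name; }

        void finishField() { log.add("flush " + current); }

        void finish() { /* nothing field-related left to do */ }
    }

    public static void main(String[] args) {
        WithoutFinishField a = new WithoutFinishField();
        a.startField("title");
        a.startField("body");
        a.finish();

        WithFinishField b = new WithFinishField();
        b.startField("title");
        b.finishField();
        b.startField("body");
        b.finishField();
        b.finish();

        System.out.println(a.log); // [flush title, flush body]
        System.out.println(b.log); // [flush title, flush body]
    }
}
```

Both writers record the same flushes. The difference is only where the flush logic lives and whether it needs the extra null check.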
[jira] [Commented] (LUCENE-4670) Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
[ https://issues.apache.org/jira/browse/LUCENE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548720#comment-13548720 ]

Robert Muir commented on LUCENE-4670:

Terms/PostingsConsumer doesn't do it in general: e.g. startField doesn't tell you the number of terms, and startTerms doesn't tell you the number of documents, so they must have finish() since they are filtering deleted docs on the fly.

For the per-document APIs (Stored Fields, Term Vectors), we instead give you this number totally up-front (as it makes it easier to e.g. write numTerms into your file).

I'm not necessarily opposed to the redundant calls, but it should then also be done with the stored fields API. And I'd like to see if it really simplifies some of our existing impls (SimpleText, Lucene40) as well.

Finally, adding checks to AssertingCodec as a test is a good idea; however, it still leaves our default merge implementation untested because the wrapped codec implements bulk merge.
[jira] [Commented] (LUCENE-4670) Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
[ https://issues.apache.org/jira/browse/LUCENE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548725#comment-13548725 ]

Robert Muir commented on LUCENE-4670:

{quote}
For example, if you want to flush data after every field has been added, today you need to do it both in the finish and startField methods, and in both cases you need to check whether startField had already been called earlier on. By having a finishField method, the modification is in one place and doesn't need an extra condition.
{quote}

Yeah, I think this currently makes Lucene40's impl confusing too: check out its startField. If we can simplify that one too, I'm completely sold.

I still feel like we should do this to the stored fields API too, though, for consistency.
[jira] [Commented] (LUCENE-4670) Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
[ https://issues.apache.org/jira/browse/LUCENE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548737#comment-13548737 ]

Robert Muir commented on LUCENE-4670:

And by "if we can simplify" for the 4.0 codec, I don't necessarily mean we change the code. It's good enough to stare at it and be able to tell if it would be simpler :)
[jira] [Commented] (LUCENE-4670) Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
[ https://issues.apache.org/jira/browse/LUCENE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548755#comment-13548755 ]

Adrien Grand commented on LUCENE-4670:

bq. I still feel like we should do this to the stored fields api too though for consistency.

Agreed.

bq. Its good enough for to stare at it and be able to tell if it would be simpler

I think it would? Things like

{code}
if (fieldCount == numVectorFields) {
  // last field of the document
  // this is crazy because the file format is crazy!
  for (int i = 1; i < fieldCount; i++) {
    tvd.writeVLong(fps[i] - fps[i-1]);
  }
}
{code}

in startField could become

{code}
public void finishDocument() throws IOException {
  // last field of the document
  // this is crazy because the file format is crazy!
  for (int i = 1; i < fieldCount; i++) {
    tvd.writeVLong(fps[i] - fps[i-1]);
  }
}
{code}

It would help simplify Lucene41StoredFieldsFormat too.

bq. it still leaves our default merge implementation untested because the wrapped codec implements bulk merge.

Do you have an idea how to test it?
[jira] [Updated] (LUCENE-4670) Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
[ https://issues.apache.org/jira/browse/LUCENE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-4670:
Attachment: LUCENE-4670.patch

I think this looks better? I can go either way on whether or not the finish() methods should be abstract though (versus having a no-op impl).
[jira] [Commented] (LUCENE-4670) Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
[ https://issues.apache.org/jira/browse/LUCENE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548759#comment-13548759 ]

Robert Muir commented on LUCENE-4670:

{quote}
Do you have an idea how to test it?
{quote}

I think a very simple solution would be to have two asserting codecs instead of one:
1. Asserting(Lucene41), like we have today
2. Asserting(SimpleText)

It's already a FilterCodec, and this way we test the unoptimized default merge impls everywhere for sure, since SimpleText never has such optimizations.
[jira] [Commented] (LUCENE-4670) Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
[ https://issues.apache.org/jira/browse/LUCENE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548765#comment-13548765 ]

Adrien Grand commented on LUCENE-4670:

bq. i think this looks better?

Yes it does!

bq. I can go either way on whether or not the finish() methods should be abstract though (versus having a no-op impl).

A no-op impl would make the change backward compatible?
[jira] [Commented] (LUCENE-4670) Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
[ https://issues.apache.org/jira/browse/LUCENE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548767#comment-13548767 ]

Robert Muir commented on LUCENE-4670:

I think for this case we should not consider API backwards compatibility (the codec API is experimental). It's far better to have the best APIs possible.

It just seems confusing for finish(FieldInfos, int) to be abstract and the others not. On the other hand, adding more abstract methods makes the API more overwhelming...
[jira] [Commented] (LUCENE-4670) Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
[ https://issues.apache.org/jira/browse/LUCENE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548777#comment-13548777 ]

Robert Muir commented on LUCENE-4670:

{quote}
I think a very simple solution would be to have two asserting codecs instead of one
1. Asserting(Lucene41) like we have today
2. Asserting(SimpleText)
{quote}

We don't technically need this, actually. Our current bulk merge impls use instanceof, so they should never happen when we use the Asserting codec, right?

If this is really true, then non-bulk merges are already tested a lot better than I thought :)
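The instanceof argument can be sketched in isolation. Everything below is hypothetical (`BulkMergeDemo`, `NativeReader`, `AssertingReader`, `mergeStrategy`) and only shows the general pattern the comment relies on; it makes no claim about what Lucene's merge code actually does.

```java
// Hypothetical classes illustrating the instanceof pattern; not Lucene code.
class BulkMergeDemo {
    interface Reader {
        String read(int doc);
    }

    // A "native" reader whose data could be bulk-copied by an optimized merge.
    static class NativeReader implements Reader {
        public String read(int doc) { return "doc" + doc; }
    }

    // A checking wrapper, in the spirit of an asserting codec.
    static class AssertingReader implements Reader {
        private final Reader in;

        AssertingReader(Reader in) { this.in = in; }

        public String read(int doc) {
            String v = in.read(doc);
            if (v == null) {
                throw new AssertionError("null document");
            }
            return v;
        }
    }

    // Picks the optimized path only when the concrete type is recognized.
    static String mergeStrategy(Reader r) {
        return (r instanceof NativeReader) ? "bulk" : "doc-by-doc";
    }

    public static void main(String[] args) {
        System.out.println(mergeStrategy(new NativeReader()));                      // bulk
        System.out.println(mergeStrategy(new AssertingReader(new NativeReader()))); // doc-by-doc
    }
}
```

Because the wrapper is a different concrete type, the type check fails and the slower general path runs. That is why wrapping would exercise the default merge implementation, if the real code is structured this way.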
[jira] [Created] (LUCENE-4671) CharsRef.subSequence broken
Tim Smith created LUCENE-4671:

Summary: CharsRef.subSequence broken
Key: LUCENE-4671
URL: https://issues.apache.org/jira/browse/LUCENE-4671
Project: Lucene - Core
Issue Type: Bug
Reporter: Tim Smith

Looks like CharsRef.subSequence() is currently broken.

It is implemented as:

{code}
@Override
public CharSequence subSequence(int start, int end) {
  // NOTE: must do a real check here to meet the specs of CharSequence
  if (start < 0 || end > length || start > end) {
    throw new IndexOutOfBoundsException();
  }
  return new CharsRef(chars, offset + start, offset + end);
}
{code}

Since the CharsRef constructor is (char[] chars, int offset, int length), it should be:

{code}
@Override
public CharSequence subSequence(int start, int end) {
  // NOTE: must do a real check here to meet the specs of CharSequence
  if (start < 0 || end > length || start > end) {
    throw new IndexOutOfBoundsException();
  }
  return new CharsRef(chars, offset + start, end - start);
}
{code}
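The effect of the reported bug can be reproduced with a small stand-in class. `MiniCharsRef` below is an assumption made for illustration: it copies only the (char[] chars, int offset, int length) constructor shape quoted in the issue and is not Lucene's CharsRef. `brokenSubSequence` is a hypothetical method name that holds the reported version side by side with the proposed fix.

```java
// Minimal stand-in for the class described in the issue; not Lucene's CharsRef.
class MiniCharsRef implements CharSequence {
    final char[] chars;
    final int offset;
    final int length;

    MiniCharsRef(char[] chars, int offset, int length) {
        this.chars = chars;
        this.offset = offset;
        this.length = length;
    }

    @Override
    public int length() { return length; }

    @Override
    public char charAt(int index) { return chars[offset + index]; }

    // Proposed fix: the third constructor argument is a length, so pass end - start.
    @Override
    public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > length || start > end) {
            throw new IndexOutOfBoundsException();
        }
        return new MiniCharsRef(chars, offset + start, end - start);
    }

    // Reported version: passes an end index where a length is expected.
    CharSequence brokenSubSequence(int start, int end) {
        if (start < 0 || end > length || start > end) {
            throw new IndexOutOfBoundsException();
        }
        return new MiniCharsRef(chars, offset + start, offset + end);
    }

    @Override
    public String toString() { return new String(chars, offset, length); }

    public static void main(String[] args) {
        MiniCharsRef ref = new MiniCharsRef("hello world".toCharArray(), 0, 11);
        System.out.println(ref.subSequence(2, 5));                // llo
        System.out.println(ref.brokenSubSequence(2, 5).length()); // 5, although 3 chars were requested
    }
}
```

With a zero offset the broken version merely returns too many characters. With a non-zero offset the computed length can run past the end of the backing array.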
[jira] [Commented] (LUCENE-4671) CharsRef.subSequence broken
[ https://issues.apache.org/jira/browse/LUCENE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548803#comment-13548803 ]

Robert Muir commented on LUCENE-4671:

+1, this definitely looks wrong
[jira] [Updated] (LUCENE-4672) Add AssertingStoredFieldsFormat
[ https://issues.apache.org/jira/browse/LUCENE-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4672: Attachment: LUCENE-4672.patch Add AssertingStoredFieldsFormat --- Key: LUCENE-4672 URL: https://issues.apache.org/jira/browse/LUCENE-4672 Project: Lucene - Core Issue Type: Test Reporter: Robert Muir Attachments: LUCENE-4672.patch just spun off from LUCENE-4670: this would be a good one to add as well. eventually we should probably try to cover the entire index. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (LUCENE-4671) CharsRef.subSequence broken
[ https://issues.apache.org/jira/browse/LUCENE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir reassigned LUCENE-4671: --- Assignee: Robert Muir
[jira] [Updated] (LUCENE-4671) CharsRef.subSequence broken
[ https://issues.apache.org/jira/browse/LUCENE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4671: Attachment: LUCENE-4671.patch patch
Re: Difference between softCommit and hardCommit?
Hi Mark, I have a question about soft commit. My current setup is the same as example C of the SolrCloud wiki, with embedded ZooKeeper running on 9983 only. numshard=2 Server1: 2 Solr instances running on 8983 (leader) and 8900 (replica). Server2: 2 Solr instances running on 7574 (leader) and 7500 (replica). I added the file books.json to the 8983 server with commit=false: http://localhost:8983/solr/update/json?stream.file=/SOLR-4/apache-solr-4.0.0/apache-solr-4.0.0/example/exampledocs/books.json&stream.contentType=application/json&commit=false Then I did a soft commit as follows: curl 'http://server1:8983/solr/update?softCommit=true' Now I am getting a strange outcome: when I run the following queries multiple times, I get a different number of documents each time. What could be the reason for this? curl 'http://server1:8983/solr/collection1/select?q=*:*' curl 'http://server2:7574/solr/collection1/select?q=*:*' There are four documents in books.json. Running these queries again and again, I sometimes get 1, 3, or 4 as the number of docs. Why is the result not consistent? Ideally it should return 4, as all nodes are up and healthy (i.e. green). Also, it would be great if you could explain the soft commit concept: it makes a newly added document visible to queries immediately. But is there some difference if we do the soft commit on a replica rather than on the leader? And what if all servers go down and then come back up: is this soft commit lost or not? Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/Difference-between-softCommit-and-hardCommit-tp4020425p4031865.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4671) CharsRef.subSequence broken
[ https://issues.apache.org/jira/browse/LUCENE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548841#comment-13548841 ] Tim Smith commented on LUCENE-4671: --- looks like the index out of bounds check is a bit off too (if someone ever uses non-zero offsets) check should probably be: {code} if (start < offset || end > (offset + length) || start > end) { throw new IndexOutOfBoundsException(); } {code}
[jira] [Issue Comment Deleted] (LUCENE-4671) CharsRef.subSequence broken
[ https://issues.apache.org/jira/browse/LUCENE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Smith updated LUCENE-4671: -- Comment: was deleted (was: looks like the index out of bounds check is a bit off too (if someone ever uses non-zero offsets) check should probably be: {code} if (start < offset || end > (offset + length) || start > end) { throw new IndexOutOfBoundsException(); } {code})
[jira] [Commented] (LUCENE-4671) CharsRef.subSequence broken
[ https://issues.apache.org/jira/browse/LUCENE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548843#comment-13548843 ] Tim Smith commented on LUCENE-4671: --- looks good
[jira] [Commented] (LUCENE-4671) CharsRef.subSequence broken
[ https://issues.apache.org/jira/browse/LUCENE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548846#comment-13548846 ] Robert Muir commented on LUCENE-4671: - I think the out of bounds is correct (should not use offset, only the length) {noformat} Throws: IndexOutOfBoundsException - if start or end are negative, if end is greater than length(), or if start is greater than end {noformat}
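The point about checking against the length only, never the offset, can be seen with any CharSequence that views part of a larger array. The sketch below uses the JDK's java.nio.CharBuffer as such a view; the class name `SubSequenceBounds` and the `rejects` helper are made up for this illustration and have nothing to do with Lucene.

```java
import java.nio.CharBuffer;

final class SubSequenceBounds {
  // Returns true if seq.subSequence(start, end) is rejected with IndexOutOfBoundsException.
  static boolean rejects(CharSequence seq, int start, int end) {
    try {
      seq.subSequence(start, end);
      return false;
    } catch (IndexOutOfBoundsException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    char[] backing = "abcdefgh".toCharArray();
    // A view over four chars of the backing array starting at index 2, i.e. "cdef".
    CharSequence view = CharBuffer.wrap(backing, 2, 4);

    System.out.println(view.length());          // length of the view, not of the array
    System.out.println(view.subSequence(0, 2)); // indices are relative to the view
    System.out.println(rejects(view, 0, 5));    // end greater than length()
    System.out.println(rejects(view, 3, 2));    // start greater than end
    System.out.println(rejects(view, -1, 2));   // negative start
  }
}
```

Callers never see the offset into the backing array, so a bounds check written in terms of `offset` would reject valid requests such as `subSequence(0, 2)` on a view with a non-zero offset.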
[jira] [Created] (LUCENE-4673) TermQuery.toString() doesn't play nicely with whitespace
Itamar Syn-Hershko created LUCENE-4673: -- Summary: TermQuery.toString() doesn't play nicely with whitespace Key: LUCENE-4673 URL: https://issues.apache.org/jira/browse/LUCENE-4673 Project: Lucene - Core Issue Type: Bug Components: core/search Affects Versions: 3.6.2, 4.0-BETA, 4.1 Reporter: Itamar Syn-Hershko A TermQuery where term.text() contains whitespace outputs an incorrect string representation: field:foo bar instead of field:"foo bar". A correct representation is one that could be parsed again to the correct Query object (using the correct analyzer, yes, but still). This may not be so critical, but in our system we use Lucene's QP to parse and then pre-process and optimize user queries. To do that we use Query.toString on some clauses to rebuild the query string. This can be easily resolved by always adding quote marks before and after the term text in TermQuery.toString. Testing to see if they are required or not is too much work and TermQuery is ignorant of quote marks anyway. Other scenarios that could benefit from this change are places where escaped characters are used, such as URLs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4671) CharsRef.subSequence broken
[ https://issues.apache.org/jira/browse/LUCENE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548848#comment-13548848 ] Tim Smith commented on LUCENE-4671: --- it is, that's why i deleted the comment, just looked wrong to me for a moment
[jira] [Updated] (LUCENE-4671) CharsRef.subSequence broken
[ https://issues.apache.org/jira/browse/LUCENE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4671: Fix Version/s: 3.6.3 5.0 4.1
[jira] [Commented] (LUCENE-4672) Add AssertingStoredFieldsFormat
[ https://issues.apache.org/jira/browse/LUCENE-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548852#comment-13548852 ] Adrien Grand commented on LUCENE-4672: -- +1
[jira] [Commented] (LUCENE-4673) TermQuery.toString() doesn't play nicely with whitespace
[ https://issues.apache.org/jira/browse/LUCENE-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548854#comment-13548854 ] Robert Muir commented on LUCENE-4673: - quotes have a particular meaning to the queryparser so adding quotes would change things. In general queries' toString is just to be user-readable, not re-parsable. They don't escape syntax character and so on.
[jira] [Commented] (LUCENE-4671) CharsRef.subSequence broken
[ https://issues.apache.org/jira/browse/LUCENE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548856#comment-13548856 ] Robert Muir commented on LUCENE-4671: - Tim: ah sorry I see. was a race condition with jira :)
Re: svn commit: r1186 [1/3] - /release/lucene/KEYS
sarowe: i see this is related to INFRA-5739 and LUCENE-4134, but where did this KEYS file come from? It doesn't seem to match either of the primary copies we currently have in the dist dir (which are in sync)... https://www.apache.org/dist/lucene/java/KEYS https://www.apache.org/dist/lucene/solr/KEYS ...nor does it match the out of date top-level KEYS file that i thought was also in sync (but clearly is not)... https://www.apache.org/dist/lucene/KEYS : Date: Wed, 09 Jan 2013 03:05:22 - : From: sar...@apache.org : Reply-To: dev@lucene.apache.org : To: comm...@lucene.apache.org : Subject: svn commit: r1186 [1/3] - /release/lucene/KEYS : : Author: sarowe : Date: Wed Jan 9 03:05:20 2013 : New Revision: 1186 : : Log: : Added KEYS : : Added: : release/lucene/KEYS (with props) : : -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4670) Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
[ https://issues.apache.org/jira/browse/LUCENE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548859#comment-13548859 ] Adrien Grand commented on LUCENE-4670: -- bq. I think for this case we should not consider API backwards compatibility (The codec api is experimental). bq. On the other hand adding more abstract methods makes the API more overwhelming... I'm fine with both options. bq. We don't technically need this actually. Our current bulk merge impls use instanceof, so they should never happen when we use the Asserting codec right? Right. Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier -- Key: LUCENE-4670 URL: https://issues.apache.org/jira/browse/LUCENE-4670 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 4.1 Attachments: LUCENE-4670.patch, LUCENE-4670.patch This is especially useful to LUCENE-4599 where actions have to be taken after a doc/field/term has been added. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
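The trade-off discussed above (abstract methods versus something lighter) can be sketched roughly as follows. Everything here is hypothetical and heavily simplified: `SketchVectorsWriter` and `RecordingWriter` are invented names, the method signatures are not those of Lucene's TermVectorsWriter, and the no-op defaults show only one of the two options mentioned in the comment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified writer; NOT Lucene's TermVectorsWriter API.
abstract class SketchVectorsWriter {
  abstract void startDocument();
  abstract void startField(String name);
  abstract void startTerm(String term);

  // No-op defaults: existing subclasses keep compiling, new formats override what they need.
  void finishTerm() {}
  void finishField() {}
  void finishDocument() {}
}

// Example format that records when each unit starts and when it is complete.
final class RecordingWriter extends SketchVectorsWriter {
  final List<String> events = new ArrayList<>();

  @Override void startDocument() { events.add("startDoc"); }
  @Override void startField(String name) { events.add("startField:" + name); }
  @Override void startTerm(String term) { events.add("startTerm:" + term); }
  @Override void finishTerm() { events.add("finishTerm"); }
  @Override void finishField() { events.add("finishField"); }
  @Override void finishDocument() { events.add("finishDoc"); }

  public static void main(String[] args) {
    RecordingWriter w = new RecordingWriter();
    w.startDocument();
    w.startField("body");
    w.startTerm("lucene");
    w.finishTerm();
    w.finishField();
    w.finishDocument();
    System.out.println(w.events);
  }
}
```

A format that buffers data per document could flush in `finishDocument()` instead of guessing that a document has ended when the next one starts, which seems to be the kind of "action after a doc/field/term has been added" the issue has in mind.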
Re: svn commit: r1186 [1/3] - /release/lucene/KEYS
For KEYS I have been using http://people.apache.org/keys/group/lucene.asc myself. It's my understanding this is the most recent version of the file. Those static copies everywhere else are... just annoying. On Wed, Jan 9, 2013 at 2:20 PM, Chris Hostetter hossman_luc...@fucit.org wrote: sarowe: i see this is related to INFRA-5739 and LUCENE-4134, but where did this KEYS file come from? It doesn't seem to match either of the primary copies we currently have in the dist dir (which are in sync)... https://www.apache.org/dist/lucene/java/KEYS https://www.apache.org/dist/lucene/solr/KEYS ...nor does it match the out of date top-level KEYS file that i thought was also in sync (but clearly is not)... https://www.apache.org/dist/lucene/KEYS : Date: Wed, 09 Jan 2013 03:05:22 - : From: sar...@apache.org : Reply-To: dev@lucene.apache.org : To: comm...@lucene.apache.org : Subject: svn commit: r1186 [1/3] - /release/lucene/KEYS : : Author: sarowe : Date: Wed Jan 9 03:05:20 2013 : New Revision: 1186 : : Log: : Added KEYS : : Added: : release/lucene/KEYS (with props) : : -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1186 [1/3] - /release/lucene/KEYS
I went and looked at http://wiki.apache.org/lucene-java/ReleaseTodo#Building_the_Release_artifacts and copied what appeared in my web browser when I went to http://people.apache.org/keys/group/lucene.asc. I was guessing that this thing is auto-generated, but maybe not? Is there some more authoritative version somewhere? Steve On Jan 9, 2013, at 2:20 PM, Chris Hostetter hossman_luc...@fucit.org wrote: sarowe: i see this is related to INFRA-5739 and LUCENE-4134, but where did this KEYS file come from? It doesn't seem to match either of the primary copies we currently have in the dist dir (which are in sync)... https://www.apache.org/dist/lucene/java/KEYS https://www.apache.org/dist/lucene/solr/KEYS ...nor does it match the out of date top-level KEYS file that i thought was also in sync (but clearly is not)... https://www.apache.org/dist/lucene/KEYS : Date: Wed, 09 Jan 2013 03:05:22 - : From: sar...@apache.org : Reply-To: dev@lucene.apache.org : To: comm...@lucene.apache.org : Subject: svn commit: r1186 [1/3] - /release/lucene/KEYS : : Author: sarowe : Date: Wed Jan 9 03:05:20 2013 : New Revision: 1186 : : Log: : Added KEYS : : Added: : release/lucene/KEYS (with props) : : -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1186 [1/3] - /release/lucene/KEYS
On Wed, Jan 9, 2013 at 2:23 PM, Steve Rowe sar...@gmail.com wrote: I went and looked at http://wiki.apache.org/lucene-java/ReleaseTodo#Building_the_Release_artifacts and copied what appeared in my web browser when I went to http://people.apache.org/keys/group/lucene.asc. I was guessing that this thing is auto-generated, but maybe not? Is there some more authoritative version somewhere? It's the best I think, at least according to the way it's described in http://people.apache.org/keys/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1186 [1/3] - /release/lucene/KEYS
: I went and looked at http://wiki.apache.org/lucene-java/ReleaseTodo#Building_the_Release_artifacts and copied what appeared in my web browser when I went to http://people.apache.org/keys/group/lucene.asc. : : I was guessing that this thing is auto-generated, but maybe not? Is there some more authoritative version somewhere? Yes, yes ... i'd forgotten about keys being kept in LDAP now... https://people.apache.org/keys/ We should probably take this dist migration time as an opportunity to eliminate those static copies and make them redirect to that new canonical URL ... but in that case i'm not sure I understand why Gavin asked you to copy KEYS into the new repo. -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4673) TermQuery.toString() doesn't play nicely with whitespace
[ https://issues.apache.org/jira/browse/LUCENE-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548874#comment-13548874 ] Itamar Syn-Hershko commented on LUCENE-4673: I figured as much, yet we would definitely like to have this behavior built in. Are there any plans on making such an interface to perform a proper Query -> String conversion?
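Until something like that exists, the quoting can be done on the caller's side. The following is only a sketch: `TermText.quoted` is a hypothetical helper, not part of Lucene and not what Query.toString() does, and it assumes that backslash-escaping `\` and `"` inside the quotes is enough for the query parser in use. As noted earlier in the thread, quotes carry their own meaning for the parser, so the re-parsed result depends on the analyzer and may not be the same TermQuery.

```java
// Hypothetical caller-side helper; not part of Lucene and not what Query.toString() does.
final class TermText {
  // Renders field:"text", backslash-escaping embedded backslashes and double quotes.
  static String quoted(String field, String text) {
    StringBuilder sb = new StringBuilder(field).append(":\"");
    for (int i = 0; i < text.length(); i++) {
      char c = text.charAt(i);
      if (c == '\\' || c == '"') {
        sb.append('\\');
      }
      sb.append(c);
    }
    return sb.append('"').toString();
  }

  public static void main(String[] args) {
    System.out.println(quoted("field", "foo bar")); // prints field:"foo bar"
  }
}
```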
[jira] [Commented] (LUCENE-4671) CharsRef.subSequence broken
[ https://issues.apache.org/jira/browse/LUCENE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548878#comment-13548878 ] Commit Tag Bot commented on LUCENE-4671: [trunk commit] Robert Muir http://svn.apache.org/viewvc?view=revision&revision=1431019 LUCENE-4671: Fix CharsRef.subSequence method
[jira] [Commented] (LUCENE-4672) Add AssertingStoredFieldsFormat
[ https://issues.apache.org/jira/browse/LUCENE-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548880#comment-13548880 ] Commit Tag Bot commented on LUCENE-4672: [trunk commit] Robert Muir http://svn.apache.org/viewvc?view=revision&revision=1431024 LUCENE-4672: add AssertingStoredFieldsFormat Add AssertingStoredFieldsFormat --- Key: LUCENE-4672 URL: https://issues.apache.org/jira/browse/LUCENE-4672 Project: Lucene - Core Issue Type: Test Reporter: Robert Muir Attachments: LUCENE-4672.patch just spun off from LUCENE-4670: this would be a good one to add as well. eventually we should probably try to cover the entire index.
[jira] [Commented] (LUCENE-4673) TermQuery.toString() doesn't play nicely with whitespace
[ https://issues.apache.org/jira/browse/LUCENE-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548881#comment-13548881 ] Hoss Man commented on LUCENE-4673: -- I really don't understand why we completely eliminated the helpful javadoc that tried to explain this to people back in the day (circa Lucene 2.9)...
{noformat}
/** Prints a query to a string, with <code>field</code> assumed to be the
 * default field and omitted.
 * <p>The representation used is one that is supposed to be readable
 * by {@link org.apache.lucene.queryParser.QueryParser QueryParser}. However,
 * there are the following limitations:
 * <ul>
 *  <li>If the query was created by the parser, the printed
 *  representation may not be exactly what was parsed. For example,
 *  characters that need to be escaped will be represented without
 *  the required backslash.</li>
 *  <li>Some of the more complicated queries (e.g. span queries)
 *  don't have a representation that can be parsed by QueryParser.</li>
 * </ul>
 */
public abstract String toString(String field);
{noformat}
...it wasn't perfect, but it could have been improved instead of deleted...
{noformat}
Prints a query to a string, with <code>field</code> assumed to be the
default field and omitted.
<p>The String representation generated is visually similar to that parsed by
{@link org.apache.lucene.queryParser.QueryParser QueryParser} for convenience,
however it is not guaranteed to produce an identical query object when parsed,
since not all permutations of Query objects can be created by the QueryParser.
{noformat}
TermQuery.toString() doesn't play nicely with whitespace Key: LUCENE-4673 URL: https://issues.apache.org/jira/browse/LUCENE-4673 Project: Lucene - Core Issue Type: Bug Components: core/search Affects Versions: 4.0-BETA, 4.1, 3.6.2 Reporter: Itamar Syn-Hershko
[jira] [Commented] (LUCENE-3178) Native MMapDir
[ https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548885#comment-13548885 ] Greg Bowyer commented on LUCENE-3178: - Frustrating. It echoes what I have been seeing, so at least my benchmarking is not playing me up. I guess I will have to do some digging. Native MMapDir -- Key: LUCENE-3178 URL: https://issues.apache.org/jira/browse/LUCENE-3178 Project: Lucene - Core Issue Type: Improvement Components: core/store Reporter: Michael McCandless Labels: gsoc2012, lucene-gsoc-12 Attachments: LUCENE-3178-Native-MMap-implementation.patch, LUCENE-3178-Native-MMap-implementation.patch, LUCENE-3178-Native-MMap-implementation.patch Spinoff from LUCENE-2793. Just like we will create native Dir impl (UnixDirectory) to pass the right OS level IO flags depending on the IOContext, we could in theory do something similar with MMapDir. The problem is MMap is apparently quite hairy... and to pass the flags the native code would need to invoke mmap (I think?), unlike UnixDir where the code only has to open the file handle.
[jira] [Commented] (SOLR-4134) Cannot set multiple values into multivalued field with partial updates when using the standard RequestWriter.
[ https://issues.apache.org/jira/browse/SOLR-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548898#comment-13548898 ] Erik Hatcher commented on SOLR-4134: Shalin - can you add a CHANGES entry for this one too so it's easier for others that experience this issue with 4.0 to see that it got fixed in 4.1+? Thanks! Cannot set multiple values into multivalued field with partial updates when using the standard RequestWriter. --- Key: SOLR-4134 URL: https://issues.apache.org/jira/browse/SOLR-4134 Project: Solr Issue Type: Bug Components: clients - java, update Affects Versions: 4.0 Reporter: Will Butler Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 4.1 Attachments: SOLR-4134-nolist-fix.patch, SOLR-4134.patch, SOLR-4134.patch I would like to set multiple values into a field using partial updates like so:
{code}
List<String> values = new ArrayList<String>();
values.add("one");
values.add("two");
values.add("three");
doc.setField("field", singletonMap("set", values));
{code}
When using the standard XML-based RequestWriter, you end up with a single value that looks like [one, two, three], because of the toString() calls on lines 130 and 132 of ClientUtils. It works properly when using the BinaryRequestWriter.
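The symptom described above is easy to reproduce without Solr, because it comes straight from `List.toString()`. The sketch below is plain Java and makes several assumptions: `MultiValueXml` and both of its methods are hypothetical names, the XML shape is simplified, values are not XML-escaped, and the per-value variant only illustrates one way a writer could avoid the collapse. It is not the SOLR-4134 patch.

```java
import java.util.List;

// Hypothetical illustration; not SolrJ code and not the committed fix.
final class MultiValueXml {
    private MultiValueXml() {}

    /** Mimics the bug: the whole list is stringified into one field value. */
    static String collapsed(String name, List<String> values) {
        return "<field name=\"" + name + "\" update=\"set\">" + values + "</field>";
    }

    /** Assumed alternative: emit one field element per value (no XML escaping here). */
    static String perValue(String name, List<String> values) {
        StringBuilder sb = new StringBuilder();
        for (String v : values) {
            sb.append("<field name=\"").append(name)
              .append("\" update=\"set\">").append(v).append("</field>");
        }
        return sb.toString();
    }
}
```

Calling `collapsed` with the three values from the report produces a single element whose text is `[one, two, three]`, which matches the stored value the reporter saw.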
[jira] [Assigned] (SOLR-4286) Atomic Updates on multi-valued fields giving unexpected results
[ https://issues.apache.org/jira/browse/SOLR-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man reassigned SOLR-4286: -- Assignee: Shalin Shekhar Mangar shalin mentioned on irc that he'd take a look at this Atomic Updates on multi-valued fields giving unexpected results --- Key: SOLR-4286 URL: https://issues.apache.org/jira/browse/SOLR-4286 Project: Solr Issue Type: Bug Components: update Affects Versions: 4.0 Environment: Windows 7 64-bit Reporter: Abhinav Shah Assignee: Shalin Shekhar Mangar I am using apache-solr 4.0. I am trying to post the following document -
{code}
curl http://irvis016:8983/solr/collection1/update?commit=true -H "Content-Type: text/xml" --data-binary '<add commitWithin="5000"><doc boost="1.0">
<field name="accessionNumber" update="set">3165297</field>
<field name="status" update="set">ORDERED</field>
<field name="account.accountName" update="set">US LABS DEMO ACCOUNT</field>
<field name="account.addresses.address1" update="set">2601 Campus Drive</field>
<field name="account.addresses.city" update="set">Irvine</field>
<field name="account.addresses.state" update="set">CA</field>
<field name="account.addresses.zip" update="set">92622</field>
<field name="account.externalIds.sourceSystem" update="set">10442</field>
<field name="orderingPhysician.lcProviderNumber" update="set">60086</field>
<field name="patient.lpid" update="set">5571351625769103</field>
<field name="patient.patientName.lastName" update="set">test</field>
<field name="patient.patientName.firstName" update="set">test123</field>
<field name="patient.patientSSN" update="set">643522342</field>
<field name="patient.patientDOB" update="set">1979-11-11T08:00:00.000Z</field>
<field name="patient.mrNs.mrn" update="set">5423</field>
<field name="specimens.specimenType" update="set">Bone Marrow</field>
<field name="specimens.specimenType" update="set">Nerve tissue</field>
<field name="UID">3165297USLABS2012</field>
</doc></add>'
{code}
This document gets successfully posted. However, the multi-valued field 'specimens.specimenType' gets stored as follows in Solr -
{code}
<arr name="specimens.specimenType">
  <str>{set=Bone Marrow}</str>
  <str>{set=Nerve tissue}</str>
</arr>
{code}
I did not expect {set= to be stored along with the text Bone Marrow. My Solr schema xml definition for the field specimens.SpecimenType is -
{code}
<field indexed="true" multiValued="true" name="specimens.specimenType" omitNorms="false" omitPositions="true" omitTermFreqAndPositions="true" stored="true" termVectors="false" type="text_en"/>
{code}
Can someone help?
[jira] [Resolved] (LUCENE-4672) Add AssertingStoredFieldsFormat
[ https://issues.apache.org/jira/browse/LUCENE-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-4672. - Resolution: Fixed Fix Version/s: 5.0 4.1 Add AssertingStoredFieldsFormat --- Key: LUCENE-4672 URL: https://issues.apache.org/jira/browse/LUCENE-4672 Project: Lucene - Core Issue Type: Test Reporter: Robert Muir Fix For: 4.1, 5.0 Attachments: LUCENE-4672.patch
[jira] [Commented] (LUCENE-4671) CharsRef.subSequence broken
[ https://issues.apache.org/jira/browse/LUCENE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548905#comment-13548905 ] Commit Tag Bot commented on LUCENE-4671: [branch_4x commit] Robert Muir http://svn.apache.org/viewvc?view=revision&revision=1431029 LUCENE-4671: Fix CharsRef.subSequence method CharsRef.subSequence broken --- Key: LUCENE-4671 URL: https://issues.apache.org/jira/browse/LUCENE-4671 Project: Lucene - Core Issue Type: Bug Reporter: Tim Smith Assignee: Robert Muir Fix For: 4.1, 5.0, 3.6.3 Attachments: LUCENE-4671.patch
[jira] [Commented] (LUCENE-4670) Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier
[ https://issues.apache.org/jira/browse/LUCENE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548909#comment-13548909 ] Robert Muir commented on LUCENE-4670: - OK... maybe for now just go forward with the no-op impls? It's redundant anyway, so if you don't notice that these extra hooks are available it won't hurt you, unless you are doing some kind of wrapping or so on (we should look for that just in case). I added the AssertingStoredFieldsFormat, so we should be able to also move the 'assert fieldCount == 0' to its finishDocument() Add TermVectorsWriter.finish{Doc,Field,Term} to make development of new formats easier -- Key: LUCENE-4670 URL: https://issues.apache.org/jira/browse/LUCENE-4670 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 4.1 Attachments: LUCENE-4670.patch, LUCENE-4670.patch This is especially useful to LUCENE-4599 where actions have to be taken after a doc/field/term has been added.
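The "no-op impls" idea discussed above can be sketched roughly as follows. All names here are hypothetical: `SketchVectorsWriter` and `CheckingWriter` only mimic the shape of the proposal (optional finish hooks with empty default bodies, plus a check moved into finishDocument()), and they are not the Lucene TermVectorsWriter API or the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: the finish hooks default to no-ops, so existing
// subclasses that ignore them keep working unchanged.
abstract class SketchVectorsWriter {
    abstract void startDocument(int numFields);
    abstract void startField(String name);
    abstract void startTerm(String term);

    void finishTerm() {}
    void finishField() {}
    void finishDocument() {}
}

// Hypothetical subclass that uses finishDocument() for a consistency check,
// similar in spirit to moving an 'assert fieldCount == 0' into that hook.
final class CheckingWriter extends SketchVectorsWriter {
    final List<String> events = new ArrayList<>();
    private int pendingFields;

    @Override
    void startDocument(int numFields) {
        pendingFields = numFields;
        events.add("startDoc");
    }

    @Override
    void startField(String name) {
        pendingFields--;
        events.add("startField:" + name);
    }

    @Override
    void startTerm(String term) {
        events.add("term:" + term);
    }

    @Override
    void finishDocument() {
        if (pendingFields != 0) {
            throw new IllegalStateException("fields still pending: " + pendingFields);
        }
        events.add("finishDoc");
    }
}
```

A caller drives the writer as startDocument, startField, startTerm, finishTerm, finishField, finishDocument; only the overridden hook does any work, which is why the extra calls are harmless for formats that do not need them.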
[jira] [Commented] (SOLR-4286) Atomic Updates on multi-valued fields giving unexpected results
[ https://issues.apache.org/jira/browse/SOLR-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548911#comment-13548911 ] Abhinav Shah commented on SOLR-4286: SOLR-4134 mentions that it works with BinaryRequestWriter. However, I am using the SolrJ API, which in turn uses BinaryRequestWriter. So I was expecting it to work, but it doesn't. Atomic Updates on multi-valued fields giving unexpected results --- Key: SOLR-4286 URL: https://issues.apache.org/jira/browse/SOLR-4286 Project: Solr Issue Type: Bug Components: update Affects Versions: 4.0 Environment: Windows 7 64-bit Reporter: Abhinav Shah Assignee: Shalin Shekhar Mangar
[jira] [Resolved] (SOLR-1947) Switch links to JSPs on /admin to their handler equivilents
[ https://issues.apache.org/jira/browse/SOLR-1947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1947. Resolution: Fixed Fix Version/s: 4.0 Assignee: Stefan Matheis (steffkes) this was fixed in 4.0 by completely eliminating the JSPs and making the entire admin UI use request handlers Switch links to JSPs on /admin to their handler equivilents --- Key: SOLR-1947 URL: https://issues.apache.org/jira/browse/SOLR-1947 Project: Solr Issue Type: Improvement Components: web gui Reporter: Hoss Man Assignee: Stefan Matheis (steffkes) Fix For: 4.0 Attachments: SOLR-1947.patch We should update all links on the admin screen to point at the RequestHandler equivalents where they exist.
[jira] [Resolved] (LUCENE-4671) CharsRef.subSequence broken
[ https://issues.apache.org/jira/browse/LUCENE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-4671. - Resolution: Fixed Thank you Tim! CharsRef.subSequence broken --- Key: LUCENE-4671 URL: https://issues.apache.org/jira/browse/LUCENE-4671 Project: Lucene - Core Issue Type: Bug Reporter: Tim Smith Assignee: Robert Muir Fix For: 4.1, 5.0, 3.6.3 Attachments: LUCENE-4671.patch
[jira] [Created] (SOLR-4289) Admin UI - JVM memory bar - dark grey used width is too small
Shawn Heisey created SOLR-4289: -- Summary: Admin UI - JVM memory bar - dark grey used width is too small Key: SOLR-4289 URL: https://issues.apache.org/jira/browse/SOLR-4289 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 4.0 Reporter: Shawn Heisey Fix For: 4.1, 5.0 Attachments: screenshot-1.jpg See attached screenshot. The red line shows approximately where the dark grey section *should* end. On the screenshot, the used memory is shown as 9.3% ... the width of the dark grey section (corresponding to used memory) should be 9.3% of the width of the full bar (max memory), but it appears that it is actually about 9.3% of the width of the mid-grey section (total memory). I am testing a potential patch right now, will attach it if it looks like it's working right.
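The arithmetic behind this report can be written out as a small sketch. It is in Java only for consistency with the rest of this thread (the admin UI itself is not Java), and `MemoryBarWidths` with its method and parameter names is hypothetical; it is not the patch being tested. The assumption is that both the mid-grey (total) and dark grey (used) sections should be scaled against max memory, i.e. the full bar width.

```java
// Hypothetical illustration of the width calculation; not Solr admin UI code.
final class MemoryBarWidths {
    final long totalPx; // mid-grey section: total memory
    final long usedPx;  // dark grey section: used memory

    private MemoryBarWidths(long totalPx, long usedPx) {
        this.totalPx = totalPx;
        this.usedPx = usedPx;
    }

    /** Expected behavior: both sections are fractions of the full bar (max memory). */
    static MemoryBarWidths of(long barPx, long usedBytes, long totalBytes, long maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        return new MemoryBarWidths(barPx * totalBytes / maxBytes, barPx * usedBytes / maxBytes);
    }

    /** Reported symptom: the used fraction is applied to the total section instead of the full bar. */
    static long usedPxScaledAgainstTotal(long barPx, long usedBytes, long totalBytes, long maxBytes) {
        long totalPx = barPx * totalBytes / maxBytes;
        return totalPx * usedBytes / maxBytes;
    }
}
```

For example, with a 1000 px bar, used = 93, total = 500, and max = 1000 (arbitrary units), the expected dark grey width is 93 px, while the symptom described above gives 46 px, roughly half, because the fraction is taken of the 500 px total section.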
[jira] [Updated] (SOLR-4289) Admin UI - JVM memory bar - dark grey used width is too small
[ https://issues.apache.org/jira/browse/SOLR-4289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated SOLR-4289: --- Attachment: screenshot-1.jpg Admin UI - JVM memory bar - dark grey used width is too small --- Key: SOLR-4289 URL: https://issues.apache.org/jira/browse/SOLR-4289 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 4.0 Reporter: Shawn Heisey Fix For: 4.1, 5.0 Attachments: screenshot-1.jpg
[jira] [Updated] (SOLR-4289) Admin UI - JVM memory bar - dark grey used width is too small
[ https://issues.apache.org/jira/browse/SOLR-4289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated SOLR-4289: --- Attachment: (was: screenshot-1.jpg) Admin UI - JVM memory bar - dark grey used width is too small --- Key: SOLR-4289 URL: https://issues.apache.org/jira/browse/SOLR-4289 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 4.0 Reporter: Shawn Heisey Fix For: 4.1, 5.0 Attachments: screenshot-1.jpg
[jira] [Updated] (SOLR-4289) Admin UI - JVM memory bar - dark grey used width is too small
[ https://issues.apache.org/jira/browse/SOLR-4289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated SOLR-4289: --- Attachment: screenshot-1.jpg Admin UI - JVM memory bar - dark grey used width is too small --- Key: SOLR-4289 URL: https://issues.apache.org/jira/browse/SOLR-4289 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 4.0 Reporter: Shawn Heisey Fix For: 4.1, 5.0 Attachments: screenshot-1.jpg