[jira] [Commented] (SOLR-4237) Implement index aliasing
[ https://issues.apache.org/jira/browse/SOLR-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603174#comment-13603174 ]

Mark Miller commented on SOLR-4237:

I couldn't say - it depends on what you intended to implement for index aliases here. I have very specific thoughts about what I intend for collection and shard aliasing.

Implement index aliasing
Key: SOLR-4237
URL: https://issues.apache.org/jira/browse/SOLR-4237
Project: Solr
Issue Type: New Feature
Reporter: Otis Gospodnetic
Fix For: 4.3

This is handy for searching log indices and in all other situations where indices are added (and possibly deleted) over time. Index aliasing allows one to map an arbitrary set of indices to an alias and avoid needing to change the search application to point it to new indices. See http://search-lucene.com/m/YBn4w1UAbEB It may also be worth thinking about using aliases when indexing. This question comes up once in a while on the ElasticSearch mailing list, for example. See http://search-lucene.com/?q=index+time+alias&fc_project=ElasticSearch&fc_type=mail+_hash_+user

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
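The aliasing idea described in the issue — search code refers to a stable alias while operators repoint it at new indices over time — can be sketched with a small, self-contained resolver. This is purely illustrative (the class and method names are hypothetical, not Solr's actual implementation):

```java
import java.util.*;

// Hypothetical alias registry: maps an alias name to the concrete
// indices it currently covers. Search code only ever refers to the
// alias; operators repoint it as new (e.g. daily log) indices appear.
public class AliasRegistry {
    private final Map<String, Set<String>> aliases = new HashMap<>();

    // Repoint an alias at a new set of indices.
    public synchronized void setAlias(String alias, Set<String> indices) {
        aliases.put(alias, new LinkedHashSet<>(indices));
    }

    // Resolve a name: an alias expands to its indices; a plain
    // index name resolves to itself.
    public synchronized Set<String> resolve(String name) {
        Set<String> target = aliases.get(name);
        return target != null
            ? Collections.unmodifiableSet(target)
            : Collections.singleton(name);
    }
}
```

The search application keeps querying `resolve("logs")`; only the registry changes when a new day's index is added.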
[jira] [Commented] (LUCENE-4830) Sorter API: use an abstract doc map instead of an array
[ https://issues.apache.org/jira/browse/LUCENE-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603203#comment-13603203 ]

Commit Tag Bot commented on LUCENE-4830:

[trunk commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1456787
LUCENE-4830: Sorter API: Make the doc ID mapping an abstract class.

Sorter API: use an abstract doc map instead of an array
Key: LUCENE-4830
URL: https://issues.apache.org/jira/browse/LUCENE-4830
Project: Lucene - Core
Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Trivial
Fix For: 4.3
Attachments: LUCENE-4830.patch

The sorter API uses arrays to store the old-new and new-old doc ID mappings. It should rather be an abstract class, given that in some cases an array is not required at all (the reverse mapping, for example).
Re: Protecting content in zookeeper
On 3/14/13 5:21 PM, Mark Miller wrote:
> On Mar 14, 2013, at 3:16 AM, Per Steffensen st...@designware.dk wrote:
>> Even though you do not share zookeeper you might want to set up permissions anyway, but never mind.
> That's just the only reason I care about. Otherwise, I'm of a similar mind with ZooKeeper security as I am with Solr - lock it up behind closed doors and only allow trusted access. Makes things simpler for us. The problem is that it's really nice to only have to run one ZooKeeper for many services. And in that case, it's really nice to ensure they won't interfere with each other due to bugs or misconfiguration. So for that reason, I'd support this change. The other reasons really don't sway me at all.

That's cool. I understand your points. It's just that my customer is very, very paranoid - like CIA'ish paranoid. We do not even trust people with access to the actual machines running Solr or ZK (at least not all of them, depending on how many stars are on their shoulders), and we need to run CloudSolrServers in an environment where we trust people even less. I know this is probably not a feature that will be used by many people, but what the heck - if it's transparent and the default is no protection, no harm done supporting it. And it will not be a lot of code.

Regards, Steff
[jira] [Commented] (LUCENE-4830) Sorter API: use an abstract doc map instead of an array
[ https://issues.apache.org/jira/browse/LUCENE-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603207#comment-13603207 ]

Commit Tag Bot commented on LUCENE-4830:

[branch_4x commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1456789
LUCENE-4830: Sorter API: Make the doc ID mapping an abstract class (merged from r1456787).
[jira] [Commented] (LUCENE-4830) Sorter API: use an abstract doc map instead of an array
[ https://issues.apache.org/jira/browse/LUCENE-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603211#comment-13603211 ]

Commit Tag Bot commented on LUCENE-4830:

[trunk commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1456796
LUCENE-4830: Add missing @Override.
[jira] [Resolved] (LUCENE-4830) Sorter API: use an abstract doc map instead of an array
[ https://issues.apache.org/jira/browse/LUCENE-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand resolved LUCENE-4830.

Resolution: Fixed

Thank you for the review, Shai!
[jira] [Commented] (LUCENE-4830) Sorter API: use an abstract doc map instead of an array
[ https://issues.apache.org/jira/browse/LUCENE-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603215#comment-13603215 ]

Commit Tag Bot commented on LUCENE-4830:

[branch_4x commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1456797
LUCENE-4830: Add missing @Override (merged from r1456796).
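The change discussed in LUCENE-4830 — replacing raw int[] mappings with an abstract class, so that a case like the reverse mapping needs no array at all — can be sketched roughly like this (simplified, self-contained code; not the exact API that was committed):

```java
// Simplified sketch of an abstract old->new doc ID map. The
// array-backed map covers the general case; a reverse-order map can
// compute the mapping on the fly with no array allocation at all.
public abstract class DocMap {
    /** Return the new doc ID for the given old doc ID. */
    public abstract int oldToNew(int docID);

    /** General case: mapping materialized in an array. */
    public static DocMap fromArray(final int[] oldToNew) {
        return new DocMap() {
            @Override
            public int oldToNew(int docID) { return oldToNew[docID]; }
        };
    }

    /** Reverse order: newDoc = maxDoc - 1 - oldDoc, no array needed. */
    public static DocMap reverse(final int maxDoc) {
        return new DocMap() {
            @Override
            public int oldToNew(int docID) { return maxDoc - 1 - docID; }
        };
    }
}
```

The abstraction lets callers pay for an array only when the mapping is genuinely arbitrary.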
[jira] [Updated] (SOLR-3706) Ship setup to log with log4j.
[ https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-3706:

Attachment: SOLR-3706-solr-log4j.patch

This patch switches things to log4j. I'm not wild about adding log4j to the .war, but that is the closest to what we currently do with JUL. Docs and sample configs would need to be updated too...

Ship setup to log with log4j.
Key: SOLR-3706
URL: https://issues.apache.org/jira/browse/SOLR-3706
Project: Solr
Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
Fix For: 4.3, 5.0
Attachments: SOLR-3706-solr-log4j.patch

Currently we default to java.util.logging, and it's terrible in my opinion:
* Its simple built-in logger is a 2-line logger.
* You have to jump through hoops to use your own custom formatter with Jetty - either putting your class in the start.jar or other pain-in-the-butt solutions.
* It can't roll files by date out of the box.

I'm sure there are more issues, but those are the ones annoying me now. We should switch to log4j - it's much nicer, and it's easy to get a nice single-line format and roll by date, etc. If someone wants to use JUL they still can - but at least users could start with something decent.
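For reference, a minimal log4j 1.2 configuration along the lines Mark describes — a single-line pattern on the console plus a daily-rolling log file — might look like this (the file path and levels here are placeholders, not whatever defaults the patch actually ships):

```
# Root logger: INFO to console and a daily-rolling file.
log4j.rootLogger=INFO, CONSOLE, FILE

log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
# Single-line format: timestamp, level, logger, message.
log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} %-5p [%c] %m%n

log4j.appender.FILE=org.apache.log4j.DailyRollingFileAppender
log4j.appender.FILE.File=logs/solr.log
log4j.appender.FILE.DatePattern='.'yyyy-MM-dd
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d{ISO8601} %-5p [%c] %m%n
```

This addresses the two JUL complaints directly: the pattern layout gives one line per message, and DailyRollingFileAppender rolls by date out of the box.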
[jira] [Resolved] (SOLR-3358) Capture Logging Events from JUL and Log4j
[ https://issues.apache.org/jira/browse/SOLR-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley resolved SOLR-3358.

Resolution: Duplicate

Closing this issue; to be included in SOLR-3706.

Capture Logging Events from JUL and Log4j
Key: SOLR-3358
URL: https://issues.apache.org/jira/browse/SOLR-3358
Project: Solr
Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
Attachments: SOLR-3358-compile-path.patch, SOLR-3358-logging.patch, SOLR-3358-logging.patch

The UI should be able to show the last few log messages. To support this, we will need to register an Appender (log4j) or Handler (JUL) and keep a buffer of recent log events.
[jira] [Resolved] (SOLR-3399) distribute/assume log4j logging rather than JUL
[ https://issues.apache.org/jira/browse/SOLR-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley resolved SOLR-3399.

Resolution: Duplicate

distribute/assume log4j logging rather than JUL
Key: SOLR-3399
URL: https://issues.apache.org/jira/browse/SOLR-3399
Project: Solr
Issue Type: Improvement
Reporter: Ryan McKinley
Assignee: Ryan McKinley
Priority: Minor

The discussion on SOLR-3358 has many threads, so I will break this out into its own issue. Currently we use SLF4J to define logging, and the war file distributes the JUL binding. To improve the out-of-the-box logging experience, I think we should switch to log4j. I suggest we:
* keep using SLF4J (especially in solrj)
* replace the JUL log watcher with a log4j version
* this will let us have the admin UI logging stuff work against a single Appender rather than the root loggers
[jira] [Resolved] (SOLR-3979) slf4j bindings other than jdk -- cannot change log levels
[ https://issues.apache.org/jira/browse/SOLR-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley resolved SOLR-3979.

Resolution: Duplicate

Check SOLR-3706.

slf4j bindings other than jdk -- cannot change log levels
Key: SOLR-3979
URL: https://issues.apache.org/jira/browse/SOLR-3979
Project: Solr
Issue Type: Bug
Affects Versions: 4.0
Reporter: Shawn Heisey
Fix For: 4.3
Attachments: log4j-solr-stuff.zip

Once I finally got log4j logging working, I was slightly surprised by the message related to SOLR-3426. I did not really consider that to be a big deal, because if I want to look at my log, I'll be on the command line anyway. I was even more surprised to find that I cannot change any of the log levels from the admin GUI. My default log level is WARN for performance reasons, but every once in a while I like to bump the log level to INFO to troubleshoot a specific problem, then turn it back down. This is very easy with JDK logging in either 3.x or 4.0. I changed to log4j because it easily allows me to put the date of a log message on the same line as the first line of the actual log message, so when I grep for things, I have the timestamp in the grep output. Currently the only way for me to change my log level is by updating log4j.properties and restarting Solr. If the capability to figure this out on a class-by-class basis isn't there with log4j, I would at least like to be able to set the root logging level. Is that possible?
[jira] [Resolved] (SOLR-4129) Solr UI doesn't support log4j
[ https://issues.apache.org/jira/browse/SOLR-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley resolved SOLR-4129.

Resolution: Duplicate

Consolidating log4j issues.

Solr UI doesn't support log4j
Key: SOLR-4129
URL: https://issues.apache.org/jira/browse/SOLR-4129
Project: Solr
Issue Type: Improvement
Components: SolrCloud
Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
Reporter: Raintung Li
Labels: log
Fix For: 4.3
Attachments: patch-4129.txt

Many projects use log4j. Solr uses the SLF4J logging framework, which is designed to integrate easily with log4j, but Solr uses log4j-over-slf4j.jar to handle the log4j case, and this jar has some issues:
a. Log calls are ultimately routed through SLF4J to the actual logger (for Solr, JDK14 logging).
b. It does not implement all log4j functions, e.g. Logger.setLevel().
c. JDK14 logging misses some features, e.g. thread info and daily rolling.

Some dependent projects already use log4j, and customers still want to use it. JDK14 logging differs from log4j in many ways; at the least, the configuration files can't be reused. The bad part is that log4j-over-slf4j.jar conflicts with log4j: to use Solr, other projects have to remove log4j. I think Solr shouldn't use log4j-over-slf4j.jar, and should instead reuse log4j if the customer wants to use it.
[jira] [Created] (LUCENE-4834) Sorter API: Make TermsEnum.docs accept any source of liveDocs
Adrien Grand created LUCENE-4834:

Summary: Sorter API: Make TermsEnum.docs accept any source of liveDocs
Key: LUCENE-4834
URL: https://issues.apache.org/jira/browse/LUCENE-4834
Project: Lucene - Core
Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
Fix For: 4.3

TermsEnum.docs currently only works when liveDocs is null or the reader's liveDocs. This is enough for addIndexes, but it would be cleaner to follow the TermsEnum.docs contract and accept any source of liveDocs.
[jira] [Updated] (LUCENE-4834) Sorter API: Make TermsEnum.docs accept any source of liveDocs
[ https://issues.apache.org/jira/browse/LUCENE-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand updated LUCENE-4834:

Attachment: LUCENE-4834.patch

Patch. I'll commit soon.
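The contract being fixed here — an enum that honors an arbitrary liveDocs bit set, rather than only null or the reader's own deletions — boils down to filtering returned doc IDs against whatever bits the caller passes. A self-contained toy sketch (the Bits interface is simplified from Lucene's org.apache.lucene.util.Bits; FilteredDocs is hypothetical, not the real postings class):

```java
// Simplified stand-in for Lucene's Bits interface.
interface Bits {
    boolean get(int index);
}

// Toy postings iterator that honors any caller-supplied liveDocs:
// null means "no deletions"; otherwise every doc ID is checked
// against the bit set before being returned.
class FilteredDocs {
    private final int[] docs;
    private final Bits liveDocs; // may be null
    private int pos = 0;

    FilteredDocs(int[] docs, Bits liveDocs) {
        this.docs = docs;
        this.liveDocs = liveDocs;
    }

    /** Next live doc ID, or -1 when exhausted. */
    int nextDoc() {
        while (pos < docs.length) {
            int doc = docs[pos++];
            if (liveDocs == null || liveDocs.get(doc)) {
                return doc;
            }
        }
        return -1;
    }
}
```

The point of the contract is that the bits need not come from the reader at all: any implementation the caller supplies is respected.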
[jira] [Commented] (SOLR-3369) shards.tolerant=true broken on group and facet queries
[ https://issues.apache.org/jira/browse/SOLR-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603244#comment-13603244 ]

Ferry Landzaat commented on SOLR-3369:

Is there any plan to fix this issue? We want to upgrade from 3.x and really need this patch to make the system reliable.

shards.tolerant=true broken on group and facet queries
Key: SOLR-3369
URL: https://issues.apache.org/jira/browse/SOLR-3369
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 4.0-ALPHA
Environment: Distributed environment (shards)
Reporter: Russell Black
Labels: patch
Attachments: SOLR-3369-shards-tolerant.patch

In a distributed environment, shards.tolerant=true allows for partial results to be returned when individual shards are down. For group=true and facet=true queries, using this feature results in an error when shards are down. This patch allows users to use the shard-tolerance feature with facet and grouping queries.
[jira] [Commented] (LUCENE-4834) Sorter API: Make TermsEnum.docs accept any source of liveDocs
[ https://issues.apache.org/jira/browse/LUCENE-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603264#comment-13603264 ]

Shai Erera commented on LUCENE-4834:

Looks good! +1 to commit.
[jira] [Commented] (LUCENE-4834) Sorter API: Make TermsEnum.docs accept any source of liveDocs
[ https://issues.apache.org/jira/browse/LUCENE-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603266#comment-13603266 ]

Commit Tag Bot commented on LUCENE-4834:

[trunk commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1456842
LUCENE-4834: Sorter API: Make TermsEnum.docs accept any source of liveDocs.
[jira] [Commented] (LUCENE-4752) Merge segments to sort them
[ https://issues.apache.org/jira/browse/LUCENE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603271#comment-13603271 ]

Shai Erera commented on LUCENE-4752:

This looks great. I think it's fine that we let people override SegmentMerger ... it's a super-expert API; no sane person would ever want to do that, but for those who want to, it's good to have the option. Adrien, perhaps add a SortingSegmentMerger to the sorter package? Or at least add a test that verifies merges keep things sorted?

Merge segments to sort them
Key: LUCENE-4752
URL: https://issues.apache.org/jira/browse/LUCENE-4752
Project: Lucene - Core
Issue Type: New Feature
Components: core/index
Reporter: David Smiley
Assignee: Adrien Grand
Attachments: LUCENE-4752.patch

It would be awesome if Lucene could write the documents out in a segment based on a configurable order. This of course applies to merging segments too. The benefit is increased locality on disk of documents that are likely to be accessed together. This often applies to documents near each other in time, but also spatially.
[jira] [Commented] (LUCENE-4834) Sorter API: Make TermsEnum.docs accept any source of liveDocs
[ https://issues.apache.org/jira/browse/LUCENE-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603273#comment-13603273 ]

Commit Tag Bot commented on LUCENE-4834:

[branch_4x commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1456851
LUCENE-4834: Sorter API: Make TermsEnum.docs accept any source of liveDocs (merged from r1456842).
[jira] [Resolved] (LUCENE-4834) Sorter API: Make TermsEnum.docs accept any source of liveDocs
[ https://issues.apache.org/jira/browse/LUCENE-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand resolved LUCENE-4834.

Resolution: Fixed

Thanks Shai.
[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues
[ https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603299#comment-13603299 ]

Michael McCandless commented on LUCENE-4795:

bq. Then perhaps just drop a comment in the ctor?

OK I'll put a comment where I append w/ the delim explaining why I can't use CP.toString ... Thanks Shai! I'll commit soon...

Add FacetsCollector based on SortedSetDocValues
Key: LUCENE-4795
URL: https://issues.apache.org/jira/browse/LUCENE-4795
Project: Lucene - Core
Issue Type: Improvement
Components: modules/facet
Reporter: Michael McCandless
Assignee: Michael McCandless
Attachments: LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, pleaseBenchmarkMe.patch

Recently (LUCENE-4765) we added a multi-valued DocValues field (SortedSetDocValuesField), and this can be used for faceting in Solr (SOLR-4490). I think we should also add support in the facet module? It'd be an option with different tradeoffs. Eg, it wouldn't require the taxonomy index, since the main index handles label/ord resolving.

There are at least two possible approaches:
* On every reopen, build the seg -> global ord map, and then on every collect, get the seg ord, map it to the global ord space, and increment counts. This adds cost during reopen in proportion to the number of unique terms ...
* On every collect, increment counts based on the seg ords, and then do a merge in the end just like distributed faceting does.

The first approach is much easier so I built a quick prototype using that. The prototype does the counting, but it does NOT do the top-K facets gathering in the end, and it doesn't know parent/child ord relationships, so there's tons more to do before this is real. I also was unsure how to properly integrate it since the existing classes seem to expect that you use a taxonomy index to resolve ords.

I ran a quick performance test. base = trunk, except I disabled the compute-top-K in FacetsAccumulator to make the comparison fair; comp = using the prototype collector in the patch:

{noformat}
            Task    QPS base  StdDev    QPS comp  StdDev      Pct diff
       OrHighLow       18.79  (2.5%)       14.36  (3.3%)  -23.6% ( -28% - -18%)
        HighTerm       21.58  (2.4%)       16.53  (3.7%)  -23.4% ( -28% - -17%)
       OrHighMed       18.20  (2.5%)       13.99  (3.3%)  -23.2% ( -28% - -17%)
         Prefix3       14.37  (1.5%)       11.62  (3.5%)  -19.1% ( -23% - -14%)
         LowTerm      130.80  (1.6%)      106.95  (2.4%)  -18.2% ( -21% - -14%)
      OrHighHigh        9.60  (2.6%)        7.88  (3.5%)  -17.9% ( -23% - -12%)
     AndHighHigh       24.61  (0.7%)       20.74  (1.9%)  -15.7% ( -18% - -13%)
          Fuzzy1       49.40  (2.5%)       43.48  (1.9%)  -12.0% ( -15% -  -7%)
 MedSloppyPhrase       27.06  (1.6%)       23.95  (2.3%)  -11.5% ( -15% -  -7%)
         MedTerm       51.43  (2.0%)       46.21  (2.7%)  -10.2% ( -14% -  -5%)
          IntNRQ        4.02  (1.6%)        3.63  (4.0%)   -9.7% ( -15% -  -4%)
        Wildcard       29.14  (1.5%)       26.46  (2.5%)   -9.2% ( -13% -  -5%)
HighSloppyPhrase        0.92  (4.5%)        0.87  (5.8%)   -5.4% ( -15% -   5%)
     MedSpanNear       29.51  (2.5%)       27.94  (2.2%)   -5.3% (  -9% -   0%)
    HighSpanNear        3.55  (2.4%)        3.38  (2.0%)   -4.9% (  -9% -   0%)
      AndHighMed      108.34  (0.9%)      104.55  (1.1%)   -3.5% (  -5% -  -1%)
 LowSloppyPhrase       20.50  (2.0%)       20.09  (4.2%)   -2.0% (  -8% -   4%)
       LowPhrase       21.60  (6.0%)       21.26  (5.1%)   -1.6% ( -11% -  10%)
          Fuzzy2       53.16  (3.9%)       52.40  (2.7%)   -1.4% (  -7% -   5%)
     LowSpanNear        8.42  (3.2%)        8.45  (3.0%)    0.3% (  -5% -   6%)
         Respell       45.17  (4.3%)       45.38  (4.4%)    0.5% (  -7% -   9%)
       MedPhrase      113.93  (5.8%)      115.02  (4.9%)    1.0% (  -9% -  12%)
      AndHighLow      596.42  (2.5%)      617.12  (2.8%)    3.5% (  -1% -   8%)
      HighPhrase       17.30 (10.5%)       18.36  (9.1%)    6.2% ( -12% -  28%)
{noformat}

I'm impressed that this approach is only ~24% slower in the worst case! I think this
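The first approach described in the issue — build a segment-ord to global-ord map on reopen, then map each collected segment ord into the global space and increment a count — can be illustrated with a tiny self-contained sketch (toy data structures only; nothing like the real SortedSetDocValues API):

```java
import java.util.*;

// Toy illustration of global-ordinal faceting across segments.
// Each segment has its own sorted term dictionary; on "reopen" we
// build, per segment, a map from segment ord -> global ord, so that
// counting at collect time is a single array lookup plus increment.
public class GlobalOrdCounts {
    final List<String> globalTerms;                    // merged sorted dictionary
    final List<int[]> segToGlobal = new ArrayList<>(); // per-segment ord maps

    public GlobalOrdCounts(List<String[]> segmentTerms) {
        // "Reopen" cost: proportional to the number of unique terms.
        TreeSet<String> merged = new TreeSet<>();
        for (String[] seg : segmentTerms) merged.addAll(Arrays.asList(seg));
        globalTerms = new ArrayList<>(merged);
        for (String[] seg : segmentTerms) {
            int[] map = new int[seg.length];
            for (int ord = 0; ord < seg.length; ord++) {
                map[ord] = globalTerms.indexOf(seg[ord]); // global ord
            }
            segToGlobal.add(map);
        }
    }

    /** Count one hit: segment ord mapped into the global ord space. */
    public void collect(int segment, int segOrd, int[] counts) {
        counts[segToGlobal.get(segment)[segOrd]]++;
    }
}
```

The second approach in the issue trades this reopen cost for a per-search merge of per-segment counts, as distributed faceting does.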
[jira] [Commented] (LUCENE-4832) Unbounded getTopGroups for ToParentBlockJoinCollector
[ https://issues.apache.org/jira/browse/LUCENE-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603313#comment-13603313 ]

Aleksey Aleev commented on LUCENE-4832:

Michael, thank you for replying! I agree with you about Integer.MAX_VALUE; it looks much better that way. The reason why I introduced the GroupDocsAccumulator class is that I wanted to reduce the size of the getTopGroups(...) method and make it more readable. Could you please tell me whether you don't like introducing a new class and creating an instance of it at all, or whether you think it's not clear what the accumulate() method should do? Maybe it will be clearer if the loop over groups remains in getTopGroups() and only the loop's body is extracted into the accumulate() method? So we'll have:

{code:java}
for (int groupIDX = offset; groupIDX < sortedGroups.length; groupIDX++) {
  groupDocsAccumulator.accumulate(groupIDX);
}
{code}

Please tell me WDYT and I'll update the patch.

Unbounded getTopGroups for ToParentBlockJoinCollector
Key: LUCENE-4832
URL: https://issues.apache.org/jira/browse/LUCENE-4832
Project: Lucene - Core
Issue Type: Improvement
Components: modules/join
Reporter: Aleksey Aleev
Attachments: LUCENE-4832.patch

The _ToParentBlockJoinCollector#getTopGroups_ method takes several arguments:

{code:java}
public TopGroups<Integer> getTopGroups(ToParentBlockJoinQuery query,
                                       Sort withinGroupSort,
                                       int offset,
                                       int maxDocsPerGroup,
                                       int withinGroupOffset,
                                       boolean fillSortFields)
{code}

and one of them is {{maxDocsPerGroup}}, which specifies an upper bound on the number of child documents returned within each group. {{ToParentBlockJoinCollector}} collects and caches all child documents matched by the given {{ToParentBlockJoinQuery}} in {{OneGroup}} objects during the search, so it is possible to create {{GroupDocs}} with all matched child documents instead of only the part of them bounded by {{maxDocsPerGroup}}.
When you specify {{maxDocsPerGroup}}, new queues (i.e. {{TopScoreDocCollector}}/{{TopFieldCollector}}) will be created for each group, with {{maxDocsPerGroup}} objects created within each queue, which can lead to redundant memory allocation when the number of child documents within a group is less than {{maxDocsPerGroup}}. I suppose there are many cases where you need to get all child documents matched by the query, so it would be nice to have the ability to get top groups with all matched child documents without unnecessary memory allocation. A possible solution is to pass a negative {{maxDocsPerGroup}} when you need all matched child documents within each group, and to check the {{maxDocsPerGroup}} value: if it is negative, create a queue sized to the number of matched child documents; otherwise create a queue of size {{maxDocsPerGroup}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
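The sizing rule proposed above can be sketched as a tiny standalone method. This is only an illustration of the idea (the class and method names are invented; the real patch would size the Lucene collector queues rather than return an int):

```java
// Illustration of the proposed rule: a negative maxDocsPerGroup means
// "return all matched child docs", so the queue is sized exactly to the
// number of matched child documents; otherwise never allocate more queue
// slots than there are matches. Names here are hypothetical.
public class QueueSizing {
    static int queueSize(int maxDocsPerGroup, int matchedChildDocs) {
        if (maxDocsPerGroup < 0) {
            return matchedChildDocs; // collect everything, no over-allocation
        }
        return Math.min(maxDocsPerGroup, matchedChildDocs);
    }

    public static void main(String[] args) {
        System.out.println(queueSize(-1, 7));  // prints 7
        System.out.println(queueSize(3, 7));   // prints 3
    }
}
```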
Stress tests
The OpenCloseCoreStressTest is occasionally failing. I was looking at two of the failures last night and I'm wondering if there's an issue with warming searchers or perhaps background merging. I've been rather assuming that core.close() waits until all of that is done, but I don't have any proof of that. Is there a good way to find out if a searcher for a particular SolrCore is in the process of autowarming, or if there's a background merge going on? I might try preventing opens/closes if a background thread is warming a core. I already prevent load/unload/reload operations from occurring simultaneously, but I'm wondering if I'm running afoul of either (1) background merging or (2) autowarming. This will be an ugly one to debug, so any thoughts are welcome. Thanks, Erick
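The "prevent load/unload/reload from occurring simultaneously" part can be done with a per-core lock. A minimal sketch of that pattern, purely illustrative (this is not Solr's actual CoreContainer code; the class and method names are invented):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch: serialize load/unload/reload per core name so
// stress-test operations on the same core can never overlap, while
// operations on different cores still run in parallel.
public class CoreOpSerializer {
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    public void withCoreLock(String coreName, Runnable op) {
        ReentrantLock lock = locks.computeIfAbsent(coreName, n -> new ReentrantLock());
        lock.lock();
        try {
            op.run(); // e.g. open, close, or reload the core
        } finally {
            lock.unlock();
        }
    }
}
```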
[JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 26880 - Failure!
Build: builds.flonkings.com/job/Lucene-trunk-Linux-Java7-64-test-only/26880/

1 tests failed.
REGRESSION: org.apache.lucene.search.TestTimeLimitingCollector.testSearchMultiThreaded

Error Message:
Captured an uncaught exception in thread: Thread[id=95, name=Thread-59, state=RUNNABLE, group=TGRP-TestTimeLimitingCollector]

Stack Trace:
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=95, name=Thread-59, state=RUNNABLE, group=TGRP-TestTimeLimitingCollector]
Caused by: java.lang.OutOfMemoryError: Java heap space
        at __randomizedtesting.SeedInfo.seed([4AB0D0AAEC8C1237]:0)
        at org.apache.lucene.codecs.sep.SepPostingsReader.readTermsBlock(SepPostingsReader.java:191)
        at org.apache.lucene.codecs.pulsing.PulsingPostingsReader.readTermsBlock(PulsingPostingsReader.java:135)
        at org.apache.lucene.codecs.blockterms.BlockTermsReader$FieldReader$SegmentTermsEnum.nextBlock(BlockTermsReader.java:832)
        at org.apache.lucene.codecs.blockterms.BlockTermsReader$FieldReader$SegmentTermsEnum._next(BlockTermsReader.java:659)
        at org.apache.lucene.codecs.blockterms.BlockTermsReader$FieldReader$SegmentTermsEnum.next(BlockTermsReader.java:651)
        at org.apache.lucene.index.MultiTermsEnum.reset(MultiTermsEnum.java:128)
        at org.apache.lucene.index.MultiTerms.iterator(MultiTerms.java:110)
        at org.apache.lucene.search.TermQuery$TermWeight.getTermsEnum(TermQuery.java:101)
        at org.apache.lucene.search.TermQuery$TermWeight.scorer(TermQuery.java:81)
        at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:308)
        at org.apache.lucene.search.AssertingIndexSearcher$1.scorer(AssertingIndexSearcher.java:80)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:596)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:302)
        at org.apache.lucene.search.TestTimeLimitingCollector.search(TestTimeLimitingCollector.java:124)
        at org.apache.lucene.search.TestTimeLimitingCollector.doTestSearch(TestTimeLimitingCollector.java:139)
        at org.apache.lucene.search.TestTimeLimitingCollector.access$200(TestTimeLimitingCollector.java:42)
        at org.apache.lucene.search.TestTimeLimitingCollector$1.run(TestTimeLimitingCollector.java:292)

Build Log:
[...truncated 1203 lines...]
[junit4:junit4] Suite: org.apache.lucene.search.TestTimeLimitingCollector
[junit4:junit4]   2> NOTE: reproduce with: ant test -Dtestcase=TestTimeLimitingCollector -Dtests.method=testSearchMultiThreaded -Dtests.seed=4AB0D0AAEC8C1237 -Dtests.slow=true -Dtests.locale=es_VE -Dtests.timezone=Europe/Oslo -Dtests.file.encoding=ISO-8859-1
[junit4:junit4] ERROR    246s J1 | TestTimeLimitingCollector.testSearchMultiThreaded
[junit4:junit4]    Throwable #1: java.lang.AssertionError: some threads failed! expected:<50> but was:<32>
[junit4:junit4]        at org.junit.Assert.fail(Assert.java:93)
[junit4:junit4]        at org.junit.Assert.failNotEquals(Assert.java:647)
[junit4:junit4]        at org.junit.Assert.assertEquals(Assert.java:128)
[junit4:junit4]        at org.junit.Assert.assertEquals(Assert.java:472)
[junit4:junit4]        at org.apache.lucene.search.TestTimeLimitingCollector.doTestMultiThreads(TestTimeLimitingCollector.java:306)
[junit4:junit4]        at org.apache.lucene.search.TestTimeLimitingCollector.testSearchMultiThreaded(TestTimeLimitingCollector.java:271)
[junit4:junit4]        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit4:junit4]        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
[junit4:junit4]        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit4:junit4]        at java.lang.reflect.Method.invoke(Method.java:601)
[junit4:junit4]        at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
[junit4:junit4]        at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
[junit4:junit4]        at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
[junit4:junit4]        at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
[junit4:junit4]        at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
[junit4:junit4]        at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
[junit4:junit4]        at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
[junit4:junit4]        at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
[junit4:junit4]        at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
[junit4:junit4]        at
[jira] [Commented] (SOLR-4586) Remove maxBooleanClauses from Solr
[ https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603338#comment-13603338 ] Mikhail Khludnev commented on SOLR-4586: bq. maxBooleanClauses no longer applies, that the limitation was removed from Lucene sometime in the 3.x series. [really?|https://github.com/apache/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/BooleanQuery.java#L137] Remove maxBooleanClauses from Solr -- Key: SOLR-4586 URL: https://issues.apache.org/jira/browse/SOLR-4586 Project: Solr Issue Type: Improvement Affects Versions: 4.2 Reporter: Shawn Heisey Attachments: SOLR-4586.patch In the #solr IRC channel, I mentioned the maxBooleanClauses limitation to someone asking a question about queries. Mark Miller told me that maxBooleanClauses no longer applies, that the limitation was removed from Lucene sometime in the 3.x series. The config still shows up in the example even in the just-released 4.2. Checking through the source code, I found that the config option is parsed and the value stored in objects, but the value does not actually seem to be used by anything. I removed every trace of it that I could find, and all tests still pass.
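For context, the check Mikhail links to is Lucene's static clause-count guard: BooleanQuery throws TooManyClauses when adding a clause would exceed maxClauseCount (1024 by default). The following is a simplified standalone sketch of that behavior, not the actual Lucene source:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified standalone sketch of the guard Mikhail points at: a boolean
// query rejects clauses once the count reaches the static limit. In Lucene
// the default limit is 1024 and the exception is BooleanQuery.TooManyClauses.
public class ClauseLimitSketch {
    static int maxClauseCount = 1024;

    static class TooManyClauses extends RuntimeException {}

    private final List<String> clauses = new ArrayList<>();

    public void add(String clause) {
        if (clauses.size() >= maxClauseCount) {
            throw new TooManyClauses(); // the limit is still enforced
        }
        clauses.add(clause);
    }
}
```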
[jira] [Commented] (SOLR-4196) Untangle XML-specific nature of Config and Container classes
[ https://issues.apache.org/jira/browse/SOLR-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603341#comment-13603341 ] Commit Tag Bot commented on SOLR-4196: -- [trunk commit] Erick Erickson http://svn.apache.org/viewvc?view=revisionrevision=1456938 Added comments for deprecating solr.xml (SOLR-4196 etc) Untangle XML-specific nature of Config and Container classes Key: SOLR-4196 URL: https://issues.apache.org/jira/browse/SOLR-4196 Project: Solr Issue Type: Improvement Components: Schema and Analysis Reporter: Erick Erickson Assignee: Erick Erickson Priority: Minor Fix For: 4.3, 5.0 Attachments: SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, StressTest.zip, StressTest.zip, StressTest.zip, StressTest.zip sub-task for SOLR-4083. If we're going to try to obsolete solr.xml, we need to pull all of the specific XML processing out of Config and Container. Currently, we refer to xpaths all over the place. This JIRA is about providing a thunking layer to isolate the XML-esque nature of solr.xml and allow a simple properties file to be used instead which will lead, eventually, to solr.xml going away. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: questions about solr.xml/solr.properties
Added some bits to CHANGES.txt for SOLR-4196 On Thu, Mar 14, 2013 at 10:01 PM, Erick Erickson erickerick...@gmail.comwrote: OK, I'll see what I can put in tomorrow. It won't be comprehensive, probably just refer to the Wiki page after a very brief explanation. On Thu, Mar 14, 2013 at 9:45 PM, Mark Miller markrmil...@gmail.comwrote: Okay - leaving it out on purpose can get kind of confusing - someone that wanted to look at the state of trunk right now might think, oh, only bug fixes and very minor changes, but surprise, there is actually a major structural change. I think we should try and keep CHANGES up to date with reality for our 'trunk', '4x' users. - Mark On Mar 14, 2013, at 9:24 PM, Erick Erickson erickerick...@gmail.com wrote: bq: Is there any mention of this in CHANGES yet Nope, it's one of the JIRAs I've assigned to myself. SOLR-4542. I have started a Wiki page here: http://wiki.apache.org/solr/Core%20Discovery%20%284.3%20and%20beyond%29 linked to from here: http://wiki.apache.org/solr/CoreAdmin#Configuration But I've been waiting for the dust to settle before fleshing this out much. Although the more exposure it gets, I suppose the more chance people will have to comment on it. If we're agreed that solr.properties is the way to go, then I'll put something in CHANGES Real Soon Now and perhaps let the Wiki page evolve in fits and starts. On Thu, Mar 14, 2013 at 8:43 PM, Mark Miller markrmil...@gmail.com wrote: Is there any mention of this in CHANGES yet erick? Was just browsing for it… - Mark On Mar 14, 2013, at 6:37 PM, Jan Høydahl jan@cominvent.com wrote: solr.yml :-) -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com 14. mars 2013 kl. 22:02 skrev Yonik Seeley yo...@lucidworks.com: On Thu, Mar 14, 2013 at 3:46 PM, Robert Muir rcm...@gmail.com wrote: It seems to me there are two changes involved: 1. ability to auto-discover cores from the filesystem so you don't need to explicitly list them 2. 
changing .xml format to .properties These are indeed completely independent. My main concern/goal in this area has been #1. I assume #2 is just because developer tastes have been shifting away from XML, but like you I worry about what happens for config that needs more structure. -Yonik http://lucidworks.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4196) Untangle XML-specific nature of Config and Container classes
[ https://issues.apache.org/jira/browse/SOLR-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603356#comment-13603356 ] Commit Tag Bot commented on SOLR-4196: -- [branch_4x commit] Erick Erickson http://svn.apache.org/viewvc?view=revisionrevision=1456941 Added comments for deprecating solr.xml (SOLR-4196 etc) Untangle XML-specific nature of Config and Container classes Key: SOLR-4196 URL: https://issues.apache.org/jira/browse/SOLR-4196 Project: Solr Issue Type: Improvement Components: Schema and Analysis Reporter: Erick Erickson Assignee: Erick Erickson Priority: Minor Fix For: 4.3, 5.0 Attachments: SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, StressTest.zip, StressTest.zip, StressTest.zip, StressTest.zip sub-task for SOLR-4083. If we're going to try to obsolete solr.xml, we need to pull all of the specific XML processing out of Config and Container. Currently, we refer to xpaths all over the place. This JIRA is about providing a thunking layer to isolate the XML-esque nature of solr.xml and allow a simple properties file to be used instead which will lead, eventually, to solr.xml going away. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4588) Partial Update of Poly Field Corrupts Data
[ https://issues.apache.org/jira/browse/SOLR-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Crygier updated SOLR-4588: --- Attachment: schema.xml Sample Schema Partial Update of Poly Field Corrupts Data -- Key: SOLR-4588 URL: https://issues.apache.org/jira/browse/SOLR-4588 Project: Solr Issue Type: Bug Affects Versions: 4.0 Reporter: John Crygier Attachments: schema.xml When updating a field that is a poly type (testing with LatLonType), a partial document update makes the poly fields become multi-valued. This occurs even when the field is configured to not be multi-valued.

Test case. Use the following schema:

<schema name='JohnTest' version='1.5'>
  <fields>
    <field name='id' type='String' indexed='true' stored='true' required='true' multiValued='false' />
    <field name='_version_' type='int' indexed='true' stored='true' required='false' multiValued='false' />
    <dynamicField name='*LatLon' type='location' indexed='true' stored='true' required='false' multiValued='false' />
    <dynamicField name='*_coordinate' type='int' indexed='true' stored='true' required='false' multiValued='false' />
  </fields>
  <uniqueKey>id</uniqueKey>
  <types>
    <fieldType sortMissingLast='true' name='String' class='solr.StrField' />
    <fieldType name='int' class='solr.TrieIntField' precisionStep='0' positionIncrementGap='0'/>
    <fieldType name='location' class='solr.LatLonType' subFieldSuffix='_coordinate'/>
  </types>
</schema>

And issue the following commands (with responses):

curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"JohnTestDocument", "JohnTestLatLon":"0,0"}]'
RESPONSE: {"responseHeader":{"status":0,"QTime":2133}}

curl 'http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true'
RESPONSE: {
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "indent":"true",
      "q":"*:*",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"JohnTestDocument",
        "JohnTestLatLon_0_coordinate":0.0,
        "JohnTestLatLon_1_coordinate":0.0,
        "JohnTestLatLon":"0,0",
        "_version_":-1596981248}]
  }}

curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"JohnTestDocument","JohnTestLatLon":{"set":"5,7"}}]'
RESPONSE: {"responseHeader":{"status":0,"QTime":218}}

curl 'http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true'
RESPONSE: {
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "indent":"true",
      "q":"*:*",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"JohnTestDocument",
        "JohnTestLatLon_0_coordinate":[0.0, 5.0],
        "JohnTestLatLon_1_coordinate":[0.0, 7.0],
        "JohnTestLatLon":"5,7",
        "_version_":-118489088}]
  }}

As you can see, the 0.0 hangs around in JohnTestLatLon_0_coordinate and JohnTestLatLon_1_coordinate.
[jira] [Created] (SOLR-4588) Partial Update of Poly Field Corrupts Data
John Crygier created SOLR-4588: -- Summary: Partial Update of Poly Field Corrupts Data Key: SOLR-4588 URL: https://issues.apache.org/jira/browse/SOLR-4588 Project: Solr Issue Type: Bug Affects Versions: 4.0 Reporter: John Crygier Attachments: schema.xml When updating a field that is a poly type (Testing with LatLonType), when you do a partial document update, the poly fields will become multi-valued. This occurs even when the field is configured to not be multi-valued. Test Case Use the following schema: schema name='JohnTest' version='1.5' fields field name='id' type='String' indexed='true' stored='true' required='true' multiValued='false' / field name='_version_' type='int' indexed='true' stored='true' required='false' multiValued='false' / dynamicField name='*LatLon' type='location' indexed='true' stored='true' required='false' multiValued='false' / dynamicField name='*_coordinate' type='int' indexed='true' stored='true' required='false' multiValued='false' / /fields uniqueKeyid/uniqueKey types fieldType sortMissingLast='true' name='String' class='solr.StrField' / fieldType name=int class=solr.TrieIntField precisionStep=0 positionIncrementGap=0/ fieldType name=location class=solr.LatLonType subFieldSuffix=_coordinate/ /types /schema And issue the following commands (With responses): curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{id:JohnTestDocument, JohnTestLatLon : 0,0}]' RESPONSE: {responseHeader:{status:0,QTime:2133}} curl 'http://localhost:8983/solr/select?q=*%3A*wt=jsonindent=true' RESPONSE: { responseHeader:{ status:0, QTime:2, params:{ indent:true, q:*:*, wt:json}}, response:{numFound:1,start:0,docs:[ { id:JohnTestDocument, JohnTestLatLon_0_coordinate:0.0, JohnTestLatLon_1_coordinate:0.0, JohnTestLatLon:0,0, _version_:-1596981248}] }} curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{id:JohnTestDocument,JohnTestLatLon:{set:5,7}}]' RESPONSE: {responseHeader:{status:0,QTime:218}} 
curl 'http://localhost:8983/solr/select?q=*%3A*wt=jsonindent=true' RESPONSE: { responseHeader:{ status:0, QTime:2, params:{ indent:true, q:*:*, wt:json}}, response:{numFound:1,start:0,docs:[ { id:JohnTestDocument, JohnTestLatLon_0_coordinate:[0.0, 5.0], JohnTestLatLon_1_coordinate:[0.0, 7.0], JohnTestLatLon:5,7, _version_:-118489088}] }} As you can see, the 0.0 hangs around in JohnTestLatLon_0_coordinate and JohnTestLatLon_1_coordinate. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4588) Partial Update of Poly Field Corrupts Data
[ https://issues.apache.org/jira/browse/SOLR-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Crygier updated SOLR-4588: --- Description: When updating a field that is a poly type (Testing with LatLonType), when you do a partial document update, the poly fields will become multi-valued. This occurs even when the field is configured to not be multi-valued. Test Case Use the attached schema (schema.xml) And issue the following commands (With responses): curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{id:JohnTestDocument, JohnTestLatLon : 0,0}]' RESPONSE: {responseHeader:{status:0,QTime:2133}} curl 'http://localhost:8983/solr/select?q=*%3A*wt=jsonindent=true' RESPONSE: { responseHeader:{ status:0, QTime:2, params:{ indent:true, q:*:*, wt:json}}, response:{numFound:1,start:0,docs:[ { id:JohnTestDocument, JohnTestLatLon_0_coordinate:0.0, JohnTestLatLon_1_coordinate:0.0, JohnTestLatLon:0,0, _version_:-1596981248}] }} curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{id:JohnTestDocument,JohnTestLatLon:{set:5,7}}]' RESPONSE: {responseHeader:{status:0,QTime:218}} curl 'http://localhost:8983/solr/select?q=*%3A*wt=jsonindent=true' RESPONSE: { responseHeader:{ status:0, QTime:2, params:{ indent:true, q:*:*, wt:json}}, response:{numFound:1,start:0,docs:[ { id:JohnTestDocument, JohnTestLatLon_0_coordinate:[0.0, 5.0], JohnTestLatLon_1_coordinate:[0.0, 7.0], JohnTestLatLon:5,7, _version_:-118489088}] }} As you can see, the 0.0 hangs around in JohnTestLatLon_0_coordinate and JohnTestLatLon_1_coordinate. was: When updating a field that is a poly type (Testing with LatLonType), when you do a partial document update, the poly fields will become multi-valued. This occurs even when the field is configured to not be multi-valued. 
Test Case Use the following schema: schema name='JohnTest' version='1.5' fields field name='id' type='String' indexed='true' stored='true' required='true' multiValued='false' / field name='_version_' type='int' indexed='true' stored='true' required='false' multiValued='false' / dynamicField name='*LatLon' type='location' indexed='true' stored='true' required='false' multiValued='false' / dynamicField name='*_coordinate' type='int' indexed='true' stored='true' required='false' multiValued='false' / /fields uniqueKeyid/uniqueKey types fieldType sortMissingLast='true' name='String' class='solr.StrField' / fieldType name=int class=solr.TrieIntField precisionStep=0 positionIncrementGap=0/ fieldType name=location class=solr.LatLonType subFieldSuffix=_coordinate/ /types /schema And issue the following commands (With responses): curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{id:JohnTestDocument, JohnTestLatLon : 0,0}]' RESPONSE: {responseHeader:{status:0,QTime:2133}} curl 'http://localhost:8983/solr/select?q=*%3A*wt=jsonindent=true' RESPONSE: { responseHeader:{ status:0, QTime:2, params:{ indent:true, q:*:*, wt:json}}, response:{numFound:1,start:0,docs:[ { id:JohnTestDocument, JohnTestLatLon_0_coordinate:0.0, JohnTestLatLon_1_coordinate:0.0, JohnTestLatLon:0,0, _version_:-1596981248}] }} curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{id:JohnTestDocument,JohnTestLatLon:{set:5,7}}]' RESPONSE: {responseHeader:{status:0,QTime:218}} curl 'http://localhost:8983/solr/select?q=*%3A*wt=jsonindent=true' RESPONSE: { responseHeader:{ status:0, QTime:2, params:{ indent:true, q:*:*, wt:json}}, response:{numFound:1,start:0,docs:[ { id:JohnTestDocument, JohnTestLatLon_0_coordinate:[0.0, 5.0], JohnTestLatLon_1_coordinate:[0.0, 7.0], JohnTestLatLon:5,7, _version_:-118489088}] }} As you can see, the 0.0 hangs around in JohnTestLatLon_0_coordinate and JohnTestLatLon_1_coordinate. 
Partial Update of Poly Field Corrupts Data -- Key: SOLR-4588 URL: https://issues.apache.org/jira/browse/SOLR-4588 Project: Solr Issue Type: Bug Affects Versions: 4.0 Reporter: John Crygier Attachments: schema.xml When updating a field that is a poly type (Testing with LatLonType), when you do a partial document update, the poly fields will become multi-valued. This occurs even when the field is configured to not
[jira] [Updated] (SOLR-4588) Partial Update of Poly Field Corrupts Data
[ https://issues.apache.org/jira/browse/SOLR-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Crygier updated SOLR-4588: --- Priority: Minor (was: Major) Partial Update of Poly Field Corrupts Data -- Key: SOLR-4588 URL: https://issues.apache.org/jira/browse/SOLR-4588 Project: Solr Issue Type: Bug Affects Versions: 4.0 Reporter: John Crygier Priority: Minor Attachments: schema.xml When updating a field that is a poly type (Testing with LatLonType), when you do a partial document update, the poly fields will become multi-valued. This occurs even when the field is configured to not be multi-valued. Test Case Use the attached schema (schema.xml) And issue the following commands (With responses): curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{id:JohnTestDocument, JohnTestLatLon : 0,0}]' RESPONSE: {responseHeader:{status:0,QTime:2133}} curl 'http://localhost:8983/solr/select?q=*%3A*wt=jsonindent=true' RESPONSE: { responseHeader:{ status:0, QTime:2, params:{ indent:true, q:*:*, wt:json}}, response:{numFound:1,start:0,docs:[ { id:JohnTestDocument, JohnTestLatLon_0_coordinate:0.0, JohnTestLatLon_1_coordinate:0.0, JohnTestLatLon:0,0, _version_:-1596981248}] }} curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{id:JohnTestDocument,JohnTestLatLon:{set:5,7}}]' RESPONSE: {responseHeader:{status:0,QTime:218}} curl 'http://localhost:8983/solr/select?q=*%3A*wt=jsonindent=true' RESPONSE: { responseHeader:{ status:0, QTime:2, params:{ indent:true, q:*:*, wt:json}}, response:{numFound:1,start:0,docs:[ { id:JohnTestDocument, JohnTestLatLon_0_coordinate:[0.0, 5.0], JohnTestLatLon_1_coordinate:[0.0, 7.0], JohnTestLatLon:5,7, _version_:-118489088}] }} As you can see, the 0.0 hangs around in JohnTestLatLon_0_coordinate and JohnTestLatLon_1_coordinate. -- This message is automatically generated by JIRA. 
Re: questions about solr.xml/solr.properties
By far, the bulk of the work was trying to un-entangle solr.xml from CoreContainer, especially as far as the core tags were concerned, and to allow cores to come and go. No matter whether we decide on some other mechanism than solr.properties or not, altering how we do this should be much easier now. The work for rapidly opening/closing (and lazily loading) cores was enhanced as part of this restructuring; if I were going to do it all over again I'd probably break it into smaller chunks. But even if we change course, the opening/closing stuff won't be affected, so we're probably OK there. As far as nesting is concerned, it seems like we can emulate it with multi-level properties if necessary, at least for simple structuring; you can see a little of that in the page I started here: http://wiki.apache.org/solr/Core%20Discovery%20%284.3%20and%20beyond%29, where I did a straightforward mapping from the cores section to a bunch of properties like cores.hostPort, cores.adminPath, etc. Now, all I'm saying with that is that simple structure is still possible; I'm not particularly wedded to the idea. Consider Log4J properties. (OK, maybe that isn't a great example, I find it more than a little confusing, but...). But part of the motivation for this is moving to the SolrCloud way of doing things. The present setup has assumptions built into it from single-core-only days that make dynamic adjustments hard. Just one example: CoreContainer.load essentially assumed that it was the only thread operating when loading cores. To allow cores to come and go I had to do quite a bit of coordinating work. If we extend this to cores coming and going in response to load, or cores/collections being created on the fly, etc., I'm not sure solr.xml is going to adapt, whereas system properties are a more automation-friendly way of doing things. 
If we're going to eventually expand/contract/dynamically have nodes come and go we probably need to be able to, essentially, define all our properties at run-time rather than have a static, edit-by-hand configuration file. All that said, I'm open to whatever consensus we build. It'll about break my heart to _undo_ code, but I'll survive somehow, partially consoled by the fact that actually reading the solr.properties file wasn't much of the work G... Erick On Fri, Mar 15, 2013 at 9:08 AM, Erick Erickson erickerick...@gmail.comwrote: Added some bits to CHANGES.txt for SOLR-4196 On Thu, Mar 14, 2013 at 10:01 PM, Erick Erickson erickerick...@gmail.comwrote: OK, I'll see what I can put in tomorrow. It won't be comprehensive, probably just refer to the Wiki page after a very brief explanation. On Thu, Mar 14, 2013 at 9:45 PM, Mark Miller markrmil...@gmail.comwrote: Okay - leaving it out on purpose can get kind of confusing - someone that wanted to look at the state of trunk right now might think, oh, only bug fixes and very minor changes, but surprise, there is actually a major structural change. I think we should try and keep CHANGES up to date with reality for our 'trunk', '4x' users. - Mark On Mar 14, 2013, at 9:24 PM, Erick Erickson erickerick...@gmail.com wrote: bq: Is there any mention of this in CHANGES yet Nope, it's one of the JIRAs I've assigned to myself. SOLR-4542. I have started a Wiki page here: http://wiki.apache.org/solr/Core%20Discovery%20%284.3%20and%20beyond%29 linked to from here: http://wiki.apache.org/solr/CoreAdmin#Configuration But I've been waiting for the dust to settle before fleshing this out much. Although the more exposure it gets, I suppose the more chance people will have to comment on it. If we're agreed that solr.properties is the way to go, then I'll put something in CHANGES Real Soon Now and perhaps let the Wiki page evolve in fits and starts. 
On Thu, Mar 14, 2013 at 8:43 PM, Mark Miller markrmil...@gmail.com wrote: Is there any mention of this in CHANGES yet erick? Was just browsing for it… - Mark On Mar 14, 2013, at 6:37 PM, Jan Høydahl jan@cominvent.com wrote: solr.yml :-) -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com 14. mars 2013 kl. 22:02 skrev Yonik Seeley yo...@lucidworks.com: On Thu, Mar 14, 2013 at 3:46 PM, Robert Muir rcm...@gmail.com wrote: It seems to me there are two changes involved: 1. ability to auto-discover cores from the filesystem so you don't need to explicitly list them 2. changing .xml format to .properties These are indeed completely independent. My main concern/goal in this area has been #1. I assume #2 is just because developer tastes have been shifting away from XML, but like you I worry about what happens for config that needs more structure. -Yonik http://lucidworks.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail:
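The dotted-key mapping Erick describes (emulating solr.xml nesting with flat keys like cores.hostPort and cores.adminPath) can be sketched with plain java.util.Properties. The key names follow his wiki example; the config values here are illustrative assumptions, not a final format:

```java
import java.io.StringReader;
import java.util.Properties;

// Sketch of the flat-properties mapping discussed in the thread: nested
// solr.xml structure emulated with dotted keys. Key names follow the wiki
// example (cores.hostPort, cores.adminPath); values are made up.
public class PropsSketch {
    public static void main(String[] args) throws Exception {
        String conf = "cores.hostPort=8983\ncores.adminPath=/admin/cores\n";
        Properties props = new Properties();
        props.load(new StringReader(conf));
        // Simple "structure" is recovered by key prefix, no XML needed.
        System.out.println(props.getProperty("cores.hostPort"));  // prints 8983
        System.out.println(props.getProperty("cores.adminPath")); // prints /admin/cores
    }
}
```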
[jira] [Commented] (SOLR-4588) Partial Update of Poly Field Corrupts Data
[ https://issues.apache.org/jira/browse/SOLR-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603373#comment-13603373 ] Erick Erickson commented on SOLR-4588: -- Does this happen on a recent build? I _think_ I remember something about this, it sounds related to https://issues.apache.org/jira/browse/SOLR-4134. Could you try it with 4.2 and let us know if it's still an issue? Partial Update of Poly Field Corrupts Data -- Key: SOLR-4588 URL: https://issues.apache.org/jira/browse/SOLR-4588 Project: Solr Issue Type: Bug Affects Versions: 4.0 Reporter: John Crygier Priority: Minor Attachments: schema.xml When updating a field that is a poly type (Testing with LatLonType), when you do a partial document update, the poly fields will become multi-valued. This occurs even when the field is configured to not be multi-valued. Test Case Use the attached schema (schema.xml) And issue the following commands (With responses): curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{id:JohnTestDocument, JohnTestLatLon : 0,0}]' RESPONSE: {responseHeader:{status:0,QTime:2133}} curl 'http://localhost:8983/solr/select?q=*%3A*wt=jsonindent=true' RESPONSE: { responseHeader:{ status:0, QTime:2, params:{ indent:true, q:*:*, wt:json}}, response:{numFound:1,start:0,docs:[ { id:JohnTestDocument, JohnTestLatLon_0_coordinate:0.0, JohnTestLatLon_1_coordinate:0.0, JohnTestLatLon:0,0, _version_:-1596981248}] }} curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{id:JohnTestDocument,JohnTestLatLon:{set:5,7}}]' RESPONSE: {responseHeader:{status:0,QTime:218}} curl 'http://localhost:8983/solr/select?q=*%3A*wt=jsonindent=true' RESPONSE: { responseHeader:{ status:0, QTime:2, params:{ indent:true, q:*:*, wt:json}}, response:{numFound:1,start:0,docs:[ { id:JohnTestDocument, JohnTestLatLon_0_coordinate:[0.0, 5.0], JohnTestLatLon_1_coordinate:[0.0, 7.0], JohnTestLatLon:5,7, _version_:-118489088}] }} 
As you can see, the 0.0 hangs around in JohnTestLatLon_0_coordinate and JohnTestLatLon_1_coordinate. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4828) BooleanQuery.extractTerms should not recurse into MUST_NOT clauses
[ https://issues.apache.org/jira/browse/LUCENE-4828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603392#comment-13603392 ] Yonik Seeley commented on LUCENE-4828: -- bq. I think ideally we would not weight or score MUST_NOT or constant-scored clauses at all. I know this isn't the case today, but I just think it's dumb.

Not weighting prohibited clauses would needlessly break certain types of queries.

BooleanQuery.extractTerms should not recurse into MUST_NOT clauses -- Key: LUCENE-4828 URL: https://issues.apache.org/jira/browse/LUCENE-4828 Project: Lucene - Core Issue Type: Bug Components: core/search Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, 4.3 Attachments: LUCENE-4828.patch
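The change named in the issue title is small; a toy model (hypothetical code, not Lucene's actual BooleanQuery) shows the shape of it - extractTerms collects terms from MUST and SHOULD clauses but skips MUST_NOT, since prohibited terms never match the documents returned and are not useful for consumers such as highlighting:

```java
import java.util.*;

// Toy model of the fix named in the issue title (hypothetical code, not
// Lucene's actual BooleanQuery): extractTerms collects terms from MUST and
// SHOULD clauses but skips MUST_NOT clauses entirely.
public class ExtractTermsSketch {
    enum Occur { MUST, SHOULD, MUST_NOT }

    static Set<String> extractTerms(Map<String, Occur> clauses) {
        Set<String> terms = new HashSet<String>();
        for (Map.Entry<String, Occur> clause : clauses.entrySet()) {
            if (clause.getValue() != Occur.MUST_NOT) { // the proposed skip
                terms.add(clause.getKey());
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        Map<String, Occur> query = new LinkedHashMap<String, Occur>();
        query.put("lucene", Occur.MUST);
        query.put("search", Occur.SHOULD);
        query.put("spam", Occur.MUST_NOT);
        System.out.println(extractTerms(query)); // "spam" is not collected
    }
}
```

In the real patch the recursion simply bypasses prohibited sub-queries; the sketch flattens clauses to (term, occur) pairs to keep the idea self-contained.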
[jira] [Commented] (SOLR-4588) Partial Update of Poly Field Corrupts Data
[ https://issues.apache.org/jira/browse/SOLR-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603400#comment-13603400 ] Yonik Seeley commented on SOLR-4588: Partial update isn't currently going to work well with stored copyField targets or stored sub-fields of a polyField. Further, sub-fields of a polyField normally should be stored=false... they are meant more as an implementation detail, not an interface for clients. This is the definition in the stock schema (notice stored="false"):
{code}
<!-- Type used to index the lat and lon components for the location FieldType -->
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false" />
{code}
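The mechanism Yonik describes can be sketched with a toy model (hypothetical code, not Solr's actual atomic-update path): the partial update re-reads every *stored* field of the old document and re-indexes it, and the poly type then regenerates its sub-fields on top of the carried-over stored values, so stored sub-fields accumulate:

```java
import java.util.*;

// Toy model of the accumulation bug (hypothetical, not Solr's real update code).
public class PolyFieldSketch {
    static Map<String, List<String>> applyPartialUpdate(
            Map<String, List<String>> storedFields, String newLatLon) {
        Map<String, List<String>> doc = new LinkedHashMap<String, List<String>>();
        for (Map.Entry<String, List<String>> e : storedFields.entrySet()) {
            doc.put(e.getKey(), new ArrayList<String>(e.getValue())); // carried over
        }
        doc.put("JohnTestLatLon", new ArrayList<String>(Collections.singletonList(newLatLon)));
        String[] parts = newLatLon.split(",");
        // Regenerated sub-fields are appended, not replaced -> multi-valued.
        append(doc, "JohnTestLatLon_0_coordinate", parts[0]);
        append(doc, "JohnTestLatLon_1_coordinate", parts[1]);
        return doc;
    }

    static void append(Map<String, List<String>> doc, String field, String value) {
        if (!doc.containsKey(field)) {
            doc.put(field, new ArrayList<String>());
        }
        doc.get(field).add(value);
    }

    public static void main(String[] args) {
        Map<String, List<String>> stored = new LinkedHashMap<String, List<String>>();
        // With stored="true" sub-fields, the old values are read back...
        stored.put("JohnTestLatLon_0_coordinate", Arrays.asList("0.0"));
        stored.put("JohnTestLatLon_1_coordinate", Arrays.asList("0.0"));
        System.out.println(applyPartialUpdate(stored, "5,7")
                .get("JohnTestLatLon_0_coordinate")); // [0.0, 5]
        // ...whereas with stored="false" sub-fields there is nothing to carry over:
        System.out.println(applyPartialUpdate(
                new LinkedHashMap<String, List<String>>(), "5,7")
                .get("JohnTestLatLon_0_coordinate")); // [5]
    }
}
```

The second call mirrors Yonik's point: with stored="false" on the `*_coordinate` sub-fields there is no stale stored value to carry over, so the duplication disappears.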
[jira] [Commented] (LUCENE-4828) BooleanQuery.extractTerms should not recurse into MUST_NOT clauses
[ https://issues.apache.org/jira/browse/LUCENE-4828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603402#comment-13603402 ] Robert Muir commented on LUCENE-4828: - What kind of queries would this break? Just to be clear, when I say weight, I mean similarity: we'd still createWeight, it just wouldn't fetch any term statistics.
[jira] [Commented] (SOLR-4588) Partial Update of Poly Field Corrupts Data
[ https://issues.apache.org/jira/browse/SOLR-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603414#comment-13603414 ] John Crygier commented on SOLR-4588: Thanks all for the input. I did verify that the behavior is the same in 4.2. I didn't put the full story here that is actually leading to me needing this bug fix. I actually have a custom field that I've written that works with strings. My intention is to use highlighting on the dynamic poly fields, so the user knows when there is a hit to a certain column. I had thought that I read that highlighting only works on stored fields, so that's why I was working with stored fields. It's a minor issue, and since I'm coding custom fields anyway, I should be able to work around it. Thanks again!
[jira] [Updated] (SOLR-4588) Partial Update of Poly Field Corrupts Data
[ https://issues.apache.org/jira/browse/SOLR-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Crygier updated SOLR-4588: --- Affects Version/s: 4.2
[jira] [Resolved] (SOLR-4574) The Collections API will silently return success on an unknown ACTION parameter.
[ https://issues.apache.org/jira/browse/SOLR-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-4574. --- Resolution: Fixed

The Collections API will silently return success on an unknown ACTION parameter. Key: SOLR-4574 URL: https://issues.apache.org/jira/browse/SOLR-4574 Project: Solr Issue Type: Bug Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0 Attachments: SOLR-4574.patch
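The behavior change resolved here can be sketched as follows (hypothetical code, not Solr's actual CollectionsHandler): an unrecognized ACTION now raises an error instead of silently returning success.

```java
// Hypothetical sketch of the SOLR-4574 behavior change (not the actual
// CollectionsHandler code): reject an unknown ACTION parameter.
public class ActionCheck {
    static String handleAction(String action) {
        if (action.equals("CREATE") || action.equals("DELETE") || action.equals("RELOAD")) {
            return "ok"; // dispatch to the real handler here
        }
        throw new IllegalArgumentException("Unknown ACTION: " + action);
    }

    public static void main(String[] args) {
        System.out.println(handleAction("CREATE")); // ok
        try {
            handleAction("CREAT"); // a typo is no longer silently accepted
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```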
[jira] [Resolved] (SOLR-4576) Collections API validation errors should cause an exception on clients and otherwise act as validation errors with the Core Admin API.
[ https://issues.apache.org/jira/browse/SOLR-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-4576. --- Resolution: Fixed

Collections API validation errors should cause an exception on clients and otherwise act as validation errors with the Core Admin API. -- Key: SOLR-4576 URL: https://issues.apache.org/jira/browse/SOLR-4576 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0 Attachments: SOLR-4576.patch
[jira] [Resolved] (SOLR-4577) The collections API should return responses (success or failure) for each node it attempts to work with.
[ https://issues.apache.org/jira/browse/SOLR-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-4577. --- Resolution: Fixed

The collections API should return responses (success or failure) for each node it attempts to work with. --- Key: SOLR-4577 URL: https://issues.apache.org/jira/browse/SOLR-4577 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0

This is when the command itself is successful on the node, but then we need a report of the sub command result on each node. There is some code that sort of attempts to do this that came in with the collection api response contribution, but it's not really working currently.
[jira] [Commented] (LUCENE-4752) Merge segments to sort them
[ https://issues.apache.org/jira/browse/LUCENE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603427#comment-13603427 ] Robert Muir commented on LUCENE-4752: - I disagree... and I guess I'm willing to go to bat for this. There is real cost in exposing stuff like this. I'm already frustrated about the amount of stuff around this area that is 'public' solely due to packaging (e.g. the .codecs package and the .index package both need it). Finally, if we have code in lucene itself that relies upon the inner details because it e.g. subclasses SegmentMerger, this makes it harder to refactor core lucene and evolve it in the future, because we have modules doing sorting or shuffling or god knows what that rely upon its api. In such a case where I want to refactor SM, I could just eradicate those modules and nobody would complain, right? I don't think we should do it.

Merge segments to sort them --- Key: LUCENE-4752 URL: https://issues.apache.org/jira/browse/LUCENE-4752 Project: Lucene - Core Issue Type: New Feature Components: core/index Reporter: David Smiley Assignee: Adrien Grand Attachments: LUCENE-4752.patch

It would be awesome if Lucene could write the documents out in a segment based on a configurable order. This of course applies to merging segments too. The benefit is increased locality on disk of documents that are likely to be accessed together. This often applies to documents near each other in time, but also spatially.
[jira] [Commented] (LUCENE-4752) Merge segments to sort them
[ https://issues.apache.org/jira/browse/LUCENE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603430#comment-13603430 ] Robert Muir commented on LUCENE-4752: - Another example of a 'super expert api' that I think is ok is IndexingChain, where it stays package private, and the IWConfig setter is also package private. I think something like this might be reasonable.
[jira] [Commented] (SOLR-4585) The Collections API validates numShards with < 0 but should use <= 0.
[ https://issues.apache.org/jira/browse/SOLR-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603438#comment-13603438 ] Commit Tag Bot commented on SOLR-4585: -- [trunk commit] Mark Robert Miller http://svn.apache.org/viewvc?view=revision&revision=1456979 SOLR-4585: The Collections API validates numShards with < 0 but should use <= 0.

The Collections API validates numShards with < 0 but should use <= 0. - Key: SOLR-4585 URL: https://issues.apache.org/jira/browse/SOLR-4585 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0
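The validation change is a one-character fix; a minimal sketch (hypothetical code, not Solr's actual handler) of the before/after:

```java
// Minimal sketch of the SOLR-4585 validation change (hypothetical, not the
// actual Solr code): reject numShards <= 0 rather than only numShards < 0,
// since zero shards is equally invalid but previously slipped through.
public class NumShardsCheck {
    static void validateNumShards(int numShards) {
        if (numShards <= 0) { // was: numShards < 0, which accepted 0
            throw new IllegalArgumentException("numShards must be > 0: " + numShards);
        }
    }

    public static void main(String[] args) {
        validateNumShards(3); // fine
        try {
            validateNumShards(0);
            System.out.println("accepted 0");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected 0");
        }
    }
}
```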
[jira] [Commented] (LUCENE-4752) Merge segments to sort them
[ https://issues.apache.org/jira/browse/LUCENE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603440#comment-13603440 ] Robert Muir commented on LUCENE-4752: - And finally I think it would be way better to provide whatever 'hook' is needed for this kinda stuff rather than allow subclassing of SegmentMerger - like a proper pluggable api (e.g. codec is an example of this) versus letting people just subclass concrete things.
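A toy illustration of the 'hook' idea (hypothetical API, not Lucene's actual classes): instead of subclassing SegmentMerger, callers supply an abstract old-to-new doc-ID mapping. A computed mapping such as reverse order needs no backing array at all, which is the same motivation behind the abstract doc map of LUCENE-4830.

```java
// Toy sketch of a pluggable doc-ID mapping hook (hypothetical API).
public class DocMapSketch {
    abstract static class DocMap {
        abstract int oldToNew(int docID);
    }

    // Reverse mapping over n docs, computed on the fly with no array.
    static DocMap reverse(final int n) {
        return new DocMap() {
            int oldToNew(int docID) { return n - 1 - docID; }
        };
    }

    public static void main(String[] args) {
        DocMap map = reverse(5);
        for (int i = 0; i < 5; i++) {
            System.out.print(map.oldToNew(i) + " "); // 4 3 2 1 0
        }
        System.out.println();
    }
}
```

A merger that consumes only this interface can be handed sorted, shuffled, or identity orders without exposing its own internals for subclassing.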
Re: questions about solr.xml/solr.properties
Cool - you probably also want to move another entry under that? Usually I've been using Additional Changes: below for this: * SOLR-4543: setting shardHandlerFactory in solr.xml/solr.properties does not work. (Ryan Ernst, Robert Muir via Erick Erickson) That's not a released bug right? If not we don't want it to appear so - we still want to give credit and have the tracking for trunk users I think, that's why I use the Additional Changes for follow on JIRAs to large CHANGES. - Mark On Mar 15, 2013, at 9:08 AM, Erick Erickson erickerick...@gmail.com wrote: Added some bits to CHANGES.txt for SOLR-4196 On Thu, Mar 14, 2013 at 10:01 PM, Erick Erickson erickerick...@gmail.com wrote: OK, I'll see what I can put in tomorrow. It won't be comprehensive, probably just refer to the Wiki page after a very brief explanation. On Thu, Mar 14, 2013 at 9:45 PM, Mark Miller markrmil...@gmail.com wrote: Okay - leaving it out on purpose can get kind of confusing - someone that wanted to look at the state of trunk right now might think, oh, only bug fixes and very minor changes, but surprise, there is actually a major structural change. I think we should try and keep CHANGES up to date with reality for our 'trunk', '4x' users. - Mark On Mar 14, 2013, at 9:24 PM, Erick Erickson erickerick...@gmail.com wrote: bq: Is there any mention of this in CHANGES yet Nope, it's one of the JIRAs I've assigned to myself. SOLR-4542. I have started a Wiki page here: http://wiki.apache.org/solr/Core%20Discovery%20%284.3%20and%20beyond%29 linked to from here: http://wiki.apache.org/solr/CoreAdmin#Configuration But I've been waiting for the dust to settle before fleshing this out much. Although the more exposure it gets, I suppose the more chance people will have to comment on it. If we're agreed that solr.properties is the way to go, then I'll put something in CHANGES Real Soon Now and perhaps let the Wiki page evolve in fits and starts. 
On Thu, Mar 14, 2013 at 8:43 PM, Mark Miller markrmil...@gmail.com wrote: Is there any mention of this in CHANGES yet erick? Was just browsing for it… - Mark On Mar 14, 2013, at 6:37 PM, Jan Høydahl jan@cominvent.com wrote: solr.yml :-) -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com 14. mars 2013 kl. 22:02 skrev Yonik Seeley yo...@lucidworks.com: On Thu, Mar 14, 2013 at 3:46 PM, Robert Muir rcm...@gmail.com wrote: It seems to me there are two changes involved: 1. ability to auto-discover cores from the filesystem so you don't need to explicitly list them 2. changing .xml format to .properties These are indeed completely independent. My main concern/goal in this area has been #1. I assume #2 is just because developer tastes have been shifting away from XML, but like you I worry about what happens for config that needs more structure. -Yonik http://lucidworks.com
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #274: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/274/

1 test failed. REGRESSION: org.apache.solr.cloud.SyncSliceTest.testDistribSearch

Error Message: null

Stack Trace:
java.lang.AssertionError: null
at __randomizedtesting.SeedInfo.seed([6894893B8E7703A5:E9720723F9286399]:0)
at org.junit.Assert.fail(Assert.java:92)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertTrue(Assert.java:54)
at org.apache.solr.cloud.SyncSliceTest.doTest(SyncSliceTest.java:196)

Build Log: [...truncated 23171 lines...]
[jira] [Commented] (SOLR-4585) The Collections API validates numShards with < 0 but should use <= 0.
[ https://issues.apache.org/jira/browse/SOLR-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603447#comment-13603447 ] Commit Tag Bot commented on SOLR-4585: -- [branch_4x commit] Mark Robert Miller http://svn.apache.org/viewvc?view=revision&revision=1456981 SOLR-4585: The Collections API validates numShards with < 0 but should use <= 0.

The Collections API validates numShards with < 0 but should use <= 0. - Key: SOLR-4585 URL: https://issues.apache.org/jira/browse/SOLR-4585 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0
Re: questions about solr.xml/solr.properties
On Fri, Mar 15, 2013 at 11:12 AM, Mark Miller markrmil...@gmail.com wrote: Cool - you probably also want to move another entry under that? Usually I've been using Additional Changes: below for this: * SOLR-4543: setting shardHandlerFactory in solr.xml/solr.properties does not work. (Ryan Ernst, Robert Muir via Erick Erickson) That's not a released bug right? If not we don't want it to appear so - we still want to give credit and have the tracking for trunk users I think, that's why I use the Additional Changes for follow on JIRAs to large CHANGES.

It's a released bug: it looks like it never worked from solr.xml (see 4.2 src code CoreContainer:707)
Re: questions about solr.xml/solr.properties
On Mar 15, 2013, at 11:17 AM, Robert Muir rcm...@gmail.com wrote: It's a released bug: it looks like it never worked from solr.xml (see 4.2 src code CoreContainer:707)

Ah okay - it would ignore the shard handler class before and just accept settings. - Mark
[jira] [Commented] (SOLR-4318) NullPointerException encountered when /select query on solr.TextField.
[ https://issues.apache.org/jira/browse/SOLR-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603461#comment-13603461 ] Erick Erickson commented on SOLR-4318: -- Actually, on a second look I think the original patch is the right thing to do. There is actually a default tokenizer assigned to a TextField, admittedly it is rudimentary, but it's still defined. A note in the docs would be good though.

NullPointerException encountered when /select query on solr.TextField. -- Key: SOLR-4318 URL: https://issues.apache.org/jira/browse/SOLR-4318 Project: Solr Issue Type: Bug Components: Build Affects Versions: 4.0 Reporter: Junaid Surve Assignee: Erick Erickson Labels: query, select Attachments: SOLR-4318.patch

I have two fields, one is title and the other is description in my Solr schema like -

Type - <fieldType name="text" class="solr.TextField" positionIncrementGap="100"/>
Declaration - <field name="description" type="text" indexed="true" stored="true"/>

without any tokenizer or filter. On querying /select?q=description:myText it works. However when I add a '*' it fails. Failure scenario - /select?q=description:* /select?q=description:myText* .. etc

solrconfig.xml -
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">title</str>
  </lst>
</requestHandler>
[jira] [Comment Edited] (SOLR-4318) NullPointerException encountered when /select query on solr.TextField.
[ https://issues.apache.org/jira/browse/SOLR-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603461#comment-13603461 ] Erick Erickson edited comment on SOLR-4318 at 3/15/13 3:50 PM: --- Actually, on a second look I think the original patch is the right thing to do. There is actually a default tokenizer assigned to a TextField, admittedly it is rudimentary, but it's still defined. So my original statement was entirely wrong, a TextField type with no analysis chain is perfectly correct, it was entirely a problem with the MultiTermAware code. Fixing it with the original patch is fine. I'll add a test too.

was (Author: erickerickson): Actually, on a second look I think the original patch is the right thing to do. There is actually a default tokenizer assigned to a TextField, admittedly it is rudimentary, but it's still defined. A note in the docs would be good though.
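The shape of the fix Erick describes can be sketched as a null-safe fallback (hypothetical code, not Solr's actual MultiTermAware handling): when a field type declares no explicit analysis chain, fall back to a default analyzer rather than dereferencing null on a wildcard query.

```java
// Toy sketch of a null-safe multi-term analyzer lookup (hypothetical,
// not Solr's actual FieldType code).
public class MultiTermSketch {
    static final String DEFAULT_ANALYZER = "KeywordTokenizer"; // stand-in default

    // Returns the analyzer name to use for wildcard ("multi-term") queries.
    // Before the fix, the code assumed a chain was always declared and hit
    // an NPE when the schema had a bare <fieldType class="solr.TextField"/>.
    static String multiTermAnalyzer(String declaredChain) {
        return declaredChain != null ? declaredChain : DEFAULT_ANALYZER;
    }

    public static void main(String[] args) {
        System.out.println(multiTermAnalyzer(null));       // falls back to default
        System.out.println(multiTermAnalyzer("Standard")); // uses declared chain
    }
}
```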
Re: Solr Log4j?
Strongly in favor here! I'm currently repackaging the war to use log4j bindings and would love to avoid that step. Really it seems like this should be an option in the build (since it must be in the war, or at least if slf4j is in the war, then so must the binding). Like if there were an ant property you could set with -D that says which binding you want? On Fri, Mar 15, 2013 at 9:38 AM, Ryan McKinley ryan...@gmail.com wrote: We have discussed a few times shipping with or supporting log4j rather than defaulting to JUL I just updated SOLR-3706 to do this. What are peoples thoughts on this issue now? Thanks Ryan
Re: Solr Log4j?
Really it seems like this should be an option in the build Note that it already is, and has been for a while. There is an option to build a .war with no logging bindings included: ant dist-war-excl-slf4j
[jira] [Commented] (SOLR-3706) Ship setup to log with log4j.
[ https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603503#comment-13603503 ] Mark Miller commented on SOLR-3706: --- I'd prefer a solution closer to what we are discussing. There are also other things to look at I think:
* existing log ant targets - are they now broken?
* existing pre logging setup in jetty.xml and stuff in the README
* an example log4j conf file rather than the java util one
* consider switching our tests to it so that devs actually deal with what users will see
I think that Jan is on the right track for how we should tackle the switch.

Ship setup to log with log4j. - Key: SOLR-3706 URL: https://issues.apache.org/jira/browse/SOLR-3706 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0 Attachments: SOLR-3706-solr-log4j.patch

Currently we default to java util logging and it's terrible in my opinion.
* Its simple built-in logger is a 2 line logger.
* You have to jump through hoops to use your own custom formatter with jetty - either putting your class in the start.jar or other pain in the butt solutions.
* It can't roll files by date out of the box.
I'm sure there are more issues, but those are the ones annoying me now. We should switch to log4j - it's much nicer and it's easy to get a nice single line format and roll by date, etc. If someone wants to use JUL they still can - but at least users could start with something decent.
[jira] [Commented] (SOLR-3706) Ship setup to log with log4j.
[ https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603512#comment-13603512 ] Ryan McKinley commented on SOLR-3706: - Does that mean:
- no logging slf4j/log4j in .war (like dist-war-excl-slf4j)
- put logging files slf4j+log4j in example/lib

I'm a big +1 to that, but last we discussed, inertia pointed towards keeping concrete logging dependencies in solr.war. Ship setup to log with log4j. - Key: SOLR-3706 URL: https://issues.apache.org/jira/browse/SOLR-3706 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0 Attachments: SOLR-3706-solr-log4j.patch
[jira] [Commented] (SOLR-3706) Ship setup to log with log4j.
[ https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603516#comment-13603516 ] Christian Moen commented on SOLR-3706: -- {quote} Mark, have you tried Logback? That's a good logging implementation; arguably a better one. {quote} David and Mark, I believe [Log4J 2|http://logging.apache.org/log4j/2.x/] addresses a lot of the weaknesses in Log4J 1.x also addressed by Logback. However, Log4J 2 hasn't been released yet. To me it sounds like a good idea to use Log4J 1.x now and move to Log4J 2 in the future. Ship setup to log with log4j. - Key: SOLR-3706 URL: https://issues.apache.org/jira/browse/SOLR-3706 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0 Attachments: SOLR-3706-solr-log4j.patch
[jira] [Commented] (SOLR-3706) Ship setup to log with log4j.
[ https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603517#comment-13603517 ] Mark Miller commented on SOLR-3706: --- bq. but last we discussed inertia pointed towards keeping concrete logging dependencies in solr.war

Why? I don't see anyone arguing for that here. There seem to be plenty of advantages to getting it out of the webapp and plenty of downsides to having it in. Where and what is someone else arguing? Ship setup to log with log4j. - Key: SOLR-3706 URL: https://issues.apache.org/jira/browse/SOLR-3706 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0 Attachments: SOLR-3706-solr-log4j.patch
[jira] [Resolved] (SOLR-4561) CachedSqlEntityProcessor with parameterized query is broken
[ https://issues.apache.org/jira/browse/SOLR-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Dyer resolved SOLR-4561. -- Resolution: Duplicate This is the same as SOLR-3857. CachedSqlEntityProcessor with parameterized query is broken --- Key: SOLR-4561 URL: https://issues.apache.org/jira/browse/SOLR-4561 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 4.1 Reporter: Sudheer Prem Original Estimate: 1m Remaining Estimate: 1m
When child entities are created and the child entity is provided with a parameterized query as below,
{code:xml}
<entity name="x" query="select * from x">
  <entity name="y" query="select * from y where xid=${x.id}" processor="CachedSqlEntityProcessor"/>
</entity>
{code}
the entity processor always returns the result from the first query even though the parameter is changed. It is happening because the EntityProcessorBase.getNext() method doesn't reset the query and rowIterator after calling the DIHCacheSupport.getCacheData() method. This can be fixed by changing the else block in the getNext() method of EntityProcessorBase from
{code}
else {
  return cacheSupport.getCacheData(context, query, rowIterator);
}
{code}
to the code mentioned below:
{code}
else {
  Map<String,Object> cacheData = cacheSupport.getCacheData(context, query, rowIterator);
  query = null;
  rowIterator = null;
  return cacheData;
}
{code}
Update: But then, the caching doesn't seem to be working...
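The stale-iterator behavior described in this report can be sketched outside of Solr. The class below is a minimal, hypothetical model of the bug - the names only loosely mirror EntityProcessorBase and DIHCacheSupport, and this is not actual DataImportHandler code:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Toy model: once `query`/`rowIterator` are set, an unpatched processor keeps
// draining the old iterator instead of consulting the cache for the new
// parameterized query built from the next parent row.
public class CacheResetDemo {
    // stands in for the DIH cache: each parameterized query maps to its rows
    static final Map<String, List<String>> CACHE = new HashMap<>();
    static {
        CACHE.put("select * from y where xid=1", Arrays.asList("row-for-x1"));
        CACHE.put("select * from y where xid=2", Arrays.asList("row-for-x2"));
    }

    private String query;                 // the stale state at the heart of the bug
    private Iterator<String> rowIterator;
    private final boolean patched;        // true = reset fields after a cache read

    public CacheResetDemo(boolean patched) { this.patched = patched; }

    public String getNext(String currentQuery) {
        if (query == null) {              // only starts a new query when reset
            query = currentQuery;
            rowIterator = CACHE.get(currentQuery).iterator();
        }
        String row = rowIterator.hasNext() ? rowIterator.next() : null;
        if (patched) {
            query = null;                 // the proposed fix: forget the old query
            rowIterator = null;           // so the next call re-consults the cache
        }
        return row;
    }

    public static void main(String[] args) {
        CacheResetDemo buggy = new CacheResetDemo(false);
        System.out.println(buggy.getNext("select * from y where xid=1")); // row-for-x1
        System.out.println(buggy.getNext("select * from y where xid=2")); // null: stale iterator, xid=2 never runs

        CacheResetDemo fixed = new CacheResetDemo(true);
        System.out.println(fixed.getNext("select * from y where xid=1")); // row-for-x1
        System.out.println(fixed.getNext("select * from y where xid=2")); // row-for-x2
    }
}
```

The "Update" caveat in the report also shows up in this model: clearing the fields after every read means each parent row pays a fresh cache lookup, which is correct but defeats iterator reuse.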
[jira] [Commented] (SOLR-3706) Ship setup to log with log4j.
[ https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603532#comment-13603532 ] Ryan McKinley commented on SOLR-3706: - bq. Where and what is someone else arguing?

I am remembering discussion from a long time ago (a year+), the last time I really paid attention to this discussion. I can dig it up, but am much happier to drop it from the .war! I will switch things around to pull the logging jars out of the war and put them in the example lib folder. Are there concrete patches on other issues that I ignored? The key stuff in this one is for the admin UI managing log4j levels. Ship setup to log with log4j. - Key: SOLR-3706 URL: https://issues.apache.org/jira/browse/SOLR-3706 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0 Attachments: SOLR-3706-solr-log4j.patch
[jira] [Commented] (SOLR-3706) Ship setup to log with log4j.
[ https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603537#comment-13603537 ] Mark Miller commented on SOLR-3706: --- bq. I can dig it up, but am much happier to drop it from the .war!

Great, this is my feeling - a lot has changed in a year - we have a new discussion here that seems to be making progress and has a lot of visibility. If someone wants to toss a monkey wrench in, the lane is wide open :) Ship setup to log with log4j. - Key: SOLR-3706 URL: https://issues.apache.org/jira/browse/SOLR-3706 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0 Attachments: SOLR-3706-solr-log4j.patch
[jira] [Commented] (SOLR-3706) Ship setup to log with log4j.
[ https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603539#comment-13603539 ] Mark Miller commented on SOLR-3706: --- bq. I will switch things around to pull the logging jars out of the war and put them in the example lib folder

I'm happy to help out too - I've been meaning to get to this and just have not yet. Happy to let someone else do it, but if you need any help, I have a strong interest in making this work well. Ship setup to log with log4j. - Key: SOLR-3706 URL: https://issues.apache.org/jira/browse/SOLR-3706 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0 Attachments: SOLR-3706-solr-log4j.patch
[jira] [Resolved] (SOLR-4585) The Collections API validates numShards with < 0 but should use <= 0.
[ https://issues.apache.org/jira/browse/SOLR-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-4585. --- Resolution: Fixed The Collections API validates numShards with < 0 but should use <= 0. - Key: SOLR-4585 URL: https://issues.apache.org/jira/browse/SOLR-4585 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0
[jira] [Commented] (SOLR-4478) Allow cores to specify a named config set
[ https://issues.apache.org/jira/browse/SOLR-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603565#comment-13603565 ] Erick Erickson commented on SOLR-4478: -- Starting on this finally; a couple of points for discussion. What do we do with each of these if we find a configSet entry in the core.properties file?
* instanceDir - nothing to do here, except we don't look here for configuration files
* dataDir - again, nothing. The meaning remains unchanged.
* config - check that it exists in the config set and blow up if we don't find it.
* schema - treat as config.

For config and schema, it hurts my head to think of resolving relative paths, absolute paths, the relationship to solr_home, the relationship of referenced files (stopwords, etc.). At least for the first cut I want to allow the config and schema files to be a different name, but that's it. And require that they live in the configSet directory. Unless all of this just automagically happens through the resource loader. The properties entry in the core.properties file (doesn't depend on configSet) - does it make sense to have it any more at all? I propose we deprecate it. Is there a convenient place in the SolrCloud code that I can rip off? I'll look, but I don't want to re-invent the wheel if I miss it. Allow cores to specify a named config set - Key: SOLR-4478 URL: https://issues.apache.org/jira/browse/SOLR-4478 Project: Solr Issue Type: Improvement Affects Versions: 4.2, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Part of moving forward to the new way, after SOLR-4196 etc... I propose an additional parameter specified on the core node in solr.xml or as a parameter in the discovery-mode core.properties file, call it configSet, where the value provided is a path to a directory, either absolute or relative. Really, this is as though you copied the conf directory somewhere to be used by more than one core.
Straw-man: There will be a directory solr_home/configsets which will be the default. If the configSet parameter is, say, myconf, then I'd expect a directory named myconf to exist in solr_home/configsets, which would look something like:
* solr_home/configsets/myconf/schema.xml
* solrconfig.xml
* stopwords.txt
* velocity
* velocity/query.vm
* etc.

If multiple cores used the same configSet, schema, solrconfig, etc. would all be shared (i.e. shareSchema=true would be assumed). I don't see a good use-case for _not_ sharing schemas, so I don't propose to allow this to be turned off. Hmmm, what if shareSchema is explicitly set to false in the solr.xml or properties file? I'd guess it should be honored, but maybe log a warning? Mostly I'm putting this up for comments. I know that there are already thoughts about how this all should work floating around, so before I start any work on this I thought I'd at least get an idea of whether this is the way people are thinking about going. configSet can be either a relative or absolute path; if relative, it's assumed to be relative to solr_home. Thoughts?
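The straw-man lookup rule above can be sketched in a few lines. The class and method names here are illustrative assumptions, not Solr's implementation, and this takes one possible reading of the proposal: absolute configSet values are honored verbatim, while relative ones are looked up under solr_home/configsets.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical resolver for the proposed configSet core property.
public class ConfigSetResolver {
    // An absolute configSet path is used as-is; a relative one is expected
    // to name a directory under <solr_home>/configsets.
    public static Path resolveConfigSet(Path solrHome, String configSet) {
        Path p = Paths.get(configSet);
        return p.isAbsolute()
                ? p.normalize()
                : solrHome.resolve("configsets").resolve(p).normalize();
    }

    public static void main(String[] args) {
        Path home = Paths.get("/var/solr");
        // relative: resolved under solr_home/configsets
        System.out.println(resolveConfigSet(home, "myconf"));
        // absolute: honored verbatim, e.g. a conf dir shared across installs
        System.out.println(resolveConfigSet(home, "/etc/solr/shared-conf"));
    }
}
```

With a rule this small, the shareSchema question reduces to whether two cores resolve to the same directory - identical resolved paths are what would make sharing the parsed schema safe.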
[jira] [Commented] (SOLR-4318) NullPointerException encountered when /select query on solr.TextField.
[ https://issues.apache.org/jira/browse/SOLR-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603578#comment-13603578 ] Commit Tag Bot commented on SOLR-4318: -- [trunk commit] Erick Erickson http://svn.apache.org/viewvc?view=revision&revision=1457032 SOLR-4318 NPE when doing a wildcard query on a TextField with the default analysis chain NullPointerException encountered when /select query on solr.TextField. -- Key: SOLR-4318 URL: https://issues.apache.org/jira/browse/SOLR-4318 Project: Solr Issue Type: Bug Components: Build Affects Versions: 4.0 Reporter: Junaid Surve Assignee: Erick Erickson Labels: query, select Attachments: SOLR-4318.patch
I have two fields, one is title and the other is description, in my Solr schema like -
Type -
<fieldType name="text" class="solr.TextField" positionIncrementGap="100"/>
Declaration -
<field name="description" type="text" indexed="true" stored="true"/>
without any tokenizer or filter. On querying /select?q=description:myText it works. However when I add a '*' it fails. Failure scenarios -
/select?q=description:*
/select?q=description:myText*
.. etc.
solrconfig.xml -
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">title</str>
  </lst>
</requestHandler>
Re: Solr Log4j?
How about logback rather than log4j? http://logback.qos.ch/reasonsToSwitch.html Erik On Mar 15, 2013, at 12:38, Ryan McKinley wrote: We have discussed a few times shipping with or supporting log4j rather than defaulting to JUL. I just updated SOLR-3706 to do this. What are people's thoughts on this issue now? Thanks Ryan
[jira] [Resolved] (SOLR-4569) waitForReplicasToComeUp should bail right away if it doesn't see the expected slice in the clusterstate rather than waiting.
[ https://issues.apache.org/jira/browse/SOLR-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-4569. --- Resolution: Fixed waitForReplicasToComeUp should bail right away if it doesn't see the expected slice in the clusterstate rather than waiting. Key: SOLR-4569 URL: https://issues.apache.org/jira/browse/SOLR-4569 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0
[jira] [Resolved] (SOLR-4570) Even if an explicit shard id is used, ZkController#preRegister should still wait to see the shard id in its current ClusterState.
[ https://issues.apache.org/jira/browse/SOLR-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-4570. --- Resolution: Fixed Even if an explicit shard id is used, ZkController#preRegister should still wait to see the shard id in its current ClusterState. -- Key: SOLR-4570 URL: https://issues.apache.org/jira/browse/SOLR-4570 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0
[jira] [Resolved] (SOLR-4571) SolrZkClient#setData should return Stat object.
[ https://issues.apache.org/jira/browse/SOLR-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-4571. --- Resolution: Fixed SolrZkClient#setData should return Stat object. --- Key: SOLR-4571 URL: https://issues.apache.org/jira/browse/SOLR-4571 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0
[jira] [Resolved] (SOLR-4568) The lastPublished state check before becoming a leader is not working.
[ https://issues.apache.org/jira/browse/SOLR-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-4568. --- Resolution: Fixed The lastPublished state check before becoming a leader is not working. -- Key: SOLR-4568 URL: https://issues.apache.org/jira/browse/SOLR-4568 Project: Solr Issue Type: Bug Affects Versions: 4.2 Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.3, 5.0
[jira] [Resolved] (SOLR-4578) CoreAdminHandler#handleCreateAction gets a SolrCore and does not close it in SolrCloud mode when a core with the same name already exists.
[ https://issues.apache.org/jira/browse/SOLR-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-4578. --- Resolution: Fixed CoreAdminHandler#handleCreateAction gets a SolrCore and does not close it in SolrCloud mode when a core with the same name already exists. -- Key: SOLR-4578 URL: https://issues.apache.org/jira/browse/SOLR-4578 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0
{noformat}
if (coreContainer.getZkController() != null) {
  if (coreContainer.getCore(name) != null) {
    log.info("Re-creating a core with existing name is not allowed in cloud mode");
    throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
        "Core with name '" + name + "' already exists.");
  }
}
{noformat}
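The leak in the quoted check comes from getCore() handing back a reference-counted SolrCore that the caller never releases. A toy model of that contract, with illustrative names rather than the real Solr API:

```java
// Minimal model of a reference-counted resource: every lookup (open) must be
// balanced by a close(), or the count never returns to its baseline and the
// core can never be fully shut down.
public class CoreRefLeakDemo {
    static class RefCountedCore {
        int refs = 1;                         // the container's own reference
        RefCountedCore open() { refs++; return this; }
        void close() { refs--; }
    }

    // Simulates the existence check; `fixed` releases the looked-up reference
    // before reporting the duplicate name.
    public static int remainingRefs(boolean fixed) {
        RefCountedCore core = new RefCountedCore();
        RefCountedCore existing = core.open(); // like getCore(name): bumps the count
        if (existing != null) {
            if (fixed) {
                existing.close();              // release before throwing "already exists"
            }
            // ... the real code would throw SolrException.BAD_REQUEST here ...
        }
        return core.refs; // 1 when balanced, 2 when the lookup leaked
    }

    public static void main(String[] args) {
        System.out.println(remainingRefs(false)); // 2: leaked reference
        System.out.println(remainingRefs(true));  // 1: balanced
    }
}
```

The fix is purely about ordering: the reference taken by the lookup has to be released on the error path too, not just on the success path.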
[jira] [Updated] (SOLR-4073) Overseer will miss operations in some cases for OverseerCollectionProcessor
[ https://issues.apache.org/jira/browse/SOLR-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-4073: -- Fix Version/s: 5.0 Affects Version/s: 4.1 4.2 Overseer will miss operations in some cases for OverseerCollectionProcessor Key: SOLR-4073 URL: https://issues.apache.org/jira/browse/SOLR-4073 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0, 4.1, 4.2 Environment: Solr cloud Reporter: Raintung Li Assignee: Mark Miller Fix For: 4.3, 5.0 Attachments: patch-4073 Original Estimate: 168h Remaining Estimate: 168h
One overseer disconnects from ZooKeeper, but its overseer thread still handles the request (A) at the head of the DistributedQueue. Example: the old overseer thread reconnects to ZooKeeper and tries to remove the top request via workQueue.remove(). Meanwhile, another server takes over the overseer role because the old overseer disconnected. Its overseer thread starts, handles request (A) again, removes request (A) from the queue, and then tries to get the next top request (B), but doesn't get it. At this point the old overseer has reconnected to ZooKeeper and removed the top request from the queue - the top request is now B, so it is removed by the old overseer server. The new overseer server never processes request B, because it was deleted by the old overseer server; request B's operations are lost. A better approach: distributedQueue.peek should return the request's ID so the caller can do workQueue.remove(ID), rather than removing whatever is at the top.
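The race above can be reproduced with a toy work queue. The names below are illustrative, not the actual Overseer or DistributedQueue API: two overseers both believe they own the queue, the stalled one issues its remove late, and a head-based remove drops the wrong request.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy reproduction: removeHead() drops whatever is first *now*, while
// removeById() only drops the request the caller actually peeked.
public class OverseerQueueDemo {
    static class WorkQueue {
        final Deque<String> items = new ArrayDeque<>();
        String peek() { return items.peekFirst(); }
        String removeHead() { return items.pollFirst(); }          // unsafe under the race
        boolean removeById(String id) { return items.remove(id); } // no-op if already gone
    }

    // Both the new overseer and the stalled old overseer issue a remove for
    // the request peeked as A; returns how many requests survive.
    public static int survivors(boolean removeById) {
        WorkQueue q = new WorkQueue();
        q.items.add("A");
        q.items.add("B");
        String peeked = q.peek();       // old overseer saw A, then stalled
        if (removeById) {
            q.removeById(peeked);       // new overseer removes A
            q.removeById(peeked);       // old overseer's late remove: harmless no-op
        } else {
            q.removeHead();             // new overseer removes A
            q.removeHead();             // old overseer removes the new head: B is lost
        }
        return q.items.size();
    }

    public static void main(String[] args) {
        System.out.println(survivors(false)); // 0: request B silently dropped
        System.out.println(survivors(true));  // 1: B survives for the new overseer
    }
}
```

Keying the removal on the ID returned by peek makes the stale overseer's duplicate remove idempotent, which is exactly what the last paragraph of the report proposes.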
[jira] [Updated] (SOLR-3706) Ship setup to log with log4j.
[ https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McKinley updated SOLR-3706: Attachment: SOLR-3706-solr-log4j.patch Mark, can you take a look at this? It removes all logging from solr.war and *tries* to copy files to example/lib - for some reason the log4j dependency does not copy into that folder. Ship setup to log with log4j. - Key: SOLR-3706 URL: https://issues.apache.org/jira/browse/SOLR-3706 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0 Attachments: SOLR-3706-solr-log4j.patch, SOLR-3706-solr-log4j.patch
Re: Solr Log4j?
logback is great too. I'm happy with any of them -- assuming we stick to SLF4J bindings. On Fri, Mar 15, 2013 at 10:50 AM, Erik Hatcher erik.hatc...@gmail.com wrote: How about logback rather than log4j? http://logback.qos.ch/reasonsToSwitch.html Erik On Mar 15, 2013, at 12:38, Ryan McKinley wrote: We have discussed a few times shipping with or supporting log4j rather than defaulting to JUL. I just updated SOLR-3706 to do this. What are people's thoughts on this issue now? Thanks Ryan
Re: Solr Log4j?
On Fri, Mar 15, 2013 at 1:50 PM, Erik Hatcher erik.hatc...@gmail.com wrote: How about logback rather than log4j? http://logback.qos.ch/reasonsToSwitch.html +1, looks promising. -Yonik http://lucidworks.com
[jira] [Created] (SOLR-4589) 4.x + enableLazyFieldLoading + large multivalued fields + varying fl = pathological CPU load & response time
Hoss Man created SOLR-4589: -- Summary: 4.x + enableLazyFieldLoading + large multivalued fields + varying fl = pathological CPU load & response time Key: SOLR-4589 URL: https://issues.apache.org/jira/browse/SOLR-4589 Project: Solr Issue Type: Bug Affects Versions: 4.2, 4.1, 4.0 Reporter: Hoss Man
Following up on a [user report of extreme CPU usage in 4.1|http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201302.mbox/%3c1362019882934-4043543.p...@n3.nabble.com%3E], I've discovered that the following combination of factors can result in extreme CPU usage and excessive HTTP response times...
* Solr 4.x (tested 3.6.1, 4.0.0, and 4.2.0)
* enableLazyFieldLoading == true (included in example solrconfig.xml)
* documents with a large number of values in multivalued fields (eg: tested ~10-15K values)
* multiple requests returning the same doc with different fl lists

I haven't dug into the root cause yet, but the essential observation is: if lazy loading is used in 4.x, then once a document has been fetched with an initial fl list X, subsequent requests for that document using a different fl list Y can be many orders of magnitude slower (while pegging the CPU) -- even if those same requests using fl Y uncached (or w/o lazy loading) would be extremely fast.
[jira] [Updated] (SOLR-4589) 4.x + enableLazyFieldLoading + large multivalued fields + varying fl = pathological CPU load & response time
[ https://issues.apache.org/jira/browse/SOLR-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-4589: --- Attachment: test-just-queries.sh test-just-queries.out__4.0.0_mmap_lazy_using36index.txt test.sh test.out__4.2.0_nio_nolazy.txt test.out__4.2.0_nio_lazy.txt test.out__4.2.0_mmap_nolazy.txt test.out__4.2.0_mmap_lazy.txt test.out__4.0.0_nio_nolazy.txt test.out__4.0.0_nio_lazy.txt test.out__4.0.0_mmap_nolazy.txt test.out__4.0.0_mmap_lazy.txt test.out__3.6.1_nio_nolazy.txt test.out__3.6.1_nio_lazy.txt test.out__3.6.1_mmap_nolazy.txt test.out__3.6.1_mmap_lazy.txt
The attached files include a test.sh script that:
* creates some data where fields have a large number of values
* loads the data into solr
* execs 2 queries for a single doc using two different fl options
* triggers a commit to flush caches
* execs the same two queries in a different order

Also attached are the raw results of running this script on my Thinkpad T430s against the example jetty solr configs, where the version of solr, lazy field loading, and the directory impl were varied...
* version of solr
** 3.6.1
** 4.0.0
** 4.2.0
* lazy field loading:
** lazy: default example configs
** nolazy: perl -i -pe 's{enableLazyFieldLoading>true}{enableLazyFieldLoading>false}' solrconfig.xml
* directory impl:
** mmap: java -Dsolr.directoryFactory=solr.MMapDirectoryFactory -jar start.jar
** nio: java -Dsolr.directoryFactory=solr.NIOFSDirectoryFactory -jar start.jar

There was no apparent difference in the directory impl chosen, or between 4.0 and 4.2. Here are the summary results for 3.6 vs 4.0 using mmap...
|| step || 3.6 nolazy || 3.6 lazy || 4.0 nolazy || 4.0 lazy ||
| small fl | 0m0.308s | 0m0.998s | 0m0.260s | 0m0.202s |
| big fl | 0m0.178s | 0m0.263s | 0m0.084s | *16m15.735s* |
| commit | XXX | XXX | XXX | XXX |
| big fl | 0m0.157s | 0m0.118s | 0m0.218s | 0m0.133s |
| small fl | 0m0.036s | 0m0.035s | 0m0.049s | *3m2.814s* |

Also attached are the results of a single test I did running Solr 4.0 pointed at the configs/index built with 3.6.1 to rule out codec changes: it behaved essentially the same as the 4.0 tests that built the index from scratch.
[jira] [Commented] (SOLR-4589) 4.x + enableLazyFieldLoading + large multivalued fields + varying fl = pathological CPU load & response time
[ https://issues.apache.org/jira/browse/SOLR-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603649#comment-13603649 ] Yonik Seeley commented on SOLR-4589: I wonder if this could be related to index compression (and maybe the same block being repeatedly decompressed for each lazy field being accessed?)
Re: Solr Log4j?
Looking at SOLR-3706, there are really two issues that need consensus: 1. The default logging framework shipped in the examples 2. Removing an explicit logging framework from solr.war I am happy with either log4j or logback for #1 On Fri, Mar 15, 2013 at 11:24 AM, Yonik Seeley yo...@lucidworks.com wrote: On Fri, Mar 15, 2013 at 1:50 PM, Erik Hatcher erik.hatc...@gmail.com wrote: How about logback rather than log4j? http://logback.qos.ch/reasonsToSwitch.html +1, looks promising. -Yonik http://lucidworks.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4589) 4.x + enableLazyFieldLoading + large multivalued fields + varying fl = pathological CPU load & response time
[ https://issues.apache.org/jira/browse/SOLR-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603683#comment-13603683 ] Uwe Schindler commented on SOLR-4589: - bq. I wonder if this could be related to index compression (and maybe the same block being repeatedly decompressed for each lazy field being accessed?) This also happens in Solr 4.0, which had no compression. The reason here might be the changes in stored fields altogether. Lucene natively no longer has support for lazy field loading, but there is a backwards layer just for Solr in modules/misc (LazyDocument.java). The document does not use maps for lookup, so if you have many fields it's always a scan through the ArrayList of all fields in the document. The laziness in LazyDocument is only that the *whole* document is loaded lazily, but no longer single fields.
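Uwe's point about list scans versus map lookups can be sketched with a toy example (illustrative names only — this is not LazyDocument's actual code). With ~10-15K values per document, as in Hoss's report, every field access that scans an ArrayList touches thousands of entries, while a map lookup is constant time:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: looking up a field by scanning a list of all fields
// is O(n) per access; a map-backed index is O(1). Names are illustrative.
public class FieldLookupSketch {
    static String scanLookup(List<String[]> fields, String name) {
        for (String[] f : fields) {          // touches every entry until a match
            if (f[0].equals(name)) return f[1];
        }
        return null;
    }

    static String mapLookup(Map<String, String> fields, String name) {
        return fields.get(name);             // constant-time hash lookup
    }

    public static void main(String[] args) {
        List<String[]> list = new ArrayList<>();
        Map<String, String> map = new HashMap<>();
        for (int i = 0; i < 10_000; i++) {   // ~10-15K values, as in the report
            list.add(new String[] {"f" + i, "v" + i});
            map.put("f" + i, "v" + i);
        }
        // Both return the same value; the scan walks ~10K entries first.
        System.out.println(scanLookup(list, "f9999"));
        System.out.println(mapLookup(map, "f9999"));
    }
}
```

Repeated per-field accesses against the scan-based structure would multiply that O(n) cost by the number of fields requested, which is consistent with the pathological CPU usage observed only when a second fl list forces new field lookups.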
[jira] [Commented] (SOLR-4586) Remove maxBooleanClauses from Solr
[ https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603725#comment-13603725 ] Shawn Heisey commented on SOLR-4586: Mikhail, I was just going by what a committer told me in IRC. If that's wrong, then the patch shouldn't be applied and this issue can be closed. I tried the patched Solr out after removing maxBooleanClauses from my config, and a 1500-clause query fails, saying too many clauses. Dropping that to 1024 allows the query to complete. There were no results found, but it parsed and said numFound=0. If the information about Lucene no longer having such a limitation is correct, perhaps Solr's code needs updating? Remove maxBooleanClauses from Solr -- Key: SOLR-4586 URL: https://issues.apache.org/jira/browse/SOLR-4586 Project: Solr Issue Type: Improvement Affects Versions: 4.2 Reporter: Shawn Heisey Attachments: SOLR-4586.patch In the #solr IRC channel, I mentioned the maxBooleanClauses limitation to someone asking a question about queries. Mark Miller told me that maxBooleanClauses no longer applies, that the limitation was removed from Lucene sometime in the 3.x series. The config still shows up in the example even in the just-released 4.2. Checking through the source code, I found that the config option is parsed and the value stored in objects, but does not actually seem to be used by anything. I removed every trace of it that I could find, and all tests still pass.
[jira] [Commented] (LUCENE-4707) Track file reference kept by readers that are opened through the writer
[ https://issues.apache.org/jira/browse/LUCENE-4707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603737#comment-13603737 ] Jessica Cheng commented on LUCENE-4707: --- Hi Michael, I did what you suggested, but I ran into a related problem: a race between the merge and the reader being returned, before I can protect the reference. In the log trace below, the thread that's executing IndexWriter.getReader gets stalled when maybeMerge is called, at which point the Lucene Merge Thread came in and deleted the files referred to by the segmentInfos that the getReader call has already cloned, but since getReader has not returned yet, those files were not protected (incRef'ed) yet and the Lucene Merge Thread was able to delete the files. (I'm guessing in this case the file was created and merged within a softCommit cycle so the previous NRT reader/searcher never had a reference to it.) Questions: 1. What's my best way to get around that? 2. How does the OS-level file protection help in this case, since the segmentInfos are just clone()ed in getReader and the call seems to just copy around references that are never registered in any way with the directory? Thanks so much for your help again. Log: BD 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]: applyDeletes: infos=[_41(4.1):C2006, _9p(4.1):C17686, _3s(4.1):C1163, _4d(4.1):c313, _3y(4.1):c365, _4b(4.1):c423, _4a(4.1):C881, _4c(4.1):c54, _4f(4.1):c186, _4e(4.1):c30, _4g(4.1):c3, _ao(4.1):C3734, _ch(4.1):C3464, _d1(4.1):c708, _dk(4.1):c269, _dh(4.1):c36, _di(4.1):c4, _dj(4.1):c47, _dm(4.1):c3, _dl(4.1):c1, _dn(4.1):c1, _do(4.1):c1, _dp(4.1):c1, _dq(4.1):c49, _dr(4.1):c15, _ds(4.1):c1, _du(4.1):c101, _dv(4.1):c27, _dw(4.1):c3, _dx(4.1):c1, _dy(4.1):c1, _e0(4.1):c1, _dz(4.1):c1] packetCount=1\ ... BD 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]: seg=_d1(4.1):c708 segGen=763 coalesced deletes=[CoalescedDeletes(termSets=1,queries=0)] newDelCount=0\ ... 
IW 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]: return reader version=1013 reader=StandardDirectoryReader(segments_3:1013:nrt _41(4.1):C2006 _9p(4.1):C17686 _3s(4.1):C1163 _4d(4.1):c313 _3y(4.1):c365 _4b(4.1):c423 _4a(4.1):C881 _4c(4.1):c54 _4f(4.1):c186 _4e(4.1):c30 _4g(4.1):c3 _ao(4.1):C3734 _ch(4.1):C3464 _d1(4.1):c708 _dk(4.1):c269 _dh(4.1):c36 _di(4.1):c4 _dj(4.1):c47 _dm(4.1):c3 _dl(4.1):c1 _dn(4.1):c1 _do(4.1):c1 _dp(4.1):c1 _dq(4.1):c49 _dr(4.1):c15 _ds(4.1):c1 _du(4.1):c101 _dv(4.1):c27 _dw(4.1):c3 _dx(4.1):c1 _dy(4.1):c1 _e0(4.1):c1 _dz(4.1):c1)\ DW 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]: commitScheduler-333 finishFullFlush success=true\ TMP 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]: findMerges: 33 segments\ ... TMP 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]: seg=_d1(4.1):c708 size=1.144 MB [merging] [floored]\ … IW 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]: registerMerge merging= [_4c, _do, _dl, _4f, _41, _dn, _4g, _4d, _di, _4a, _4b, _dh, _3y, _dp , _4e, _dk, _dj, _d1, _dm, _3s, ]\ ... CMS 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]: index: _41(4.1):C2006 _9p(4.1):C17686 _3s(4.1):C1163 _4d(4.1):c313 _3y(4.1):c365 _4b(4.1):c423 _4a(4.1):C881 _4c(4.1):c54 _4f(4.1):c186 _4e(4.1):c30 _4g(4.1):c3 _ao(4.1):C3734 _ch(4.1):C3464 _d1(4.1):c708 _dk(4.1):c269 _dh(4.1):c36 _di(4.1):c4 _dj(4.1):c47 _dm(4.1):c3 _dl(4.1):c1 _dn(4.1):c1 _do(4.1):c1 _dp(4.1):c1 _dq(4.1):c49 _dr(4.1):c15 _ds(4.1):c1 _du(4.1):c101 _dv(4.1):c27 _dw(4.1):c3 _dx(4.1):c1 _dy(4.1):c1 _e0(4.1):c1 _dz(4.1):c1\ CMS 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]: too many merges; stalling...\ IFD 129 [Thu Mar 14 17:41:24 PDT 2013; Lucene Merge Thread #20]: DecRef _d1.cfs: pre-decr count is 1\ IFD 129 [Thu Mar 14 17:41:24 PDT 2013; Lucene Merge Thread #20]: delete _d1.cfs\ ... ...at this point commitScheduler-333 tries to incRef _d1.cfs but it's too late. 
Track file reference kept by readers that are opened through the writer --- Key: LUCENE-4707 URL: https://issues.apache.org/jira/browse/LUCENE-4707 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.0 Environment: Mac OS X 10.8.2 and Linux 2.6.32 Reporter: Jessica Cheng We ran into a bug where files (mostly CFS) that are still referred to by our NRT reader/searcher are deleted by IndexFileDeleter. As far as I can see from the verbose logging and reading the code, it seems that the problem is the creation and merging of these CFS files between hard commits. The files referred to by hard commits are incRef’ed at commit checkpoints, so these files won’t be deleted until they are decRef’ed when the commit is deleted according to the
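The window Jessica describes can be illustrated with a toy reference-counting class (illustrative only — this is not IndexFileDeleter's actual code): a reader must successfully incRef a file before the last holder's decRef drives the count to zero, or the file is already deletable by the time the reader tries to protect it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hedged sketch of the race: incRef must win against the final decRef.
// Once the count reaches 0 the file is deleted and tryIncRef must fail.
public class RefCountSketch {
    final AtomicInteger count = new AtomicInteger(1); // creator holds one ref

    boolean tryIncRef() {
        int c;
        do {
            c = count.get();
            if (c == 0) return false;   // too late: file already deletable
        } while (!count.compareAndSet(c, c + 1));
        return true;
    }

    boolean decRef() {                   // true when the file may be deleted
        return count.decrementAndGet() == 0;
    }

    public static void main(String[] args) {
        RefCountSketch f = new RefCountSketch();
        System.out.println(f.tryIncRef()); // reader protects the file first: true
        System.out.println(f.decRef());    // merge drops its ref: false, reader still holds one
        System.out.println(f.decRef());    // reader closes: true, now deletable
    }
}
```

In the log above the order is reversed: the merge thread's decRef of _d1.cfs runs to zero (and deletes the file) while commitScheduler-333 is still stalled, so its later incRef attempt finds nothing left to protect.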
[jira] [Commented] (SOLR-4586) Remove maxBooleanClauses from Solr
[ https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603749#comment-13603749 ] Mark Miller commented on SOLR-4586: --- It's been a long time, but as far as I remember, this isn't supposed to be a problem anymore. It's still used to limit BQ's in Lucene, but Solr shouldn't be creating those large BQ's - I think it's possibly a bug if we are. I think for all normal cases we should be using the smart multi-term queries that were made to avoid this problem? I'd have to dig to be sure. I also thought I remembered Shawn saying in IRC that he confirmed that no code was reading this setting in Solr anymore.
[jira] [Commented] (SOLR-4586) Remove maxBooleanClauses from Solr
[ https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603751#comment-13603751 ] Mark Miller commented on SOLR-4586: --- Basically, the idea is that a user should not need this setting or we still have work to do.
[jira] [Updated] (SOLR-4318) NullPointerException encountered when /select query on solr.TextField.
[ https://issues.apache.org/jira/browse/SOLR-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-4318: - Attachment: SOLR-4138.patch Patch with CHANGES entry as well as code. NullPointerException encountered when /select query on solr.TextField. -- Key: SOLR-4318 URL: https://issues.apache.org/jira/browse/SOLR-4318 Project: Solr Issue Type: Bug Components: Build Affects Versions: 4.0 Reporter: Junaid Surve Assignee: Erick Erickson Labels: query, select Attachments: SOLR-4138.patch, SOLR-4318.patch I have two fields, one is title and the other is description in my Solr schema like - Type - <fieldType name="text" class="solr.TextField" positionIncrementGap="100"/> Declaration - <field name="description" type="text" indexed="true" stored="true"/> without any tokenizer or filter. On querying /select?q=description:myText it works. However when I add a '*' it fails. Failure scenarios - /select?q=description:* /select?q=description:myText* .. etc solrconfig.xml - <requestHandler name="/select" class="solr.SearchHandler"> <lst name="defaults"> <str name="echoParams">explicit</str> <int name="rows">10</int> <str name="df">title</str> </lst> </requestHandler>
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603758#comment-13603758 ] Andrew Muldowney commented on SOLR-2894: The issue lies in how the refinement requests were formatted and how they were parsed on the shard side. I've made changes that should alleviate this issue and I'll push out a patch soon. Implement distributed pivot faceting Key: SOLR-2894 URL: https://issues.apache.org/jira/browse/SOLR-2894 Project: Solr Issue Type: Improvement Reporter: Erik Hatcher Fix For: 4.3 Attachments: SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894-reworked.patch Following up on SOLR-792, pivot faceting currently only supports undistributed mode. Distributed pivot faceting needs to be implemented.
[jira] [Resolved] (SOLR-4318) NullPointerException encountered when /select query on solr.TextField.
[ https://issues.apache.org/jira/browse/SOLR-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-4318. -- Resolution: Fixed Fix Version/s: 5.0 4.3 Trunk: r1457032, 4x: r1457077. Thanks for reporting this Junaid!
[jira] [Commented] (SOLR-4586) Remove maxBooleanClauses from Solr
[ https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603762#comment-13603762 ] Shawn Heisey commented on SOLR-4586: bq. I'd have to dig to be sure. I also thought I remember shawn saying in irc that he confirmed that no code was reading this setting in solr anymore. I was wrong about the value not actually being used anywhere. I think that can be attributed to not grokking Lucene internals and having only a short history with Java. I have since located the following bit of code that is removed from SolrCore.java by my patch. At the time it didn't look like anything important. {code} BooleanQuery.setMaxClauseCount(boolean_query_max_clause_count); {code}
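The behavior Shawn hit can be sketched with a toy version of the clause-count guard (illustrative only — this is not Lucene's BooleanQuery code; the real class throws BooleanQuery.TooManyClauses when a clause is added past the limit that setMaxClauseCount configured):

```java
// Hedged sketch of a static clause-count limit: a 1500-clause query fails
// under the historical 1024 default, and passes once the limit is raised.
public class MaxClausesSketch {
    static int maxClauseCount = 1024;          // Solr's historical default

    static int countClauses(int requested) {
        if (requested > maxClauseCount)
            throw new RuntimeException("maxClauseCount is set to " + maxClauseCount);
        return requested;
    }

    public static void main(String[] args) {
        System.out.println(countClauses(1024)); // at the limit: allowed
        maxClauseCount = 2048;                  // what raising the config value does
        System.out.println(countClauses(1500)); // now allowed, matching Shawn's test
    }
}
```

This is why removing the maxBooleanClauses parsing from SolrCore without touching Lucene left the hard-coded 1024 default in force: the setter call was the only thing overriding it.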
[jira] [Commented] (LUCENE-4832) Unbounded getTopGroups for ToParentBlockJoinCollector
[ https://issues.apache.org/jira/browse/LUCENE-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603763#comment-13603763 ] Michael McCandless commented on LUCENE-4832: Hi Aleksey, reducing that method size would be nice! Can we just make it a new method (accumulate is good), instead of a new class? (And also the Integer.MAX_VALUE fix). I think this will be a good improvement... Unbounded getTopGroups for ToParentBlockJoinCollector - Key: LUCENE-4832 URL: https://issues.apache.org/jira/browse/LUCENE-4832 Project: Lucene - Core Issue Type: Improvement Components: modules/join Reporter: Aleksey Aleev Attachments: LUCENE-4832.patch The _ToParentBlockJoinCollector#getTopGroups_ method takes several arguments: {code:java} public TopGroupsInteger getTopGroups(ToParentBlockJoinQuery query, Sort withinGroupSort, int offset, int maxDocsPerGroup, int withinGroupOffset, boolean fillSortFields) {code} and one of them is {{maxDocsPerGroup}}, which specifies an upper bound on the number of child documents returned within each group. {{ToParentBlockJoinCollector}} collects and caches all child documents matched by the given {{ToParentBlockJoinQuery}} in {{OneGroup}} objects during search, so it is possible to create {{GroupDocs}} with all matched child documents instead of the subset bounded by {{maxDocsPerGroup}}. When you specify {{maxDocsPerGroup}}, new queues (I mean {{TopScoreDocCollector}}/{{TopFieldCollector}}) will be created for each group, with {{maxDocsPerGroup}} objects created within each queue, which can lead to redundant memory allocation when the number of child documents within a group is less than {{maxDocsPerGroup}}. I suppose there are many cases where you need all child documents matched by a query, so it would be nice to have the ability to get top groups with all matched child documents without unnecessary memory allocation. 
A possible solution is to pass a negative {{maxDocsPerGroup}} when you need all matched child documents within each group, and to check the {{maxDocsPerGroup}} value: if it is negative, create the queue with a size equal to the number of matched child documents; otherwise create the queue with a size equal to {{maxDocsPerGroup}}.
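The proposed sentinel check is small enough to sketch directly (illustrative names — this is not the actual patch):

```java
// Sketch of the proposal: a negative maxDocsPerGroup means "size the
// per-group queue to the actual number of matched child documents",
// avoiding over-allocation when groups are smaller than the bound.
public class GroupQueueSizeSketch {
    static int queueSize(int maxDocsPerGroup, int matchedChildDocs) {
        return maxDocsPerGroup < 0 ? matchedChildDocs : maxDocsPerGroup;
    }

    public static void main(String[] args) {
        System.out.println(queueSize(-1, 37)); // unbounded: use the actual count
        System.out.println(queueSize(10, 37)); // bounded: cap at 10
    }
}
```

Compared with passing Integer.MAX_VALUE (which would allocate a huge queue up front), the sentinel defers sizing until the per-group match count is known.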
[jira] [Commented] (SOLR-4586) Remove maxBooleanClauses from Solr
[ https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603770#comment-13603770 ] Mark Miller commented on SOLR-4586: --- Yup - that's the one. I tried finding a JIRA issue I was involved in from years ago about this setting, but couldn't dig it/them up. We worked hard to limit the problems it was causing Lucene and Solr users. I think it's kind of a crappy setting, always have, and it used to be a very common pain point before things got better. Anyone sane should be using multi-term queries that switch over to constant score and don't have this limitation. What did you do to trip this Shawn?
[jira] [Commented] (SOLR-4318) NullPointerException encountered when /select query on solr.TextField.
[ https://issues.apache.org/jira/browse/SOLR-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603779#comment-13603779 ] Commit Tag Bot commented on SOLR-4318: -- [branch_4x commit] Erick Erickson http://svn.apache.org/viewvc?view=revisionrevision=1457077 SOLR-4318 NPE when doing a wildcard query on a TextField with the default analysis chain
[jira] [Commented] (SOLR-4586) Remove maxBooleanClauses from Solr
[ https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603784#comment-13603784 ] Shawn Heisey commented on SOLR-4586: bq. What did you do to trip this Shawn? I was just warning someone on IRC about the existence of the 1024-clause limit, then you mentioned it doesn't exist any more. After that we discussed whether or not to remove it from Solr. And then there's us escaping now. -- Wheatley
[jira] [Updated] (SOLR-3857) DIH: SqlEntityProcessor with simple cache broken
[ https://issues.apache.org/jira/browse/SOLR-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Dyer updated SOLR-3857: - Attachment: SOLR-3857.patch Here is a working patch based on the fix Sudheer Prem suggested on SOLR-4561. All tests pass and it restores pre-3.6 functionality. The way this feature works (and always has) is by creating a new cache for every key. With the default cache impl, this means a 1-element SortedMap in memory in addition to your data. All of these 1-element caches are in turn kept in a map, keyed by the query text with tokens replaced. This is why Sudheer's fix needs to replace tokens first and only then check whether the cache exists: each version of the query gets its own cache. With SortedMapBackedCache (the default), this is merely a memory waste (and possibly a net gain if you end up caching far less data). But the point of the recent cache refactorings is to allow pluggable cache implementations, including those that persist data to disk, and clearly this behavior is not going to work for the general case. While the way it ought to work is easy to conceptualize, the DIH structure doesn't make it easy: the query's tokens get replaced several calls up the stack from the cache layer. Those who want this functionality can apply and build with this patch. But perhaps a better way is simply to put a subselect in your child entity query. For instance:

{code:xml}
<entity name="parent" query="SELECT * FROM PARENT" pk="ID">
  <entity name="child" cacheImpl="SortedMapBackedCache"
          query="SELECT * FROM CHILD WHERE CHILD_ID IN (SELECT CHILD_ID FROM PARENT)"/>
</entity>
{code}

Although this does not give you lazy loading, it does cause only the needed data to be cached.
DIH: SqlEntityProcessor with simple cache broken -- Key: SOLR-3857 URL: https://issues.apache.org/jira/browse/SOLR-3857 Project: Solr Issue Type: Bug Affects Versions: 3.6.1, 4.0-BETA Reporter: James Dyer Attachments: SOLR-3857.patch The wiki describes a usage of CachedSqlEntityProcessor like this:

{code:xml}
<entity name="y" query="select * from y where xid=${x.id}" processor="CachedSqlEntityProcessor"/>
{code}

This creates what the code refers to as a "simple" cache. Rather than building the entire cache up front, the cache is built on the fly. I think this has limited use cases, but it would be nice to preserve the feature if possible. Unfortunately this was not covered by any (effective) unit tests, and SOLR-2382 entirely broke the functionality for 3.6/4.0-alpha+. At first glance, the fix may not be entirely straightforward. This was found while writing tests for SOLR-3856. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
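The "one tiny cache per resolved query" shape described above can be sketched in plain Java. This is an illustrative stand-in for DIH's internals, not its actual code; the class name, the `${x.id}` token, and the string-valued rows are all assumptions made for the sketch:

```java
import java.util.*;
import java.util.function.Function;

// Sketch of DIH's "simple cache" as described above: one small cache per
// resolved query string. Token replacement must happen BEFORE the cache
// lookup, because each resolved version of the query gets its own cache.
public class SimpleCachePerQuery {
    private final Map<String, List<String>> caches = new HashMap<>();
    private final Function<String, List<String>> db; // stand-in for the JDBC call
    public int misses = 0;

    public SimpleCachePerQuery(Function<String, List<String>> db) { this.db = db; }

    // Replace the ${x.id} token with the current parent row's key.
    static String resolve(String template, String key) {
        return template.replace("${x.id}", key);
    }

    public List<String> lookup(String template, String key) {
        String resolved = resolve(template, key);     // replace tokens first...
        return caches.computeIfAbsent(resolved, q -> { // ...then check the cache
            misses++;
            return db.apply(q);
        });
    }
}
```

Repeated lookups for the same parent key hit the per-query cache; each new key builds (and keeps) another one-entry cache, which is the memory-waste behavior the comment above describes.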
[jira] [Created] (SOLR-4590) Collections API should return a nice error when not in SolrCloud mode.
Mark Miller created SOLR-4590: - Summary: Collections API should return a nice error when not in SolrCloud mode. Key: SOLR-4590 URL: https://issues.apache.org/jira/browse/SOLR-4590 Project: Solr Issue Type: Bug Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.3, 5.0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4707) Track file reference kept by readers that are opened through the writer
[ https://issues.apache.org/jira/browse/LUCENE-4707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603787#comment-13603787 ] Michael McCandless commented on LUCENE-4707: Hi Jessica, How about changing the approach in your Directory wrapper: instead of incRef'ing when you get an NRT reader, incRef whenever openInput is called, and refuse to delete the file if it's still held open by anything (throw an IOException in deleteFile; IndexWriter catches this and will retry the deletion later). This makes Unix behave like Windows, i.e. still-open files cannot be deleted. I think that should fix this race condition, because the NRT reader must first open all the files it uses ... Track file reference kept by readers that are opened through the writer --- Key: LUCENE-4707 URL: https://issues.apache.org/jira/browse/LUCENE-4707 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.0 Environment: Mac OS X 10.8.2 and Linux 2.6.32 Reporter: Jessica Cheng We ran into a bug where files (mostly CFS) that are still referred to by our NRT reader/searcher are deleted by IndexFileDeleter. As far as I can see from the verbose logging and from reading the code, the problem is the creation and merging of these CFS files between hard commits. The files referred to by hard commits are incRef'ed at commit checkpoints, so these files won't be deleted until they are decRef'ed when the commit is deleted according to the DeletionPolicy (good). However, intermediate files that are created and merged between the hard commits only have refs through the regular checkpoints, so as soon as a new checkpoint no longer includes those files, they are immediately deleted by the deleter.
See the abridged verbose log lines that illustrate this behavior:

IW 11 [Mon Jan 21 17:30:35 PST 2013; commitScheduler]: create compound file _8.cfs
IFD 7 [Mon Jan 21 17:23:41 PST 2013; commitScheduler]: now checkpoint _0(4.0.0.2):C3 _1(4.0.0.2):C7 _2(4.0.0.2):C16 _3(4.0.0.2):C21 _4(4.0.0.2):C5 _5(4.0.0.2):C5 _6(4.0.0.2):C5 _7(4.0.0.2):C7 _8(4.0.0.2):c6 [9 segments ; isCommit = false]
IFD 7 [Mon Jan 21 17:23:41 PST 2013; commitScheduler]: IncRef _8.cfs: pre-incr count is 0
IFD 7 [Mon Jan 21 17:23:42 PST 2013; commitScheduler]: now checkpoint _0(4.0.0.2):C3 _1(4.0.0.2):C7 _2(4.0.0.2):C16 _3(4.0.0.2):C21 _4(4.0.0.2):C5 _5(4.0.0.2):C5 _6(4.0.0.2):C5 _7(4.0.0.2):C7 _8(4.0.0.2):c6 _9(4.0.0.2):c6 [10 segments ; isCommit = false]
IFD 7 [Mon Jan 21 17:23:42 PST 2013; commitScheduler]: IncRef _8.cfs: pre-incr count is 1
IFD 7 [Mon Jan 21 17:23:42 PST 2013; commitScheduler]: DecRef _8.cfs: pre-decr count is 2
IFD 7 [Mon Jan 21 17:23:42 PST 2013; Lucene Merge Thread #0]: now checkpoint _b(4.0.0.2):C81 [1 segments ; isCommit = false]
IFD 7 [Mon Jan 21 17:23:42 PST 2013; Lucene Merge Thread #0]: DecRef _8.cfs: pre-decr count is 1
IFD 7 [Mon Jan 21 17:23:42 PST 2013; Lucene Merge Thread #0]: delete _8.cfs

With this behavior, it seems no matter how frequently we refresh the reader (unless we do it at every read), we'd run into the race where the reader still holds a reference to the file that's just been deleted by the deleter. My proposal is to count the file reference handed out to the NRT reader/searcher when writer.getReader(boolean) is called and decRef the files only when the said reader is closed. Please take a look and evaluate if my observations are correct and if the proposal makes sense. Thanks! -- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
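The wrapper behavior suggested above (incRef each file on openInput, refuse deletion while the file is still open) can be sketched with plain collections. This is an illustrative stand-in, not Lucene's Directory API; the class and method names mirror the discussion but are assumptions of the sketch:

```java
import java.io.IOException;
import java.util.*;

// Sketch: ref-count files as they are opened, and make deleteFile throw
// while any reader still holds the file, mimicking Windows semantics on
// Unix. IndexWriter would catch the IOException and retry deletion later.
public class RefCountingFiles {
    private final Map<String, Integer> refs = new HashMap<>();
    private final Set<String> deleted = new HashSet<>();

    public void openInput(String name) {      // incRef on every open
        refs.merge(name, 1, Integer::sum);
    }

    public void close(String name) {          // decRef on close
        refs.merge(name, -1, Integer::sum);
    }

    public void deleteFile(String name) throws IOException {
        if (refs.getOrDefault(name, 0) > 0)
            throw new IOException(name + " is still open; retry later");
        deleted.add(name);
    }

    public boolean isDeleted(String name) { return deleted.contains(name); }
}
```

With this in place, the `_8.cfs` race in the log above cannot occur: the deleter's attempt fails until the NRT reader releases the file.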
[jira] [Commented] (SOLR-4539) Consistently failing seed for SyncSliceTest
[ https://issues.apache.org/jira/browse/SOLR-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603793#comment-13603793 ] Mark Miller commented on SOLR-4539: --- This seems to be a problem with ephemeral dir factories - I'm guessing the log clearing is still not quite working right. Consistently failing seed for SyncSliceTest --- Key: SOLR-4539 URL: https://issues.apache.org/jira/browse/SOLR-4539 Project: Solr Issue Type: Bug Reporter: Shawn Heisey Assignee: Mark Miller Fix For: 4.3, 5.0 http://mail-archives.us.apache.org/mod_mbox/lucene-dev/201303.mbox/%3c513933dd.5000...@elyograg.org%3E

{quote}
[junit4:junit4] 2 NOTE: reproduce with: ant test -Dtestcase=SyncSliceTest -Dtests.method=testDistribSearch -Dtests.seed=1D1206F80A77FE6F -Dtests.nightly=true -Dtests.weekly=true -Dtests.slow=true -Dtests.locale=ar_LY -Dtests.timezone=BET -Dtests.file.encoding=UTF-8
[junit4:junit4] FAILURE 109s | SyncSliceTest.testDistribSearch
[junit4:junit4] Throwable #1: java.lang.AssertionError: shard1 is not consistent. Got 305 from http://127.0.0.1:44083/collection1 lastClient and got 5 from http://127.0.0.1:43445/collection1
[junit4:junit4]   at __randomizedtesting.SeedInfo.seed([1D1206F80A77FE6F:9CF488E07D289E53]:0)
[junit4:junit4]   at org.junit.Assert.fail(Assert.java:93)
[junit4:junit4]   at org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:963)
[junit4:junit4]   at org.apache.solr.cloud.SyncSliceTest.doTest(SyncSliceTest.java:234)
[junit4:junit4]   at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:806)
{quote}

(issue filed by Hoss on Shawn's behalf so we don't lose track of it) -- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4586) Remove maxBooleanClauses from Solr
[ https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603795#comment-13603795 ] Mark Miller commented on SOLR-4586: --- I mean this: bq. I tried the patched Solr out after removed maxBooleanClauses from my config, and a 1500-clause query fails, saying too many clauses. Remove maxBooleanClauses from Solr -- Key: SOLR-4586 URL: https://issues.apache.org/jira/browse/SOLR-4586 Project: Solr Issue Type: Improvement Affects Versions: 4.2 Reporter: Shawn Heisey Attachments: SOLR-4586.patch In the #solr IRC channel, I mentioned the maxBooleanClauses limitation to someone asking a question about queries. Mark Miller told me that maxBooleanClauses no longer applies, that the limitation was removed from Lucene sometime in the 3.x series. The config still shows up in the example even in the just-released 4.2. Checking through the source code, I found that the config option is parsed and the value stored in objects, but does not actually seem to be used by anything. I removed every trace of it that I could find, and all tests still pass. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
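The behavior being debated (a 1500-clause query still fails after the Solr config is removed) is consistent with a static clause-count guard on the Lucene side. A simplified stand-in, modeled loosely on BooleanQuery's maxClauseCount (default 1024) but not Lucene's actual code:

```java
import java.util.*;

// Sketch of a static clause-count guard: adding a clause beyond the
// process-wide limit throws, regardless of any Solr-level config.
public class BooleanQuerySketch {
    private static int maxClauseCount = 1024; // process-wide default
    private final List<String> clauses = new ArrayList<>();

    public static class TooManyClauses extends RuntimeException {}

    public static void setMaxClauseCount(int n) { maxClauseCount = n; }

    public void add(String clause) {
        if (clauses.size() >= maxClauseCount) throw new TooManyClauses();
        clauses.add(clause);
    }
}
```

The point of the sketch: because the limit is static and enforced at clause-add time, merely deleting the config element that *sets* it does not remove the check itself.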
[jira] [Resolved] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues
[ https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-4795. Resolution: Fixed Fix Version/s: 4.3 5.0 Add FacetsCollector based on SortedSetDocValues --- Key: LUCENE-4795 URL: https://issues.apache.org/jira/browse/LUCENE-4795 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, 4.3 Attachments: LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, pleaseBenchmarkMe.patch Recently (LUCENE-4765) we added multi-valued DocValues field (SortedSetDocValuesField), and this can be used for faceting in Solr (SOLR-4490). I think we should also add support in the facet module? It'd be an option with different tradeoffs. Eg, it wouldn't require the taxonomy index, since the main index handles label/ord resolving. There are at least two possible approaches: * On every reopen, build the seg - global ord map, and then on every collect, get the seg ord, map it to the global ord space, and increment counts. This adds cost during reopen in proportion to number of unique terms ... * On every collect, increment counts based on the seg ords, and then do a merge in the end just like distributed faceting does. The first approach is much easier so I built a quick prototype using that. The prototype does the counting, but it does NOT do the top K facets gathering in the end, and it doesn't know parent/child ord relationships, so there's tons more to do before this is real. I also was unsure how to properly integrate it since the existing classes seem to expect that you use a taxonomy index to resolve ords. I ran a quick performance test. 
base = trunk except I disabled the compute top-K in FacetsAccumulator to make the comparison fair; comp = using the prototype collector in the patch:

{noformat}
Task                 QPS base  StdDev   QPS comp  StdDev   Pct diff
OrHighLow               18.79  (2.5%)     14.36  (3.3%)   -23.6% ( -28% - -18%)
HighTerm                21.58  (2.4%)     16.53  (3.7%)   -23.4% ( -28% - -17%)
OrHighMed               18.20  (2.5%)     13.99  (3.3%)   -23.2% ( -28% - -17%)
Prefix3                 14.37  (1.5%)     11.62  (3.5%)   -19.1% ( -23% - -14%)
LowTerm                130.80  (1.6%)    106.95  (2.4%)   -18.2% ( -21% - -14%)
OrHighHigh               9.60  (2.6%)      7.88  (3.5%)   -17.9% ( -23% - -12%)
AndHighHigh             24.61  (0.7%)     20.74  (1.9%)   -15.7% ( -18% - -13%)
Fuzzy1                  49.40  (2.5%)     43.48  (1.9%)   -12.0% ( -15% -  -7%)
MedSloppyPhrase         27.06  (1.6%)     23.95  (2.3%)   -11.5% ( -15% -  -7%)
MedTerm                 51.43  (2.0%)     46.21  (2.7%)   -10.2% ( -14% -  -5%)
IntNRQ                   4.02  (1.6%)      3.63  (4.0%)    -9.7% ( -15% -  -4%)
Wildcard                29.14  (1.5%)     26.46  (2.5%)    -9.2% ( -13% -  -5%)
HighSloppyPhrase         0.92  (4.5%)      0.87  (5.8%)    -5.4% ( -15% -   5%)
MedSpanNear             29.51  (2.5%)     27.94  (2.2%)    -5.3% (  -9% -   0%)
HighSpanNear             3.55  (2.4%)      3.38  (2.0%)    -4.9% (  -9% -   0%)
AndHighMed             108.34  (0.9%)    104.55  (1.1%)    -3.5% (  -5% -  -1%)
LowSloppyPhrase         20.50  (2.0%)     20.09  (4.2%)    -2.0% (  -8% -   4%)
LowPhrase               21.60  (6.0%)     21.26  (5.1%)    -1.6% ( -11% -  10%)
Fuzzy2                  53.16  (3.9%)     52.40  (2.7%)    -1.4% (  -7% -   5%)
LowSpanNear              8.42  (3.2%)      8.45  (3.0%)     0.3% (  -5% -   6%)
Respell                 45.17  (4.3%)     45.38  (4.4%)     0.5% (  -7% -   9%)
MedPhrase              113.93  (5.8%)    115.02  (4.9%)     1.0% (  -9% -  12%)
AndHighLow             596.42  (2.5%)    617.12  (2.8%)     3.5% (  -1% -   8%)
HighPhrase              17.30 (10.5%)     18.36  (9.1%)     6.2% ( -12% -  28%)
{noformat}

I'm impressed that this approach is only ~24% slower in the worst case! I think this means it's a good option to make available? Yes it has downsides (NRT reopen more costly, small added RAM usage, slightly slower faceting),
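The first approach described above (on every reopen, build a segment-ord to global-ord map; on every collect, map the segment ordinal into the global space and increment a count) can be sketched in plain Java. The data model here is a deliberate simplification and an assumption of the sketch: each segment is just a sorted term dictionary, and each hit is a (segment, segment-local ordinal) pair:

```java
import java.util.*;

// Sketch of global-ordinal faceting: the expensive part happens at reopen
// (building per-segment ord -> global ord maps over the union of terms);
// collection is then a cheap array lookup and increment per hit.
public class GlobalOrdCounts {
    // Reopen-time work: union all segment terms into a sorted global list,
    // then map each segment-local ordinal to its global ordinal.
    static int[][] buildSegToGlobal(List<String[]> segTerms, List<String> globalOut) {
        TreeSet<String> union = new TreeSet<>();
        for (String[] terms : segTerms) union.addAll(Arrays.asList(terms));
        List<String> global = new ArrayList<>(union);
        globalOut.addAll(global);
        int[][] maps = new int[segTerms.size()][];
        for (int s = 0; s < segTerms.size(); s++) {
            String[] terms = segTerms.get(s);
            int[] map = new int[terms.length];
            for (int i = 0; i < terms.length; i++)
                map[i] = Collections.binarySearch(global, terms[i]);
            maps[s] = map;
        }
        return maps;
    }

    // Collect-time work: each hit is {segment, segment-local ord};
    // remap to the global ord space and count.
    static int[] count(int[][] segToGlobal, int numGlobal, int[][] hits) {
        int[] counts = new int[numGlobal];
        for (int[] hit : hits) counts[segToGlobal[hit[0]][hit[1]]]++;
        return counts;
    }
}
```

This mirrors the tradeoff in the comment: reopen cost grows with the number of unique terms, which is why the second approach (merge per-segment counts at the end, as distributed faceting does) avoids the map entirely.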
[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues
[ https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603796#comment-13603796 ] Commit Tag Bot commented on LUCENE-4795: [trunk commit] Michael McCandless http://svn.apache.org/viewvc?view=revisionrevision=1457092 LUCENE-4795: add new facet method to facet from SortedSetDocValues without using taxonomy index Add FacetsCollector based on SortedSetDocValues --- Key: LUCENE-4795 URL: https://issues.apache.org/jira/browse/LUCENE-4795 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, 4.3 Attachments: LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, pleaseBenchmarkMe.patch Recently (LUCENE-4765) we added multi-valued DocValues field (SortedSetDocValuesField), and this can be used for faceting in Solr (SOLR-4490). I think we should also add support in the facet module? It'd be an option with different tradeoffs. Eg, it wouldn't require the taxonomy index, since the main index handles label/ord resolving. There are at least two possible approaches: * On every reopen, build the seg - global ord map, and then on every collect, get the seg ord, map it to the global ord space, and increment counts. This adds cost during reopen in proportion to number of unique terms ... * On every collect, increment counts based on the seg ords, and then do a merge in the end just like distributed faceting does. The first approach is much easier so I built a quick prototype using that. The prototype does the counting, but it does NOT do the top K facets gathering in the end, and it doesn't know parent/child ord relationships, so there's tons more to do before this is real. I also was unsure how to properly integrate it since the existing classes seem to expect that you use a taxonomy index to resolve ords. I ran a quick performance test. 
base = trunk except I disabled the compute top-K in FacetsAccumulator to make the comparison fair; comp = using the prototype collector in the patch:

{noformat}
Task                 QPS base  StdDev   QPS comp  StdDev   Pct diff
OrHighLow               18.79  (2.5%)     14.36  (3.3%)   -23.6% ( -28% - -18%)
HighTerm                21.58  (2.4%)     16.53  (3.7%)   -23.4% ( -28% - -17%)
OrHighMed               18.20  (2.5%)     13.99  (3.3%)   -23.2% ( -28% - -17%)
Prefix3                 14.37  (1.5%)     11.62  (3.5%)   -19.1% ( -23% - -14%)
LowTerm                130.80  (1.6%)    106.95  (2.4%)   -18.2% ( -21% - -14%)
OrHighHigh               9.60  (2.6%)      7.88  (3.5%)   -17.9% ( -23% - -12%)
AndHighHigh             24.61  (0.7%)     20.74  (1.9%)   -15.7% ( -18% - -13%)
Fuzzy1                  49.40  (2.5%)     43.48  (1.9%)   -12.0% ( -15% -  -7%)
MedSloppyPhrase         27.06  (1.6%)     23.95  (2.3%)   -11.5% ( -15% -  -7%)
MedTerm                 51.43  (2.0%)     46.21  (2.7%)   -10.2% ( -14% -  -5%)
IntNRQ                   4.02  (1.6%)      3.63  (4.0%)    -9.7% ( -15% -  -4%)
Wildcard                29.14  (1.5%)     26.46  (2.5%)    -9.2% ( -13% -  -5%)
HighSloppyPhrase         0.92  (4.5%)      0.87  (5.8%)    -5.4% ( -15% -   5%)
MedSpanNear             29.51  (2.5%)     27.94  (2.2%)    -5.3% (  -9% -   0%)
HighSpanNear             3.55  (2.4%)      3.38  (2.0%)    -4.9% (  -9% -   0%)
AndHighMed             108.34  (0.9%)    104.55  (1.1%)    -3.5% (  -5% -  -1%)
LowSloppyPhrase         20.50  (2.0%)     20.09  (4.2%)    -2.0% (  -8% -   4%)
LowPhrase               21.60  (6.0%)     21.26  (5.1%)    -1.6% ( -11% -  10%)
Fuzzy2                  53.16  (3.9%)     52.40  (2.7%)    -1.4% (  -7% -   5%)
LowSpanNear              8.42  (3.2%)      8.45  (3.0%)     0.3% (  -5% -   6%)
Respell                 45.17  (4.3%)     45.38  (4.4%)     0.5% (  -7% -   9%)
MedPhrase              113.93  (5.8%)    115.02  (4.9%)     1.0% (  -9% -  12%)
AndHighLow             596.42  (2.5%)    617.12  (2.8%)     3.5% (  -1% -   8%)
HighPhrase              17.30 (10.5%)     18.36  (9.1%)     6.2% ( -12% -  28%)
{noformat}

I'm impressed that this approach is only ~24% slower