Build failed in Hudson: Solr-trunk #996
See http://hudson.zones.apache.org/hudson/job/Solr-trunk/996/

[...truncated 2229 lines...]
    [junit] Running org.apache.solr.client.solrj.SolrExceptionTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.516 sec
    [junit] Running org.apache.solr.client.solrj.SolrQueryTest
    [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.395 sec
    [junit] Running org.apache.solr.client.solrj.TestBatchUpdate
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 21.814 sec
    [junit] Running org.apache.solr.client.solrj.TestLBHttpSolrServer
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 15.991 sec
    [junit] Running org.apache.solr.client.solrj.beans.TestDocumentObjectBinder
    [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.694 sec
    [junit] Running org.apache.solr.client.solrj.embedded.JettyWebappTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 9.206 sec
    [junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeBinaryJettyTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 10.656 sec
    [junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeEmbeddedTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 4.085 sec
    [junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeJettyTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 11.553 sec
    [junit] Running org.apache.solr.client.solrj.embedded.MergeIndexesEmbeddedTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 7.447 sec
    [junit] Running org.apache.solr.client.solrj.embedded.MultiCoreEmbeddedTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 4.03 sec
    [junit] Running org.apache.solr.client.solrj.embedded.MultiCoreExampleJettyTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 6.146 sec
    [junit] Running org.apache.solr.client.solrj.embedded.SolrExampleEmbeddedTest
    [junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 21.038 sec
    [junit] Running org.apache.solr.client.solrj.embedded.SolrExampleJettyTest
    [junit] Tests run: 10, Failures: 0, Errors: 0, Time elapsed: 38.007 sec
    [junit] Running org.apache.solr.client.solrj.embedded.SolrExampleStreamingTest
    [junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 35.399 sec
    [junit] Running org.apache.solr.client.solrj.embedded.TestSolrProperties
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 8.237 sec
    [junit] Running org.apache.solr.client.solrj.request.TestUpdateRequestCodec
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.417 sec
    [junit] Running org.apache.solr.client.solrj.response.AnlysisResponseBaseTest
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.484 sec
    [junit] Running org.apache.solr.client.solrj.response.DocumentAnalysisResponseTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.392 sec
    [junit] Running org.apache.solr.client.solrj.response.FieldAnalysisResponseTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.358 sec
    [junit] Running org.apache.solr.client.solrj.response.QueryResponseTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.58 sec
    [junit] Running org.apache.solr.client.solrj.response.TestSpellCheckResponse
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 8.16 sec
    [junit] Running org.apache.solr.client.solrj.util.ClientUtilsTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.506 sec
    [junit] Running org.apache.solr.common.SolrDocumentTest
    [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.458 sec
    [junit] Running org.apache.solr.common.params.ModifiableSolrParamsTest
    [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.481 sec
    [junit] Running org.apache.solr.common.params.SolrParamTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.447 sec
    [junit] Running org.apache.solr.common.util.ContentStreamTest
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.554 sec
    [junit] Running org.apache.solr.common.util.DOMUtilTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.467 sec
    [junit] Running org.apache.solr.common.util.FileUtilsTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.61 sec
    [junit] Running org.apache.solr.common.util.IteratorChainTest
    [junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 0.405 sec
    [junit] Running org.apache.solr.common.util.NamedListTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.597 sec
    [junit] Running org.apache.solr.common.util.TestFastInputStream
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.392 sec
    [junit] Running org.apache.solr.common.util.TestHash
    [junit] Tests run: 2, Failures: 0, Errors: 0,
Re: minor nit....
I've just updated to the latest 1.4 head and found that my log is only showing lines like:

INFO: {add=[x0, x1, x2, x3, x4, x5, x6, x7, ... (8 added)]} 0 13627

Debugging LogUpdateProcessorFactory, adds.size() is always 8, while numAdds is a number in the hundreds, depending on the added list of documents. I suspect that numAdds is the number I'm looking for in my logs. Could we change LogUpdateProcessorFactory to:

@@ -162,10 +162,10 @@
     // if id lists were truncated, show how many more there were
     if (adds != null && numAdds > maxNumToLog) {
-      adds.add("... (" + adds.size() + " added)");
+      adds.add("... (" + numAdds + " added)");
     }
     if (deletes != null && numDeletes > maxNumToLog) {
-      deletes.add("... (" + deletes.size() + " removed)");
+      deletes.add("... (" + numDeletes + " removed)");
     }
     long elapsed = rsp.getEndTime() - req.getStartTime();
     log.info("" + toLog + " 0 " + (elapsed));

Thanks,
Thijs

On 22-10-2009 5:59, Yonik Seeley wrote:
> On Wed, Oct 21, 2009 at 11:44 PM, Ryan McKinley <ryan...@gmail.com> wrote:
>> I'm looking through a bunch of logs that have:
>> UpdateRequestProcessor - {add=[aa, bb, cc, dd, ee, ff, gg, hh, ... (142 more)]}
>> Would it be more reasonable to say: "150 total" rather than make you count the previous 8?
>
> Yep, I did reconsider that at some point... just never got to the threshold of doing something about it :-)
>
> -Yonik
> http://www.lucidimagination.com
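The mismatch Thijs describes comes from truncation: the id list only ever holds the first maxNumToLog entries, while the running counter keeps counting. A minimal, self-contained sketch of that behavior (hypothetical class and method names, not the real LogUpdateProcessorFactory):

```java
import java.util.ArrayList;
import java.util.List;

public class TruncationSketch {
    static final int MAX_NUM_TO_LOG = 8;

    // Mimics the buggy report: the id list is capped, so its size is
    // at most MAX_NUM_TO_LOG no matter how many documents were added.
    static int loggedCountBuggy(int numAdds) {
        List<String> adds = new ArrayList<>();
        for (int i = 0; i < numAdds && adds.size() < MAX_NUM_TO_LOG; i++) {
            adds.add("x" + i);  // ids beyond the cap are never stored
        }
        return adds.size();     // what the log reported before the fix
    }

    // Mimics the fixed report: the running counter keeps the true total,
    // which is what the proposed patch logs instead of adds.size().
    static int loggedCountFixed(int numAdds) {
        return numAdds;
    }

    public static void main(String[] args) {
        System.out.println(loggedCountBuggy(142));  // capped at 8
        System.out.println(loggedCountFixed(142));  // true total: 142
    }
}
```

For fewer than maxNumToLog documents the two reports agree, which is why the bug only shows up on larger batches.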
[jira] Created: (SOLR-1627) Variableresolver should be fetched just in time
Variableresolver should be fetched just in time
-----------------------------------------------

                 Key: SOLR-1627
                 URL: https://issues.apache.org/jira/browse/SOLR-1627
             Project: Solr
          Issue Type: Improvement
          Components: contrib - DataImportHandler
            Reporter: Noble Paul
            Priority: Minor
             Fix For: 1.5


The VariableResolver instance may vary from time to time after SOLR-1352, so get it just in time. For most cases, use Context#resolve() and Context#replaceTokens().

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
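The point of the issue is that a component should not hold on to the VariableResolver it saw up front, because the resolver can be replaced during an import (SOLR-1352); it should be fetched from the Context at each use. A simplified sketch of the difference, using stand-in interfaces rather than the real DataImportHandler API (only the Context#resolve()/replaceTokens() names come from the issue text):

```java
public class JitResolverSketch {
    // Stand-in for DIH's VariableResolver.
    interface VariableResolver { String resolve(String name); }

    // Stand-in for DIH's Context; the real one exposes resolve()/replaceTokens().
    static class Context {
        private VariableResolver current;
        void setResolver(VariableResolver r) { current = r; }
        VariableResolver getResolver()       { return current; }
        // Just-in-time: the resolver is looked up on every call.
        String resolve(String name)          { return current.resolve(name); }
    }

    // Returns {stale value from a cached resolver, fresh value via the context}.
    static String[] demo() {
        Context ctx = new Context();
        ctx.setResolver(n -> "old-" + n);
        VariableResolver cached = ctx.getResolver(); // fetched once, up front
        ctx.setResolver(n -> "new-" + n);            // resolver replaced mid-import
        return new String[] { cached.resolve("x"), ctx.resolve("x") };
    }

    public static void main(String[] args) {
        String[] r = demo();
        System.out.println(r[0] + " vs " + r[1]);  // old-x vs new-x
    }
}
```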
[jira] Updated: (SOLR-1627) Variableresolver should be fetched just in time
     [ https://issues.apache.org/jira/browse/SOLR-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul updated SOLR-1627:
-----------------------------
    Attachment: SOLR-1627.patch

> Variableresolver should be fetched just in time
> -----------------------------------------------
>
>                 Key: SOLR-1627
>                 URL: https://issues.apache.org/jira/browse/SOLR-1627
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Minor
>             Fix For: 1.5
>         Attachments: SOLR-1627.patch
>
> The VariableResolver instance may vary from time to time after SOLR-1352, so get it just in time. For most cases, use Context#resolve() and Context#replaceTokens().

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Assigned: (SOLR-1627) Variableresolver should be fetched just in time
     [ https://issues.apache.org/jira/browse/SOLR-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul reassigned SOLR-1627:
--------------------------------
    Assignee: Noble Paul

> Variableresolver should be fetched just in time
> -----------------------------------------------
>
>                 Key: SOLR-1627
>                 URL: https://issues.apache.org/jira/browse/SOLR-1627
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Minor
>             Fix For: 1.5
>         Attachments: SOLR-1627.patch
>
> The VariableResolver instance may vary from time to time after SOLR-1352, so get it just in time. For most cases, use Context#resolve() and Context#replaceTokens().

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Re: [jira] Issue Comment Edited: (SOLR-236) Field collapsing
Hey there,

I have been testing the last patch, and I think either I am missing something or the way the collapsed documents are shown when collapsing is adjacent can sometimes be confusing. I am using the patch replacing queryComponent with collapseComponent (not using both at the same time):

<searchComponent name="query" class="org.apache.solr.handler.component.CollapseComponent"/>

What I have noticed is: imagine you get these results in the search:

doc1: id:001 collapseField:ccc
doc2: id:002 collapseField:aaa
doc3: id:003 collapseField:ccc
doc4: id:004 collapseField:bbb

And in the collapse_counts you get:

<int name="collapseCount">1</int>
<str name="fieldValue">ccc</str>
<result name="collapsedDocs" numFound="1" start="0">
  <doc>
    <long name="id">008</long>
    <str name="content">aaa aaa</str>
    <str name="col">ccc</str>
  </doc>
</result>

Now, how can I know the head document of doc 008? Both 001 and 003 could be it... wouldn't it make sense to connect the uniqueField with the collapsed documents in some way? Adding something to collapse_counts like:

<int name="collapseCount">1</int>
<str name="fieldValue">ccc</str>
<str name="uniqueFieldId">003</str>

I currently have hacked FieldValueCountCollapseCollectorFactory to return:

<str name="fieldValue">ccc#003</str>

but this response looks dirty... As I said, maybe I am misunderstanding something and this can already be known in some way. In that case, can someone tell me how?

Thanks in advance

JIRA j...@apache.org wrote:
> [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783484#action_12783484 ]
>
> Martijn van Groningen edited comment on SOLR-236 at 11/29/09 9:56 PM:
> ----------------------------------------------------------------------
>
> I have attached a new patch that has the following changes:
> # Added caching for the field collapse functionality. Check the [solr wiki|http://wiki.apache.org/solr/FieldCollapsing] for how to configure field-collapsing with caching.
> # Removed the collapse.max parameter (collapse.threshold must be used instead). It was deprecated for a long time.
>
> was (Author: martijn):
> I have attached a new patch that has the following changes:
> # Added caching for the field collapse functionality. Check the [solr wiki|http://wiki.apache.org/solr/FieldCollapsing] for how to configure the field-collapsing with caching.
> # Removed the collapse.max parameter (collapse.threshold must be used instead). It was deprecated for a long time.
>
> Field collapsing
> ----------------
>
>                 Key: SOLR-236
>                 URL: https://issues.apache.org/jira/browse/SOLR-236
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Emmanuel Keller
>             Fix For: 1.5
>         Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch
>
> This patch includes a new feature called "Field collapsing". It is used in order to collapse a group of results with a similar value for a given field to a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site are collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also "Duplicate detection":
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
>
> The implementation adds 3 new query parameters (SolrParams):
> - "collapse.field" to choose the field used to group results
> - "collapse.type": "normal" (default value) or "adjacent"
> - "collapse.max" to select how many continuous results are allowed before collapsing
>
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
>
> Two patches:
> - field_collapsing.patch for the current development version
> - field_collapsing_1.1.0.patch for Solr-1.1.0
>
> P.S.: Feedback and misspelling corrections are welcome ;-)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
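The ambiguity Marc raises above comes from the difference between the two collapse.type modes: with adjacent collapsing, the same field value can head several groups, so a group labeled only by field value does not identify its head. A rough, self-contained sketch of the grouping rule (my own simplification, not the patch's actual code), where each string is a document's collapse-field value and the returned list holds the group heads:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CollapseSketch {
    // adjacent == true : only consecutive hits with the same value collapse.
    // adjacent == false: all hits with the same value collapse into the first.
    static List<String> collapse(List<String> values, boolean adjacent) {
        List<String> heads = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        String prev = null;
        for (String v : values) {
            // A hit survives as a group head if it differs from its neighbor
            // (adjacent mode) or if its value has never been seen (normal mode).
            boolean keep = adjacent ? !v.equals(prev) : seen.add(v);
            if (keep) heads.add(v);
            prev = v;
        }
        return heads;
    }

    public static void main(String[] args) {
        List<String> hits = List.of("ccc", "aaa", "ccc", "bbb");
        System.out.println(collapse(hits, true));   // adjacent: all four survive
        System.out.println(collapse(hits, false));  // normal: second "ccc" collapses
    }
}
```

With Marc's example (ccc, aaa, ccc, bbb), adjacent collapsing leaves both "ccc" documents as heads, which is exactly why a collapsed document reported only under the value "ccc" is ambiguous.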
[jira] Updated: (SOLR-1627) Variableresolver should be fetched just in time
     [ https://issues.apache.org/jira/browse/SOLR-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul updated SOLR-1627:
-----------------------------
    Attachment: SOLR-1627.patch

> Variableresolver should be fetched just in time
> -----------------------------------------------
>
>                 Key: SOLR-1627
>                 URL: https://issues.apache.org/jira/browse/SOLR-1627
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Minor
>             Fix For: 1.5
>         Attachments: SOLR-1627.patch, SOLR-1627.patch
>
> The VariableResolver instance may vary from time to time after SOLR-1352, so get it just in time. For most cases, use Context#resolve() and Context#replaceTokens().

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Re: [jira] Issue Comment Edited: (SOLR-236) Field collapsing
Hi Marc,

I'm not sure if I follow you completely, but the example you gave is not complete; I'm missing a few tags in it. Let's assume the following response, which the latest patches produce:

<lst name="collapse_counts">
  <str name="field">cat</str>
  <lst name="results">
    <lst name="009">
      <str name="fieldValue">hard</str>
      <int name="collapseCount">1</int>
      <result name="collapsedDocs" numFound="1" start="0">
        <doc>
          <long name="id">008</long>
          <str name="content">aaa aaa</str>
          <str name="col">ccc</str>
        </doc>
      </result>
    </lst>
    ...
  </lst>
</lst>

The results list contains collapse groups. The names of the child elements are the collapse head ids. Everything that falls under a collapse head belongs to that collapse group, so adding the head document id to the field value is unnecessary. In the above example, the document with id 009 is the head of the document with id 008, and the document with id 009 should be displayed in the search result. From what you have said, it seems that you properly configured the patch.

Martijn

2009/12/7 Marc Sturlese <marc.sturl...@gmail.com>:
> Hey there,
> I have been testing the last patch, and I think either I am missing something or the way the collapsed documents are shown when collapsing is adjacent can sometimes be confusing. I am using the patch replacing queryComponent with collapseComponent (not using both at the same time):
>
> <searchComponent name="query" class="org.apache.solr.handler.component.CollapseComponent"/>
>
> What I have noticed is: imagine you get these results in the search:
>
> doc1: id:001 collapseField:ccc
> doc2: id:002 collapseField:aaa
> doc3: id:003 collapseField:ccc
> doc4: id:004 collapseField:bbb
>
> And in the collapse_counts you get:
>
> <int name="collapseCount">1</int>
> <str name="fieldValue">ccc</str>
> <result name="collapsedDocs" numFound="1" start="0">
>   <doc>
>     <long name="id">008</long>
>     <str name="content">aaa aaa</str>
>     <str name="col">ccc</str>
>   </doc>
> </result>
>
> Now, how can I know the head document of doc 008? Both 001 and 003 could be it... wouldn't it make sense to connect the uniqueField with the collapsed documents in some way? Adding something to collapse_counts like:
>
> <int name="collapseCount">1</int>
> <str name="fieldValue">ccc</str>
> <str name="uniqueFieldId">003</str>
>
> I currently have hacked FieldValueCountCollapseCollectorFactory to return:
>
> <str name="fieldValue">ccc#003</str>
>
> but this response looks dirty... As I said, maybe I am misunderstanding something and this can already be known in some way. In that case, can someone tell me how?
> Thanks in advance
>
> JIRA j...@apache.org wrote:
>> [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783484#action_12783484 ]
>>
>> Martijn van Groningen edited comment on SOLR-236 at 11/29/09 9:56 PM:
>> ----------------------------------------------------------------------
>>
>> I have attached a new patch that has the following changes:
>> # Added caching for the field collapse functionality. Check the [solr wiki|http://wiki.apache.org/solr/FieldCollapsing] for how to configure field-collapsing with caching.
>> # Removed the collapse.max parameter (collapse.threshold must be used instead). It was deprecated for a long time.
>>
>> was (Author: martijn):
>> I have attached a new patch that has the following changes:
>> # Added caching for the field collapse functionality. Check the [solr wiki|http://wiki.apache.org/solr/FieldCollapsing] for how to configure the field-collapsing with caching.
>> # Removed the collapse.max parameter (collapse.threshold must be used instead). It was deprecated for a long time.
>>
>> Field collapsing
>> ----------------
>>
>>                 Key: SOLR-236
>>                 URL: https://issues.apache.org/jira/browse/SOLR-236
>>             Project: Solr
>>          Issue Type: New Feature
>>          Components: search
>>    Affects Versions: 1.3
>>            Reporter: Emmanuel Keller
>>             Fix For: 1.5
>>         Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch,
[jira] Created: (SOLR-1628) log contains incorrect number of adds and deletes
log contains incorrect number of adds and deletes
-------------------------------------------------

                 Key: SOLR-1628
                 URL: https://issues.apache.org/jira/browse/SOLR-1628
             Project: Solr
          Issue Type: Bug
    Affects Versions: 1.4
            Reporter: Yonik Seeley
             Fix For: 1.5


LogUpdateProcessorFactory logs the wrong number of deletes/adds if more than 8.
http://search.lucidimagination.com/search/document/f75c6a5a58e205a4/minor_nit

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Re: [jira] Issue Comment Edited: (SOLR-236) Field collapsing
Yes, it should look similar to that. What is the exact request you send to Solr? Also, to check whether the patch works correctly, can you run: ant clean test? There are a number of tests that test the field collapse functionality.

Martijn

2009/12/7 Marc Sturlese <marc.sturl...@gmail.com>:
>> <lst name="collapse_counts">
>>   <str name="field">cat</str>
>>   <lst name="results">
>>     <lst name="009">
>>       <str name="fieldValue">hard</str>
>>       <int name="collapseCount">1</int>
>>       <result name="collapsedDocs" numFound="1" start="0">
>>         <doc>
>>           <long name="id">008</long>
>>           <str name="content">aaa aaa</str>
>>           <str name="col">ccc</str>
>>         </doc>
>>       </result>
>>     </lst>
>>     ...
>>   </lst>
>> </lst>
>
> I see, looks like I am applying the patch wrongly somehow. This is the complete collapse_counts response I am getting:
>
> <lst name="collapse_counts">
>   <str name="field">col</str>
>   <lst name="results">
>     <lst>
>       <int name="collapseCount">1</int>
>       <int name="collapseCount">1</int>
>       <int name="collapseCount">1</int>
>       <str name="fieldValue">bbb</str>
>       <str name="fieldValue">ccc</str>
>       <str name="fieldValue">xxx</str>
>       <result name="collapsedDocs" numFound="1" start="0">
>         <doc>
>           <long name="id">2</long>
>           <str name="content">aaa aaa</str>
>           <str name="col">bbb</str>
>         </doc>
>       </result>
>       <result name="collapsedDocs" numFound="1" start="0">
>         <doc>
>           <long name="id">8</long>
>           <str name="content">aaa aaa aaa sd</str>
>           <str name="col">ccc</str>
>         </doc>
>       </result>
>       <result name="collapsedDocs" numFound="4" start="0">
>         <doc>
>           <long name="id">12</long>
>           <str name="content">aaa aaa aaa v</str>
>           <str name="col">xxx</str>
>         </doc>
>       </result>
>     </lst>
>   </lst>
> </lst>
>
> As you can see, I am getting a lst tag with no name. As I understood what you told me, I should be getting as many lst tags as collapsed groups, and the name attribute of each lst should be the unique field value.
>
> So, if the patch was applied correctly, the response should look like:
>
> <lst name="collapse_counts">
>   <str name="field">col</str>
>   <lst name="results">
>     <lst name="354"> (the head value of the collapsed group)
>       <int name="collapseCount">1</int>
>       <str name="fieldValue">bbb</str>
>       <result name="collapsedDocs" numFound="1" start="0">
>         <doc>
>           <long name="id">2</long>
>           <str name="content">aaa aaa</str>
>           <str name="col">bbb</str>
>         </doc>
>       </result>
>     </lst>
>     <lst name="654">
>       <int name="collapseCount">1</int>
>       <str name="fieldValue">ccc</str>
>       <result name="collapsedDocs" numFound="1" start="0">
>         <doc>
>           <long name="id">8</long>
>           <str name="content">aaa aaa aaa sd</str>
>           <str name="col">ccc</str>
>         </doc>
>       </result>
>     </lst>
>     <lst name="654">
>       <int name="collapseCount">1</int>
>       <str name="fieldValue">xxx</str>
>       <result name="collapsedDocs" numFound="4" start="0">
>         <doc>
>           <long name="id">12</long>
>           <str name="content">aaa aaa aaa v</str>
>           <str name="col">xxx</str>
>         </doc>
>       </result>
>     </lst>
>   </lst>
> </lst>
>
> Is this the way the response looks when you use the patch?
> Thanks in advance
>
> Martijn v Groningen wrote:
>> Hi Marc,
>> I'm not sure if I follow you completely, but the example you gave is not complete; I'm missing a few tags in it. Let's assume the following response, which the latest patches produce:
>>
>> <lst name="collapse_counts">
>>   <str name="field">cat</str>
>>   <lst name="results">
>>     <lst name="009">
>>       <str name="fieldValue">hard</str>
>>       <int name="collapseCount">1</int>
>>       <result name="collapsedDocs" numFound="1" start="0">
>>         <doc>
>>           <long name="id">008</long>
>>           <str name="content">aaa aaa</str>
>>           <str name="col">ccc</str>
>>         </doc>
>>       </result>
>>     </lst>
>>     ...
>>   </lst>
>> </lst>
>>
>> The results list contains collapse groups. The names of the child elements are the collapse head ids. Everything that falls under a collapse head belongs to that collapse group, so adding the head document id to the field value is unnecessary. In the above example, the document with id 009 is the head of the document with id 008, and the document with id 009 should be displayed in the search result. From what you have said, it seems that you properly configured the patch.
>>
>> Martijn
>>
>> 2009/12/7 Marc Sturlese <marc.sturl...@gmail.com>:
>>> Hey there,
>>> I have been testing the last patch, and I think either I am missing something or the way the collapsed documents are shown when collapsing is adjacent can sometimes be confusing. I am using the patch replacing queryComponent with collapseComponent (not using both at the same time):
>>>
>>> <searchComponent name="query" class="org.apache.solr.handler.component.CollapseComponent"/>
>>>
>>> What I have noticed is: imagine you get these results in the search:
>>>
>>> doc1: id:001 collapseField:ccc
>>> doc2: id:002 collapseField:aaa
>>> doc3: id:003 collapseField:ccc
>>> doc4: id:004 collapseField:bbb
>>>
>>> And in the collapse_counts you get:
>>>
>>> <int name="collapseCount">1</int>
[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-1131:
----------------------------------
    Attachment: SOLR-1131.patch

OK, here's my take on this. I took Yonik's patch and merged it with a patch I had in the works. It's not done, but all tests pass, including the new one I added (PolyFieldTest). Yonik's move to put getFieldQuery in FieldType was just the key to answering the question of how to generate queries given a FieldType.

Notes:
1. I changed the Geo examples to be CoordinateFieldType (representing an abstract coordinate system) and then PointFieldType, which represents a point in an n-dimensional space (default 2D). I think from this we could easily add things like PolygonFieldType, etc., which would allow us to create more sophisticated shapes and do things like intersections. For instance, imagine asking: does this point lie within this shape? I think that might be expressible as a RangeQuery.
2. I'm not sure I care for the name of the new abstract FieldType that is a base class of CoordinateFieldType, called DelegatingFieldType.
3. I'm not sure yet on the properties of the generated fields. Right now I'm delegating the handling to the sub-FieldType, except I'm overriding to turn off storage, which I think is pretty cool (could even work as copy-field-like functionality).
4. I'm not thrilled about creating a SchemaField every time in the createFields protected helper method, but SchemaField is final and doesn't have a setName method (which makes sense).

Questions for Yonik on his patch:
1. Why is TextField overriding getFieldQuery when it isn't called, except possibly via the FieldQParserPlugin?
2. I'm not sure I understand the getDistance and getBoundingBox methods on the GeoFieldType. It seems like that precludes one from picking a specific distance calculation (for instance, sometimes you may want a faster approximation and other times a slower, exact one).

Needs:
1. Write up changes.txt
2. More tests, including performance testing
3. Patch doesn't support dynamic fields yet, but it should

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
> In a few special cases, it makes sense for a single field (the concept) to be indexed as a set of Fields (lucene Field). Consider SOLR-773. The concept "point" may be best indexed in a variety of ways:
> * geohash (single lucene field)
> * lat field, lon field (two double fields)
> * cartesian tiers (a series of fields with tokens to say if it exists within that region)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
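To make the "one field, many lucene fields" idea concrete: a poly-field type expands a single schema-level value into several underlying indexed fields. A toy sketch under assumed naming (the `_0_d`/`_1_d` suffixes and the `createFields` shape here are illustrative, not the patch's actual API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PointFieldSketch {
    // Splits a "lat,lon" value into the underlying field name/value pairs,
    // the way a poly-field type would emit multiple lucene Fields for one
    // schema field. Names of the generated sub-fields are hypothetical.
    static Map<String, String> createFields(String name, String value) {
        String[] parts = value.split(",");
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put(name + "_0_d", parts[0].trim()); // latitude sub-field
        fields.put(name + "_1_d", parts[1].trim()); // longitude sub-field
        return fields;
    }

    public static void main(String[] args) {
        // One logical "home" point becomes two indexed double fields.
        System.out.println(createFields("home", "44.32,-93.14"));
    }
}
```

A query against the logical field would then be rewritten by the FieldType into queries over the generated sub-fields, which is why moving getFieldQuery onto FieldType matters here.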
[jira] Resolved: (SOLR-1628) log contains incorrect number of adds and deletes
     [ https://issues.apache.org/jira/browse/SOLR-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley resolved SOLR-1628.
--------------------------------
    Resolution: Fixed

committed fix.

> log contains incorrect number of adds and deletes
> -------------------------------------------------
>
>                 Key: SOLR-1628
>                 URL: https://issues.apache.org/jira/browse/SOLR-1628
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 1.4
>            Reporter: Yonik Seeley
>             Fix For: 1.5
>
> LogUpdateProcessorFactory logs the wrong number of deletes/adds if more than 8.
> http://search.lucidimagination.com/search/document/f75c6a5a58e205a4/minor_nit

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786951#action_12786951 ]

Chris A. Mattmann commented on SOLR-1131:
-----------------------------------------

Patch is looking good! I'm poring through it right now -- I'll try and test this as part of the work I'm doing on SOLR-1586 -- maybe even update that issue if I get a sec today :)

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
> In a few special cases, it makes sense for a single field (the concept) to be indexed as a set of Fields (lucene Field). Consider SOLR-773. The concept "point" may be best indexed in a variety of ways:
> * geohash (single lucene field)
> * lat field, lon field (two double fields)
> * cartesian tiers (a series of fields with tokens to say if it exists within that region)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Re: [jira] Issue Comment Edited: (SOLR-236) Field collapsing
The request I am sending is:

http://localhost:8983/solr/select/?q=aaa&version=2.2&start=0&rows=20&indent=on&collapse.field=col&collapse.includeCollapsedDocs.fl=*&collapse.type=adjacent&collapse.info.doc=true&collapse.info.count=true

I search for 'aaa' in the content field. All the documents in the result contain that string in the content field.

Martijn v Groningen wrote:
> Yes, it should look similar to that. What is the exact request you send to Solr? Also, to check whether the patch works correctly, can you run: ant clean test? There are a number of tests that test the field collapse functionality.
>
> Martijn
>
> 2009/12/7 Marc Sturlese <marc.sturl...@gmail.com>:
>>> <lst name="collapse_counts">
>>>   <str name="field">cat</str>
>>>   <lst name="results">
>>>     <lst name="009">
>>>       <str name="fieldValue">hard</str>
>>>       <int name="collapseCount">1</int>
>>>       <result name="collapsedDocs" numFound="1" start="0">
>>>         <doc>
>>>           <long name="id">008</long>
>>>           <str name="content">aaa aaa</str>
>>>           <str name="col">ccc</str>
>>>         </doc>
>>>       </result>
>>>     </lst>
>>>     ...
>>>   </lst>
>>> </lst>
>>
>> I see, looks like I am applying the patch wrongly somehow. This is the complete collapse_counts response I am getting:
>>
>> <lst name="collapse_counts">
>>   <str name="field">col</str>
>>   <lst name="results">
>>     <lst>
>>       <int name="collapseCount">1</int>
>>       <int name="collapseCount">1</int>
>>       <int name="collapseCount">1</int>
>>       <str name="fieldValue">bbb</str>
>>       <str name="fieldValue">ccc</str>
>>       <str name="fieldValue">xxx</str>
>>       <result name="collapsedDocs" numFound="1" start="0">
>>         <doc>
>>           <long name="id">2</long>
>>           <str name="content">aaa aaa</str>
>>           <str name="col">bbb</str>
>>         </doc>
>>       </result>
>>       <result name="collapsedDocs" numFound="1" start="0">
>>         <doc>
>>           <long name="id">8</long>
>>           <str name="content">aaa aaa aaa sd</str>
>>           <str name="col">ccc</str>
>>         </doc>
>>       </result>
>>       <result name="collapsedDocs" numFound="4" start="0">
>>         <doc>
>>           <long name="id">12</long>
>>           <str name="content">aaa aaa aaa v</str>
>>           <str name="col">xxx</str>
>>         </doc>
>>       </result>
>>     </lst>
>>   </lst>
>> </lst>
>>
>> As you can see, I am getting a lst tag with no name. As I understood what you told me, I should be getting as many lst tags as collapsed groups, and the name attribute of each lst should be the unique field value.
>>
>> So, if the patch was applied correctly, the response should look like:
>>
>> <lst name="collapse_counts">
>>   <str name="field">col</str>
>>   <lst name="results">
>>     <lst name="354"> (the head value of the collapsed group)
>>       <int name="collapseCount">1</int>
>>       <str name="fieldValue">bbb</str>
>>       <result name="collapsedDocs" numFound="1" start="0">
>>         <doc>
>>           <long name="id">2</long>
>>           <str name="content">aaa aaa</str>
>>           <str name="col">bbb</str>
>>         </doc>
>>       </result>
>>     </lst>
>>     <lst name="654">
>>       <int name="collapseCount">1</int>
>>       <str name="fieldValue">ccc</str>
>>       <result name="collapsedDocs" numFound="1" start="0">
>>         <doc>
>>           <long name="id">8</long>
>>           <str name="content">aaa aaa aaa sd</str>
>>           <str name="col">ccc</str>
>>         </doc>
>>       </result>
>>     </lst>
>>     <lst name="654">
>>       <int name="collapseCount">1</int>
>>       <str name="fieldValue">xxx</str>
>>       <result name="collapsedDocs" numFound="4" start="0">
>>         <doc>
>>           <long name="id">12</long>
>>           <str name="content">aaa aaa aaa v</str>
>>           <str name="col">xxx</str>
>>         </doc>
>>       </result>
>>     </lst>
>>   </lst>
>> </lst>
>>
>> Is this the way the response looks when you use the patch?
>> Thanks in advance
>>
>> Martijn v Groningen wrote:
>>> Hi Marc,
>>> I'm not sure if I follow you completely, but the example you gave is not complete; I'm missing a few tags in it. Let's assume the following response, which the latest patches produce:
>>>
>>> <lst name="collapse_counts">
>>>   <str name="field">cat</str>
>>>   <lst name="results">
>>>     <lst name="009">
>>>       <str name="fieldValue">hard</str>
>>>       <int name="collapseCount">1</int>
>>>       <result name="collapsedDocs" numFound="1" start="0">
>>>         <doc>
>>>           <long name="id">008</long>
>>>           <str name="content">aaa aaa</str>
>>>           <str name="col">ccc</str>
>>>         </doc>
>>>       </result>
>>>     </lst>
>>>     ...
>>>   </lst>
>>> </lst>
>>>
>>> The results list contains collapse groups. The names of the child elements are the collapse head ids. Everything that falls under a collapse head belongs to that collapse group, so adding the head document id to the field value is unnecessary. In the above example, the document with id 009 is the head of the document with id 008, and the document with id 009 should be displayed in the search result. From what you have said, it seems that you properly configured the patch.
>>>
>>> Martijn
>>>
>>> 2009/12/7 Marc Sturlese <marc.sturl...@gmail.com>:
>>>> Hey there,
>>>> I have been testing the last patch, and I think either I am missing something or the way the collapsed documents are shown when collapsing is adjacent can sometimes be confusing. I am using the patch replacing queryComponent with collapseComponent (not using both at same
Re: [jira] Issue Comment Edited: (SOLR-236) Field collapsing
The last two parameters are not necessary, since they both default to true. Could you run the field collapse tests successfully?
[jira] Updated: (SOLR-1621) Allow current single core deployments to be specified by solr.xml
[ https://issues.apache.org/jira/browse/SOLR-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-1621: - Attachment: SOLR-1621.patch the index pages are fixed Allow current single core deployments to be specified by solr.xml - Key: SOLR-1621 URL: https://issues.apache.org/jira/browse/SOLR-1621 Project: Solr Issue Type: New Feature Affects Versions: 1.5 Reporter: Noble Paul Fix For: 1.5 Attachments: SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch supporting two different modes of deployments is turning out to be hard. This leads to duplication of code. Moreover there is a lot of confusion on where do we put common configuration. See the mail thread http://markmail.org/message/3m3rqvp2ckausjnf -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1553) extended dismax query parser
[ https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787021#action_12787021 ] Hoss Man commented on SOLR-1553:

Thoughts while reading the code...
* the code is kind of hard to read ... there's a serious dearth of comments
* reads very kludgy, clearly a hacked-up version of DisMax ... probably want to refactor some helper functions (that can then be documented)
* the clause.field and getFieldName functionality is dangerous for people migrating from dismax to edismax (users guessing field names can query on fields the solr admin doesn't want them to query on) ... we need an option to turn that off.
** one really nice thing about the field query support though: it looks like it would be really easy to add support for arbitrary field name aliasing with something like f.someFieldAlias.qf=realFieldA^3+realFieldB^4
** perhaps getFieldName should only work for fields explicitly enumerated in a param?
* why is TO listed as an operator when building up the phrase boost fields? (line 296) ... if range queries are supported, then shouldn't the upper/lower bounds also be stripped out of the clauses list?
** accepting range queries also seems like something that people should be able to disable
* apparently pf was changed to iteratively build boosting phrase queries for every 'pair' of words, and pf3 is a new param to build boosting phrase queries for every 'triple' of words in the input. While this certainly seems useful, it's not back-compatible ... why not restore 'pf' to its original purpose, and add pf2 for the pairs?
* what is the motivation for ExtendedSolrQueryParser.makeDismax? ... I see that the boost queries built from the pf and pf3 fields are put in BooleanQueries instead of DisjunctionMaxQueries ... but why? (if the user searches for a phrase that's common in many fields of one document, that document is going to get a huge score boost regardless of the tie value, which kind of defeats the point of what the dismax parser is trying to do)
* we should remove the extremely legacy /* legacy logic */ for dealing with bq ... almost no one should care about that, we really don't need to carry it forward in a new parser.
* there are a lot of empty catch blocks that seem like they should at least log a warning or debug message.
* ExtendedAnalyzer feels like a really big hack ... I'm not certain, but I don't think it works correctly if a CharFilter is declared.
* we need to document all these new params (pf3, lowercaseOperators, boost,

Thoughts while testing it out on some really hairy edge cases that break the old dismax parser...
* this is really cool
* this is really freaking cool.
* still has a problem with search strings like foo and foo || ... I suspect it would be an easy fix to recognize these just like AND/OR are recognized and escaped.
* once we fix some of the issues mentioned above, we should absolutely register this using the name dismax by default, and register the old one as oldDismax with a note in CHANGES.txt telling people to use defType=oldDismax if they really need it.

extended dismax query parser Key: SOLR-1553 URL: https://issues.apache.org/jira/browse/SOLR-1553 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley Fix For: 1.5 Attachments: SOLR-1553.patch An improved user-facing query parser based on dismax -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
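For readers following the pf/pf2/pf3 discussion above: the 'pair' and 'triple' boost phrases amount to sliding a fixed-size window over the query terms. A small illustrative sketch of that windowing idea (a hypothetical helper, not the parser's actual code):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the word-window idea behind pf2/pf3: for n query terms,
// emit every run of `size` adjacent terms as a candidate boost phrase.
public class PhraseWindows {
    static List<String> windows(String[] terms, int size) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + size <= terms.length; i++) {
            StringBuilder sb = new StringBuilder();
            for (int j = 0; j < size; j++) {
                if (j > 0) sb.append(' ');
                sb.append(terms[i + j]);
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String[] q = {"wi", "fi", "router"};
        System.out.println(windows(q, 2)); // pairs, the pf2 idea
        System.out.println(windows(q, 3)); // triples, the pf3 idea
    }
}
```

Each window would then be turned into a phrase query against the configured boost fields; the comment's back-compatibility concern is only about which parameter name carries the pair windows.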
[jira] Commented: (SOLR-1607) use a proper key other than IndexReader for ExternalFileField and QueryElevationComponent to work properly when reopenReaders is set to true
[ https://issues.apache.org/jira/browse/SOLR-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787024#action_12787024 ] Hoss Man commented on SOLR-1607: I'm not too familiar with the internals of ExternalFileField and QueryElevationComponent, but if the caching is already broken because of reopen, then now is probably a good time to try to gut their one-off caches and replace them with uses of SolrCache -- that way we can have regenerators for them to autowarm on newSearcher. (ExternalFileField would probably be pretty hard to make work like this, however, because of the way schema.xml resources are isolated from the SolrCore) use a proper key other than IndexReader for ExternalFileField and QueryElevationComponent to work properly when reopenReaders is set to true Key: SOLR-1607 URL: https://issues.apache.org/jira/browse/SOLR-1607 Project: Solr Issue Type: Bug Components: search Affects Versions: 1.4 Reporter: Koji Sekiguchi Assignee: Koji Sekiguchi Priority: Minor Fix For: 1.5 As the reopenReaders feature was introduced in 1.4, it prevents reloading of the external_[fieldname] and elevate.xml files in dataDir when a commit is submitted. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
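For context on the SolrCache suggestion above: user caches declared in solrconfig.xml can name a regenerator class that is used to autowarm entries on newSearcher. A sketch of such a declaration (the cache name and regenerator class here are hypothetical illustrations, not existing Solr classes):

```xml
<!-- Hypothetical user cache for elevation data; the regenerator class
     is an illustration of the autowarming hook, not a shipped class. -->
<cache name="elevationCache"
       class="solr.LRUCache"
       size="512"
       initialSize="128"
       autowarmCount="128"
       regenerator="com.example.ElevationRegenerator"/>
```

The point of the comment is that moving the one-off caches onto this mechanism would make their warm-up behavior consistent with Solr's other caches after a reopen.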
Re: [jira] Issue Comment Edited: (SOLR-236) Field collapsing
Yes, I can reproduce the same situation here. I will update the patch asap and add it to Jira. Martijn

2009/12/7 Marc Sturlese marc.sturl...@gmail.com: Hey! Got it working! The problem was that my uniqueField is indexed as long, and that is not supported by the patch. The value is obtained in the getCollapseGroupResult function in AbstractCollapseCollector.java as:

    String schemaId = searcher.doc(docId).get(uniqueIdFieldname);

To support long, int, slong, sint, float, sfloat, ... it should be obtained by doing something like:

    FieldType idFieldType = searcher.getSchema().getFieldType(uniqueIdFieldname);
    String schemaId = "";
    Fieldable name_field = null;
    try {
        name_field = searcher.doc(docId).getFieldable(uniqueIdFieldname);
    } catch (IOException ex) {
        // deal with exception
    }
    if (name_field != null) {
        schemaId = idFieldType.storedToReadable(name_field);
    }
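The essence of Marc's fix above is that a numeric unique key's stored form is not necessarily its readable form, so decoding must be delegated to the field's type (which is what FieldType.storedToReadable does in Solr). A toy, self-contained model of that delegation; the types below are stand-ins for illustration, not Solr's classes:

```java
// Toy model of per-type stored-to-readable conversion. Solr's sortable
// numeric types keep an encoded stored form; asking the field's type to
// decode it is the delegation Marc's fix introduces.
interface ToyFieldType {
    String storedToReadable(String stored);
}

public class UniqueKeyDemo {
    // A plain string field stores the readable form directly.
    static final ToyFieldType STRING = stored -> stored;
    // A toy "encoded long" type that stores values behind a marker prefix.
    static final ToyFieldType ENCODED_LONG = stored -> stored.substring(2);

    static String readableId(ToyFieldType type, String storedValue) {
        // Always go through the type, never assume stored == readable.
        return type.storedToReadable(storedValue);
    }

    public static void main(String[] args) {
        System.out.println(readableId(STRING, "doc-42"));
        System.out.println(readableId(ENCODED_LONG, "L:12"));
    }
}
```

Reading the stored value directly, as the original line did, only happens to work when the unique key is a plain string type.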
[jira] Commented: (SOLR-1621) Allow current single core deployments to be specified by solr.xml
[ https://issues.apache.org/jira/browse/SOLR-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787031#action_12787031 ] Mark Miller commented on SOLR-1621: --- bq. the index pages are fixed
How are they fixed? I think one problem with how you are handling aliases is that it only works with aliases defined in solr.xml? If you use the request handler to create an alias (with the alias command), I still don't think that works properly. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1624) Highlighter bug with MultiValued field + TermPositions optimization
[ https://issues.apache.org/jira/browse/SOLR-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley resolved SOLR-1624. Resolution: Fixed Fix Version/s: 1.5 Committed. Thanks Chris! Highlighter bug with MultiValued field + TermPositions optimization --- Key: SOLR-1624 URL: https://issues.apache.org/jira/browse/SOLR-1624 Project: Solr Issue Type: Bug Components: highlighter Affects Versions: 1.4 Reporter: Chris Harris Fix For: 1.5 Attachments: SOLR-1624.patch When TermPositions are stored, then DefaultSolrHighlighter.doHighlighting(DocList docs, Query query, SolrQueryRequest req, String[] defaultFields) currently initializes tstream only for the first value of a multi-valued field. (Subsequent times through the loop reinitialization is preempted by tots being non-null.) This means that the 2nd/3rd/etc. values are not considered for highlighting purposes, resulting in missed highlights. I'm attaching a patch with a test case to demonstrate the problem (testTermVecMultiValuedHighlight2), as well as a proposed fix. All highlighter tests pass with this applied. The patch should apply cleanly against the latest trunk. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
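Schematically, the bug described above is a reuse guard keyed on state that survives the loop: because tots stays non-null after the first value, the token stream is never rebuilt for later values of a multi-valued field. A simplified, self-contained model of the broken loop versus the fix (names are illustrative, not the actual DefaultSolrHighlighter code):

```java
import java.util.ArrayList;
import java.util.List;

public class MultiValueGuardDemo {
    // Broken: `cached` survives across values, so values after the first
    // are never processed, mirroring tstream only being built once.
    static List<String> highlightBroken(String[] values) {
        List<String> out = new ArrayList<>();
        Object cached = null;
        for (String v : values) {
            if (cached == null) {
                cached = new Object();        // "build the token stream"
                out.add("<em>" + v + "</em>"); // only the first value is seen
            }
        }
        return out;
    }

    // Fixed: rebuild per value so every value is considered for highlighting.
    static List<String> highlightFixed(String[] values) {
        List<String> out = new ArrayList<>();
        for (String v : values) {
            out.add("<em>" + v + "</em>");
        }
        return out;
    }

    public static void main(String[] args) {
        String[] vals = {"first value", "second value"};
        System.out.println(highlightBroken(vals).size()); // 1: a missed highlight
        System.out.println(highlightFixed(vals).size());  // 2
    }
}
```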
[jira] Commented: (SOLR-139) Support updateable/modifiable documents
[ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787072#action_12787072 ] Mark Diggory commented on SOLR-139: --- I notice this is a very long-lived issue and that it is marked for 1.5. Are there outstanding issues or problems with its usage if I apply it to my 1.4 source? Support updateable/modifiable documents --- Key: SOLR-139 URL: https://issues.apache.org/jira/browse/SOLR-139 Project: Solr Issue Type: New Feature Components: update Reporter: Ryan McKinley Assignee: Ryan McKinley Fix For: 1.5 Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch It would be nice to be able to update some fields on a document without having to insert the entire document. Given the way lucene is structured, (for now) one can only modify stored fields. While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers. for background, see: http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293 -- This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1608) Make it easy to write distributed search test cases
[ https://issues.apache.org/jira/browse/SOLR-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-1608: Attachment: SOLR-1608.patch Removed an extra log statement I had added for debugging. I'll commit this shortly. Make it easy to write distributed search test cases --- Key: SOLR-1608 URL: https://issues.apache.org/jira/browse/SOLR-1608 Project: Solr Issue Type: Improvement Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 1.5 Attachments: SOLR-1608.patch, SOLR-1608.patch, SOLR-1608.patch Extract base class from TestDistributedSearch to make it easier for people to write test cases for distributed components. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1608) Make it easy to write distributed search test cases
[ https://issues.apache.org/jira/browse/SOLR-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-1608. - Resolution: Fixed Committed revision 888115. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-785) Distributed SpellCheckComponent
[ https://issues.apache.org/jira/browse/SOLR-785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-785: --- Attachment: SOLR-785.patch Updating for SOLR-1608 commit. Distributed SpellCheckComponent --- Key: SOLR-785 URL: https://issues.apache.org/jira/browse/SOLR-785 Project: Solr Issue Type: Improvement Components: spellchecker Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Fix For: 1.5 Attachments: SOLR-785.patch, SOLR-785.patch, SOLR-785.patch, SOLR-785.patch, spelling-shard.patch Enhance the SpellCheckComponent to run in a distributed (sharded) environment. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (SOLR-1629) Return to admin page link on registry.jsp goes to wrong page
[ https://issues.apache.org/jira/browse/SOLR-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar reassigned SOLR-1629: --- Assignee: Shalin Shekhar Mangar Return to admin page link on registry.jsp goes to wrong page -- Key: SOLR-1629 URL: https://issues.apache.org/jira/browse/SOLR-1629 Project: Solr Issue Type: Bug Components: web gui Reporter: Michael Ryan Assignee: Shalin Shekhar Mangar Priority: Minor The Return to admin page link on admin/registry.jsp links to the current page. http://svn.apache.org/viewvc/lucene/solr/trunk/src/webapp/web/admin/registry.xsl?revision=815587&view=markup Change <a href="">Return to Admin Page</a> to <a href=".">Return to Admin Page</a>. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (SOLR-1620) log message created null misleading
[ https://issues.apache.org/jira/browse/SOLR-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar reassigned SOLR-1620: --- Assignee: Shalin Shekhar Mangar log message "created null" misleading - Key: SOLR-1620 URL: https://issues.apache.org/jira/browse/SOLR-1620 Project: Solr Issue Type: Bug Affects Versions: 1.4 Reporter: KuroSaka TeruHiko Assignee: Shalin Shekhar Mangar Priority: Minor Original Estimate: 0.5h Remaining Estimate: 0.5h Solr logs a message like this: {noformat} INFO: created null: org.apache.solr.analysis.LowerCaseFilterFactory {noformat} This sounds as if the TokenFilter or Tokenizer was not created and a serious error occurred. But it merely means the component is not named; null is printed because the local variable name has the value null. This is misleading. If the text field type is not named, it should just print a blank rather than the word null. I would suggest that a line in src/java/org/apache/solr/util/plugin/AbstractPluginLoader.java be changed to: {noformat} log.info("created" + ((name != null) ? (" " + name) : "") + ": " + plugin.getClass().getName()); {noformat} from {noformat} log.info("created " + name + ": " + plugin.getClass().getName()); {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
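The suggested change is easy to check in isolation; the following standalone snippet mirrors the two formatting expressions above (a demonstration of the proposed behavior, not the committed code):

```java
public class LogMessageDemo {
    // Old format: a null name is concatenated verbatim into the message.
    static String oldMsg(String name, String className) {
        return "created " + name + ": " + className;
    }

    // Suggested format: drop the name (and its leading space) when absent.
    static String newMsg(String name, String className) {
        return "created" + ((name != null) ? (" " + name) : "") + ": " + className;
    }

    public static void main(String[] args) {
        String cls = "org.apache.solr.analysis.LowerCaseFilterFactory";
        System.out.println(oldMsg(null, cls));   // "created null: ..." (misleading)
        System.out.println(newMsg(null, cls));   // "created: ..."
        System.out.println(newMsg("text", cls)); // "created text: ..."
    }
}
```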
[jira] Resolved: (SOLR-1616) JSON Response for Facets not properly formatted
[ https://issues.apache.org/jira/browse/SOLR-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-1616. - Resolution: Won't Fix Closing this per Yonik's comment. JSON Response for Facets not properly formatted --- Key: SOLR-1616 URL: https://issues.apache.org/jira/browse/SOLR-1616 Project: Solr Issue Type: Bug Affects Versions: 1.4 Reporter: Lou Sacco When making a Solr search call with facets turned on, I notice that the facets JSON string is not properly formatted using wt=json. I would expect a bracketed array around each record rather than running them all together. This is very hard to read with ExtJS, as its JsonReader reads each element as its own record when the paired records are meant to be together. Here's an example of the output I get:

{code}
"facet_counts": {
  "facet_queries": {},
  "facet_fields": {
    "deviceName": ["x2", 6, "dd22", 12, "f12", 1],
    "devicePrgMgr": ["alberto", 80, "anando", 24, "artus", 101],
    "portfolioName": ["zztop", 32],
    "chipsetName": ["fat", 3, "thin", 2],
{code}

As an example, I would expect chipsetName to be formatted so that the JsonReader can read each as a record:

{code}
"chipsetName": [["fat", 3], ["thin", 2]],
{code}

See [here|http://json.org/fatfree.html] for details on JSON arrays. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
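Since the server-side rendering stays as-is per the resolution above, a client can regroup the flat name/count list into explicit pairs itself. A small sketch of that regrouping (helper names are illustrative):

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class FacetPairs {
    // Regroup Solr's flat [name1, count1, name2, count2, ...] facet list
    // into explicit (name, count) pairs for consumers like ExtJS.
    static List<Map.Entry<String, Integer>> toPairs(List<Object> flat) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (int i = 0; i + 1 < flat.size(); i += 2) {
            pairs.add(new AbstractMap.SimpleEntry<>(
                (String) flat.get(i), (Integer) flat.get(i + 1)));
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Object> flat = java.util.Arrays.asList("fat", 3, "thin", 2);
        System.out.println(toPairs(flat)); // [fat=3, thin=2]
    }
}
```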
Best practices in Solr Schema Design
Hello everybody, I am trying to port data from a PostgreSQL database instance into Solr, and I am getting flummoxed coming up with an efficient schema design. Could somebody give me general pointers on how I can go about achieving this? With Regards Sri -- View this message in context: http://old.nabble.com/Best-practices-in-Solr-Schema-Design-tp26684128p26684128.html Sent from the Solr - Dev mailing list archive at Nabble.com.
[jira] Resolved: (SOLR-343) Constraining date facets by facet.mincount
[ https://issues.apache.org/jira/browse/SOLR-343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-343. --- Resolution: Fixed Fix Version/s: 1.5 Assignee: Hoss Man Patch looks good, test looks good. thanks guys! Constraining date facets by facet.mincount -- Key: SOLR-343 URL: https://issues.apache.org/jira/browse/SOLR-343 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.2 Environment: Solr 1.2+ Reporter: Raiko Eckstein Assignee: Hoss Man Priority: Minor Fix For: 1.5 Attachments: DateFacetsMincountPatch.patch, SOLR-343.patch It would be helpful to allow the facet.mincount parameter to work with date facets, i.e. constraining the results so that it would be possible to filter out date ranges in the results where no documents occur from the server-side. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: Build Solr index using Hadoop MapReduce
Build Solr index using Hadoop MapReduce http://issues.apache.org/jira/browse/SOLR-1045 Ning Li-3 wrote: SOLR-1045 it is. More details will be available in that issue. Marc, you can check out Hadoop contrib/index, which builds a Lucene index using Hadoop MapReduce. However, it does not handle duplicate detection. Cheers, Ning On Mon, Mar 2, 2009 at 4:25 PM, Marc Sturlese marc.sturl...@gmail.com wrote: I am doing some research about creating a lucene/solr index using hadoop, but there's not so much info around; it would be great to see some code!!! (I am experiencing problems especially in duplication detection) Thanks Shalin Shekhar Mangar wrote: On Mon, Mar 2, 2009 at 11:24 PM, Ning Li ning.li...@gmail.com wrote: Hi, I wonder if there is interest in a contrib module that builds a Solr index using Hadoop MapReduce? Absolutely! It is different from the Solr support in Nutch. The Solr support in Nutch sends a document to a Solr server in a reduce task. Here, I aim at building/updating the Solr index within map/reduce tasks. Also, it achieves better parallelism when the number of map tasks is greater than the number of reduce tasks, which is usually the case. I worked out a very simple initial version. But I want to check if there is any interest before proceeding. If so, I'll open a Jira issue. +1 Please do. It'd be great to see this in Solr. -- Regards, Shalin Shekhar Mangar. -- View this message in context: http://www.nabble.com/Build-Solr-index-using-Hadoop-MapReduce-tp22293172p22296832.html Sent from the Solr - Dev mailing list archive at Nabble.com. -- View this message in context: http://old.nabble.com/Build-Solr-index-using-Hadoop-MapReduce-tp22293172p26684154.html Sent from the Solr - Dev mailing list archive at Nabble.com.
[jira] Commented: (SOLR-1586) Create Spatial Point FieldTypes
[ https://issues.apache.org/jira/browse/SOLR-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787113#action_12787113 ] Grant Ingersoll commented on SOLR-1586: --- FYI, see SOLR-1131 for an implementation of a Point Field Type. Create Spatial Point FieldTypes --- Key: SOLR-1586 URL: https://issues.apache.org/jira/browse/SOLR-1586 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Fix For: 1.5 Attachments: examplegeopointdoc.patch.txt, SOLR-1586.Mattmann.112209.geopointonly.patch.txt, SOLR-1586.Mattmann.112209.geopointonly.patch.txt, SOLR-1586.Mattmann.112409.geopointandgeohash.patch.txt, SOLR-1586.Mattmann.112409.geopointandgeohash.patch.txt, SOLR-1586.Mattmann.112509.geopointandgeohash.patch.txt Per SOLR-773, create field types that hide the details of creating tiers, geohash and lat/lon fields. Fields should take in lat/lon points in a single form, as in: <field name="foo">lat lon</field> -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1586) Create Spatial Point FieldTypes
[ https://issues.apache.org/jira/browse/SOLR-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787115#action_12787115 ] Grant Ingersoll commented on SOLR-1586: --- bq. we should have the ability to output those fields as georss per ryan's suggestion Ryan can correct me if I am putting words in his mouth, but I don't think he literally meant we needed to use those exact tags. I think he just meant the format of the actual values. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1586) Create Spatial Point FieldTypes
[ https://issues.apache.org/jira/browse/SOLR-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787121#action_12787121 ] Chris A. Mattmann commented on SOLR-1586: - Hey Grant: bq. Ryan can correct me if I am putting words in his mouth, but I don't think he literally meant we needed to use those exact tags. I think he just meant the format of the actual values. Ah no worries -- I think it would be a nice feature to actually output using those exact tags. That's the point of a standard, right? With the tags comes namespacing and all that good stuff, which I believe to be important. Also, since XmlWriter is even more flexible per SOLR-1592, I see no reason not to use those tags in the output. Cheers, Chris
[jira] Commented: (SOLR-1586) Create Spatial Point FieldTypes
[ https://issues.apache.org/jira/browse/SOLR-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787150#action_12787150 ] Chris A. Mattmann commented on SOLR-1586: - bq. FYI, see SOLR-1131 for an implementation of a Point Field Type. Sure, I'll take a look at it and try to bring this patch up to speed w.r.t. that. Independently though, the geohash implementation I put up should be good to go right now. Please take a look and let me know if you are +1 to commit. I included an example doc to test it out with. Cheers, Chris
[jira] Commented: (SOLR-433) MultiCore and SpellChecker replication
[ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12787155#action_12787155 ] Jason Rutherglen commented on SOLR-433: --- Are the existing patches for multiple cores or only for spellchecking? MultiCore and SpellChecker replication -- Key: SOLR-433 URL: https://issues.apache.org/jira/browse/SOLR-433 Project: Solr Issue Type: Improvement Components: replication (scripts), spellchecker Affects Versions: 1.3 Reporter: Otis Gospodnetic Fix For: 1.5 Attachments: RunExecutableListener.patch, SOLR-433-r698590.patch, SOLR-433.patch, SOLR-433.patch, SOLR-433.patch, SOLR-433.patch, solr-433.patch, SOLR-433_unified.patch, spellindexfix.patch With MultiCore functionality coming along, it looks like we'll need to be able to: A) snapshot each core's index directory, and B) replicate any and all cores' complete data directories, not just their index directories. Pulled from the spellchecker and multi-core index replication thread - http://markmail.org/message/pj2rjzegifd6zm7m Otis: I think that makes sense - distribute everything for a given core, not just its index. And the spellchecker could then also have its data dir (and only index/ underneath really) and be replicated in the same fashion. Right? Ryan: Yes, that was my thought. If an arbitrary directory could be distributed, then you could have /path/to/dist/index/... /path/to/dist/spelling-index/... /path/to/dist/foo and that would all get put into a snapshot. This would also let you put multiple cores within a single distribution: /path/to/dist/core0/index/... /path/to/dist/core0/spelling-index/... /path/to/dist/core0/foo /path/to/dist/core1/index/... /path/to/dist/core1/spelling-index/... /path/to/dist/core1/foo -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
RE: SOLR-1131 - Multiple Fields per Field Type
: <fieldType name="latlon" type="LatLonFieldType" pattern="location__*"/>
: <fieldType name="latlon_home" type="LatLonFieldType" pattern="location_home_*"/>
: <fieldType name="latlon_work" type="LatLonFieldType" pattern="location_home_*"/>
:
: <field name="location" type="latlon"/>
: <field name="location_home" type="latlon_home"/>
: <field name="location_work" type="latlon_work"/>

I'm not really understanding the value of an approach like that. For starters, what Lucene field names would ultimately be created in those examples? And if I also added...

<field name="other_location" type="latlon"/>
<dynamicField name="*_dynamic_location" type="latlon"/>

...then what field names would be created under the covers?

: I think it makes more sense to define the heterogeneity at the fieldType level because:
:
: (a) it's a bit more consistent with the existing solr schema examples, where the difference between many of the field types (e.g., ints and tints, which are both solr.TrieIntField's, date and tdate, both instances of solr.TrieDateField, with different configuration, etc.)
:
: (b) isolation of change: fieldType defs will change less often than field defs, where names and indexed/stored/etc. debugging are likely to occur more frequently

...this just feels wrong to me ... I can't really explain why. It seems like you are suggesting that every <field/> declaration would need a one-to-one correspondence with a unique <fieldType/> declaration in order to prevent field name collisions, which sounds sketchy enough ... but I'm also not fond of the idea that a person editing the schema can't just look at the <field/> and <dynamicField/> names to ensure that they understand what underlying fields are being created (so they don't inadvertently add a new one that collides) ... now they also have to look at the pattern attribute of every <fieldType/> that is a poly field. Letting <dynamicField/> drive everything just seems a *lot* simpler ... both as far as implementation, and as far as maintaining the schema.
: I don't think the above hybrid approach will lead to anything other than
: confusion, as you indicated above. Let's stick to the pattern defs at
: the fieldType level, and then let the fieldType handle the internal
: dynamicity with e.g., a dynamicField, and then notify the schema user

From the standpoint of reading a schema.xml file, the approach you're describing of a pattern attribute on <fieldType/> declarations actually seems more confusing than the strawman suggestion I made of a pattern attribute on <field/> ... even without understanding what concrete fields you are suggesting would be created with a configuration like that, it still increases the number of places you have to look to see what field names are getting created. -Hoss
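For reference, the existing `<dynamicField/>` mechanism that the thread keeps coming back to is declared directly in schema.xml. A minimal sketch (the glob pattern and `tdouble` type here are illustrative choices following the stock example schema, not part of the proposal being debated):

```xml
<!-- Any incoming field whose name matches the glob is created on the fly
     with the declared type; e.g. home_0_latLon and home_1_latLon would
     both become indexed tdouble fields without separate declarations. -->
<dynamicField name="*_latLon" type="tdouble" indexed="true" stored="true"/>
```

This is the "look at the declared names to see what underlying fields exist" property Hoss argues for: every Lucene field name is derivable from the `name` globs alone.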
Re: SOLR-1131 - Multiple Fields per Field Type
: I'm not sure if you worry about it. But I'd argue it isn't natural
: anyway. You would do the following instead, which is how any address
: book I've ever seen works:
: <field name="home" type="LatLonFT"/>
: <field name="work" type="LatLonFT"/>

...the home vs work distinction was arbitrary. The point is: what if I want to support an arbitrary number of distinct values in a PolyField? ... With your approach any attempt to search for people near X would require me to search for "work near X" or "home near X" ... which is analogous to one of the main purposes of multivalued fields: so I don't have to uniquely name every Field instance. I might have a thousand unique (but unnamed) locations that I want to associate with a document, and I want to search for documents with a location near X ... likewise I might have thousands of unique polygons associated with a document and I want to search for documents where one or more polygons overlap with an input polygon (ie: island nations overlapping with the flight path of an airplane). The question is: how can/would PolyFields deal with input like this? ... We've discussed cardinality in the number of fields produced by a single input value, but we haven't really discussed cardinality in the number of input values.

: So, maybe the FT can explicitly prohibit multivalued? But, I suppose
: you could do the position thing, too. This could be achieved through a
: new SpanQuery pretty easily: SpanPositionQuery that takes in a term and
: a specific position. Trivial to write, I think, just not sure if it is
: generally useful. Although, I must say I've been noodling around with

The problem is how do you let the PolyField specify the position when indexing? The last API I saw fleshed out in this discussion didn't give the PolyField any information about how many input values were in any given doc, it just allowed PolyFields to be String=>Field[] black boxes (as opposed to the String=>Field[] black box FieldTypes must currently be). We can't assume even basic lastPosition+1 type logic for these polyfields, because different input values might produce Field arrays containing different quantities of fields, with different names. If a CartesianPolyFieldType can get away with only using the grid_level1 and grid_level2 fields for one input value, but other input values require using grid_level1, grid_level2, and grid_level3, then simple position increments aren't enough if a document has multiple values (some of which need 2 different Field names, and others that need 3) -Hoss
Re: SOLR-1131 - Multiple Fields per Field Type
On Dec 7, 2009, at 5:59 PM, Chris Hostetter wrote:

: <fieldType name="latlon" type="LatLonFieldType" pattern="location__*"/>
: <fieldType name="latlon_home" type="LatLonFieldType" pattern="location_home_*"/>
: <fieldType name="latlon_work" type="LatLonFieldType" pattern="location_home_*"/>
:
: <field name="location" type="latlon"/>
: <field name="location_home" type="latlon_home"/>
: <field name="location_work" type="latlon_work"/>

I'm not really understanding the value of an approach like that. For starters, what Lucene field names would ultimately be created in those examples? And if I also added...

Have a look at the patch I put up today. I think it is going to work quite well, but that could be jet-lag induced delirium at this point. For a field type:

<fieldType name="point" type="solr.PointType" dimension="2" subFieldType="double"/>

and a field declared as:

<field name="home" type="point" indexed="true" stored="true"/>

And a new document of:

<doc>
  <field name="point">39.0 -79.434</field>
</doc>

There are three fields created:

home -- Contains the stored value
home___0 -- Contains 39.0 indexed as a double (as in the double FieldType, not just a double-precision value)
home___1 -- Contains -79.434 as a double

<field name="other_location" type="latlon"/>
<dynamicField name="*_dynamic_location" type="latlon"/>

...then what field names would be created under the covers?

: I think it makes more sense to define the heterogeneity at the fieldType level because:
:
: (a) it's a bit more consistent with the existing solr schema examples, where the difference between many of the field types (e.g., ints and tints, which are both solr.TrieIntField's, date and tdate, both instances of solr.TrieDateField, with different configuration, etc.)
:
: (b) isolation of change: fieldType defs will change less often than field defs, where names and indexed/stored/etc. debugging are likely to occur more frequently

...this just feels wrong to me ... I can't really explain why. It seems like you are suggesting that every <field/> declaration would need a one-to-one correspondence with a unique <fieldType/> declaration in order to prevent field name collisions, which sounds sketchy enough ... but I'm also not fond of the idea that a person editing the schema can't just look at the <field/> and <dynamicField/> names to ensure that they understand what underlying fields are being created (so they don't inadvertently add a new one that collides) ... now they also have to look at the pattern attribute of every <fieldType/> that is a poly field. Letting <dynamicField/> drive everything just seems a *lot* simpler ... both as far as implementation, and as far as maintaining the schema.

I don't agree. It requires more configuration and more knowledge by the end user, and doesn't hide the details.
Re: SOLR-1131 - Multiple Fields per Field Type
On Dec 7, 2009, at 6:13 PM, Chris Hostetter wrote:

: I'm not sure if you worry about it. But I'd argue it isn't natural
: anyway. You would do the following instead, which is how any address
: book I've ever seen works:
: <field name="home" type="LatLonFT"/>
: <field name="work" type="LatLonFT"/>

...the home vs work distinction was arbitrary. The point is: what if I want to support an arbitrary number of distinct values in a PolyField?

This is the beauty of Yonik's addition of getFieldQuery() to the FieldType. The FieldType will be aware of the arbitrariness. Furthermore, it can reflect on the index itself via IndexReader.getFieldNames() to determine the number of Fields that actually exist if it has to. However, my guess is that in practice, in most situations, the FieldType author/user will have the info it needs. Still, I think we can also evolve if we need to.

... with your approach any attempt to search for people near X would require me to search for "work near X" or "home near X" ... which is analogous to one of the main purposes of multivalued fields: so I don't have to uniquely name every Field instance.

Sure, but would you really ever model multiple locations like that in the same field? I don't think in practice that you would, so I think it is a bit of a red herring. Perhaps there is a different use case that better demonstrates it?

I might have a thousand unique (but unnamed) locations that I want to associate with a document, and I want to search for documents with a location near X ... likewise I might have thousands of unique polygons associated with a document and I want to search for documents where one or more polygons overlap with an input polygon (ie: island nations overlapping with the flight path of an airplane).

I don't think this implementation precludes that. The FunctionQueries only operating on a single-valued field does, however. Setting that aside, we could write a Query that does what you want, I think.

The question is: how can/would PolyFields deal with input like this? ... We've discussed cardinality in the number of fields produced by a single input value, but we haven't really discussed cardinality in the number of input values.

I'm not sure that it does, but I don't know that it needs to just yet. This might be where an R-Tree implementation comes in handy, but I'll leave it to the geo-experts to discuss more. I also am not sure how the PolyField case is any different than the dynamic field case. Either way, something needs to know the names of the fields that were created.

: So, maybe the FT can explicitly prohibit multivalued? But, I suppose
: you could do the position thing, too. This could be achieved through a
: new SpanQuery pretty easily: SpanPositionQuery that takes in a term and
: a specific position. Trivial to write, I think, just not sure if it is
: generally useful. Although, I must say I've been noodling around with

The problem is how do you let the PolyField specify the position when indexing? The last API I saw fleshed out in this discussion didn't give the PolyField any information about how many input values were in any given doc, it just allowed PolyFields to be String=>Field[] black boxes (as opposed to the String=>Field[] black box FieldTypes must currently be). We can't assume even basic lastPosition+1 type logic for these polyfields, because different input values might produce Field arrays containing different quantities of fields, with different names. If a CartesianPolyFieldType can get away with only using the grid_level1 and grid_level2 fields for one input value, but other input values require using grid_level1, grid_level2, and grid_level3, then simple position increments aren't enough if a document has multiple values (some of which need 2 different Field names, and others that need 3)

That's not how the Cartesian Field stuff works, but I think I see what you are getting at, and I would say I'm going to explicitly punt on that right now. Ultimately, I think when such a case comes up, the FieldType needs to be configured to be able to determine this information. -Grant
Re: Solr Cell revamped as an UpdateProcessor?
On Dec 7, 2009, at 3:51 PM, Chris Hostetter wrote: As someone with very little knowledge of Solr Cell and/or Tika, I find myself wondering if ExtractingRequestHandler would make more sense as an ExtractingUpdateProcessor -- where it could be configured to take either binary fields (or string fields containing URLs) out of the Documents, parse them with Tika, and add the various XPath-matching hunks of text back into the document as new fields. Then ExtractingRequestHandler just becomes a handler that slurps up its ContentStreams and adds them as binary data fields, and adds the other literal params as fields. Wouldn't that make things like SOLR-1358, and using Tika with URLs/filepaths in XML and CSV based updates, fairly trivial? It probably could, but I am not sure how it works in a processor chain. However, I'm not sure I understand how they work all that much either. I also plan on adding, BTW, a SolrJ client for Tika that does the extraction on the client. In many cases, the ExtrReqHandler is really only designed for lighter weight extraction cases, as one would simply not want to send that much rich content over the wire.
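For context, update processors are wired up as a chain in solrconfig.xml. A sketch of what Hoss's idea might look like: the `ExtractingUpdateProcessorFactory` shown here is hypothetical (it does not exist; it is the processor being proposed), while the chain element and the Log/Run factories are the stock ones:

```xml
<updateRequestProcessorChain name="extract">
  <!-- Hypothetical: would pull binary fields (or URL-valued string fields)
       out of each incoming document, run them through Tika, and add the
       extracted text back into the document as new fields. -->
  <processor class="solr.ExtractingUpdateProcessorFactory"/>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The appeal of this arrangement is that any update path (XML, CSV, SolrJ) routed through the chain would get extraction for free, rather than only requests sent to the ExtractingRequestHandler endpoint.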
Inconsistent Search Results for different flavors of same search term
Hello, I was performing a search on different versions of the term "San Jose" on my Solr instance, the differing versions being: "san jose" (all lowercase), "San jose" (one uppercase), "San Jose" (capitalized first letters), "SAN JOSE" (all caps). Each of these phrases returns a different number of hits in the response. For example: "san jose" returns <result name="response" numFound="0" start="0">, "San jose" returns <result name="response" numFound="4" start="0">, "San Jose" returns <result name="response" numFound="16" start="0">, and "SAN JOSE" returns <result name="response" numFound="853" start="0">. How do I make my search not case sensitive? -- View this message in context: http://old.nabble.com/Inconsistent-Search-Results-for-different-flavors-of-same-search-term-tp26686294p26686294.html Sent from the Solr - Dev mailing list archive at Nabble.com.
[jira] Commented: (SOLR-1586) Create Spatial Point FieldTypes
[ https://issues.apache.org/jira/browse/SOLR-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787198#action_12787198 ] Grant Ingersoll commented on SOLR-1586: --- Can you put up a patch containing just the geohash stuff?
[jira] Updated: (SOLR-1277) Implement a Solr specific naming service (using Zookeeper)
[ https://issues.apache.org/jira/browse/SOLR-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-1277: -- Attachment: SOLR-1277.patch Inching forward as we try and nail down the layout. * moves the configs to /solr/configs/collection1 in the tests * which config to load is discovered from /solr/collections/collection1/config=collection1 * system property for the name of the collection to work with * consolidated zookeeper host and solr path sys properties into one, i.e. localhost:2181/solr I still expect everything in this patch to be very fluid and change as we move forward - but it's something to give us a base to play with. We should probably start a ZooKeeper branch since this issue is likely to get quite large and hopefully have many contributors - that model has worked quite well with the flexible indexing issue in Lucene, and I have gotten quite handy at quick merging from my practice there ;) Implement a Solr specific naming service (using Zookeeper) -- Key: SOLR-1277 URL: https://issues.apache.org/jira/browse/SOLR-1277 Project: Solr Issue Type: New Feature Affects Versions: 1.4 Reporter: Jason Rutherglen Assignee: Grant Ingersoll Priority: Minor Fix For: 1.5 Attachments: log4j-1.2.15.jar, SOLR-1277.patch, SOLR-1277.patch, SOLR-1277.patch, SOLR-1277.patch, zookeeper-3.2.1.jar Original Estimate: 672h Remaining Estimate: 672h The goal is to give Solr server clusters self-healing attributes where if a server fails, indexing and searching don't stop and all of the partitions remain searchable. For configuration, the ability to centrally deploy a new configuration without servers going offline. We can start with basic failover and start from there? Features: * Automatic failover (i.e. when a server fails, clients stop trying to index to or search it) * Centralized configuration management (i.e. 
new solrconfig.xml or schema.xml propagates to a live Solr cluster) * Optionally allow shards of a partition to be moved to another server (i.e. if a server gets hot, move the hot segments out to cooler servers). Ideally we'd have a way to detect hot segments and move them seamlessly. With NRT this becomes somewhat more difficult but not impossible? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1606) Integrate Near Realtime
[ https://issues.apache.org/jira/browse/SOLR-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787206#action_12787206 ] Jason Rutherglen commented on SOLR-1606: Koji, Looks like a change to trunk is causing the error; also, when I step through it passes, when I run without stepping it fails... Integrate Near Realtime Key: SOLR-1606 URL: https://issues.apache.org/jira/browse/SOLR-1606 Project: Solr Issue Type: Improvement Components: update Affects Versions: 1.4 Reporter: Jason Rutherglen Priority: Minor Fix For: 1.5 Attachments: SOLR-1606.patch We'll integrate IndexWriter.getReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1606) Integrate Near Realtime
[ https://issues.apache.org/jira/browse/SOLR-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787208#action_12787208 ] Mark Miller commented on SOLR-1606: --- Don't we need a new command, like update_realtime (bad name, I know) or something? Else you will be doing a full commit every time you get the new reader?
[jira] Issue Comment Edited: (SOLR-1606) Integrate Near Realtime
[ https://issues.apache.org/jira/browse/SOLR-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787208#action_12787208 ] Mark Miller edited comment on SOLR-1606 at 12/8/09 12:13 AM: - Don't we need a new command, like update_realtime (bad name, I know) or something? Else you will be doing a full commit every time you get the new reader? *edit* I see - you skip the commit - I think we should make a new command though, shouldn't we? Still allow a standard commit, but a new command that kicks in the realtime refresh? was (Author: markrmil...@gmail.com): Don't we need a new command, like update_realtime (bad name, I know) or something? Else you will be doing a full commit every time you get the new reader?
[jira] Created: (SOLR-1631) NPE's reported from QueryComponent.mergeIds
NPE's reported from QueryComponent.mergeIds --- Key: SOLR-1631 URL: https://issues.apache.org/jira/browse/SOLR-1631 Project: Solr Issue Type: Bug Components: search Reporter: Hoss Man Multiple reports of QueryComponent.mergeIds occasionally throwing NPE... http://markmail.org/message/aqzaaphbuow4sa5o http://old.nabble.com/NullPointerException-thrown-during-updates-to-index-to26613309.html#a26613309 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1606) Integrate Near Realtime
[ https://issues.apache.org/jira/browse/SOLR-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787221#action_12787221 ] Jason Rutherglen commented on SOLR-1606: bq. Don't we need a new command, like update_realtime We could, however it'd work the same as commit? Meaning afterwards, all pending changes (including deletes) are available? The commit command is fairly overloaded as is. Are you thinking in terms of replication?
Re: Inconsistent Search Results for different flavors of same search term
First, this is the developer's list; I think this question would be better suited to the user's list. You get searches to be case insensitive by indexing and searching with an analyzer that, say, lowercases. If you post on the user's list, please include the analyzer definitions for the fields in question *and* your query. From your email, I can't tell if, for instance, you're even searching against the same field for both terms. I.e., if you're searching something like title:san jose, then san would go against the title field while jose would go against the default search field... If you want to be really thorough, also post the results of your query with debugQuery=on. The schema browser in your Solr admin page might help, and Luke can be used to examine what's actually in your index. Best Erick On Mon, Dec 7, 2009 at 6:36 PM, insaneyogi3008 insaney...@gmail.com wrote: How do I make my search not case sensitive?
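A minimal case-insensitive field type along the lines Erick describes, assuming the stock tokenizer and filter factories (the type name `text_ci` is just a placeholder; adapt it to your schema):

```xml
<fieldType name="text_ci" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Lowercases tokens at both index and query time, so
         "San Jose", "san jose", and "SAN JOSE" all match the
         same indexed terms. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because the same analyzer is applied at index and query time, the four capitalizations in the original question would all produce the terms `san` and `jose` and return the same result count.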
Re: Inconsistent Search Results for different flavors of same search term
Look at http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters. But before you make changes, get familiar with the analysis section of the admin interface: http://localhost:8983/solr/admin/analysis.jsp?highlight=on Of course, adjust the path for your server. This will let you see what the analyzers are doing at index and query time, and is VERY helpful in understanding the analysis process. Tom On Mon, Dec 7, 2009 at 3:36 PM, insaneyogi3008 insaney...@gmail.com wrote: How do I make my search not case sensitive?
Re: Inconsistent Search Results for different flavors of same search term
I resolved this kind of situation by (a) converting values to lower case in DIH while indexing, and (b) converting free-text keywords to lowercase in the client code before sending them to Solr. pradeep. --- On Mon, 12/7/09, insaneyogi3008 insaney...@gmail.com wrote: From: insaneyogi3008 insaney...@gmail.com Subject: Inconsistent Search Results for different flavors of same search term To: solr-dev@lucene.apache.org Date: Monday, December 7, 2009, 3:36 PM How do I make my search not case sensitive?
Re: SOLR-1131 - Multiple Fields per Field Type
Hi Hoss, : fieldType name=latlon type=LatLonFieldType pattern=location__* / : fieldType name=latlon_home type=LatLonFieldType pattern=location_home_*/ : fieldType name=latlon_work type=LatLonFieldType pattern=location_home_*/ : : field name=location type=latlon/ : field name=location_home type=latlon_home/ : field name=location_work type=latlon_work/ I'm not really understanding the value of an approach like that. for starters, what Lucene field names would ultimately be created in those examples? The first field would be named location__location. The second field would be named location_home_location_home. The third field would be named location_work_location_work. And if i also added... field name=other_location type=latlon/ dynamicField name=*_dynamic_location type=latlon/ ...then what field names would be created under the covers? In general, it would be FieldType#getPattern().stripOffEndRegexStarStuff() + Field#getName(). : I think it makes more sense to define the heterogeneity at the fieldType level because: : : (a) it's a bit more consistent with the existing solr schema examples, : where the difference between many of the field types (e.g., ints and : tints, which are both solr.TrieIntField's, date and tdate, both : instances of solr.TrieDateField, with different configuration, etc.) : : (b) isolation of change: fieldType defs will change less often than : field defs, where names and indexed/stored/etc. debugging are likely : to occur more frequently ...this just feels wrong to me ... i can't really explain why. It seems like you are suggesting thatt every field/ declaration would need a one to one corrispondence with a unique fieldType/ declaration in order to prevent field name collisions, which sounds sketchy enough ... 
but i'm also not fond of the idea that a person editing the schema can't just look at the <field/> and <dynamicField/> names to ensure that they understand what underlying fields are being created (so they don't inadvertently add a new one that collides) ... now they also have to look at the pattern attribute of every <fieldType/> that is a poly field.
Well, if this feels wrong to you then I think the schema.xml file that ships with Solr should feel wrong as well, because it uses the exact same pattern for defining field type variations. That is, differences between FieldType representations for ints and tints are not stored as variations on the SchemaField definition itself; they are stored as variations on the FieldTypes (e.g., a different precisionStep in the case of int [0] versus that of tint [8]). Based on what you are proposing, why isn't precisionStep an attribute on field, rather than fieldType, in those examples?
letting <dynamicField/> drive everything just seems a *lot* simpler ... both as far as implementation, and as far as maintaining the schema.
Possibly. It's also a lot less traceable. It's implicit versus explicit, which I'm not sure leads to simplicity in the end.
: I don't think the above hybrid approach will lead to anything other than
: confusion, as you indicated above. Let's stick to the pattern defs at
: the fieldType level, and then let the fieldType handle the internal
: dynamicity with e.g., a dynamicField, and then notify the schema user
From the standpoint of reading a schema.xml file, the approach you're describing of a pattern attribute on <fieldType/> declarations actually seems more confusing than the strawman suggestion i made of a pattern attribute on <field/> ... even without understanding what concrete fields you are suggesting would be created with a configuration like that, it still increases the number of places you have to look to see what field names are getting created.
How so? In actuality, it reduces it.
Instead of having pattern definitions on fields (of which there is a greater chance of having more), you have them on field types?
Cheers,
Chris
++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory, Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++
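The naming rule debated above ("FieldType#getPattern().stripOffEndRegexStarStuff() + Field#getName()") can be made concrete with a small sketch. Note that stripOffEndRegexStarStuff is not a real Solr method — it was shorthand in the thread — so everything below is a hypothetical illustration of the proposed derivation, not existing API:

```java
public class PolyFieldNames {
    // Hypothetical: drop the trailing "*" glob from the fieldType's pattern,
    // then append the field's declared name, as described in the thread.
    static String underlyingName(String pattern, String fieldName) {
        String prefix = pattern.endsWith("*")
                ? pattern.substring(0, pattern.length() - 1)
                : pattern;
        return prefix + fieldName;
    }

    public static void main(String[] args) {
        // reproduces the names Chris gives for the three example fields
        System.out.println(underlyingName("location__*", "location"));
        System.out.println(underlyingName("location_home_*", "location_home"));
        System.out.println(underlyingName("location_work_*", "location_work"));
    }
}
```

This also shows the collision hazard Hoss raises: two fields of the same type (e.g. location and other_location, both latlon) share one pattern prefix, so a reader must check every poly fieldType's pattern to predict the underlying Lucene field names.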
[jira] Updated: (SOLR-1586) Create Spatial Point FieldTypes
[ https://issues.apache.org/jira/browse/SOLR-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated SOLR-1586: Attachment: SOLR-1586.Mattmann.120709.geohashonly.patch.txt updated patch containing only the geohash goodies. Create Spatial Point FieldTypes --- Key: SOLR-1586 URL: https://issues.apache.org/jira/browse/SOLR-1586 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Fix For: 1.5 Attachments: examplegeopointdoc.patch.txt, SOLR-1586.Mattmann.112209.geopointonly.patch.txt, SOLR-1586.Mattmann.112209.geopointonly.patch.txt, SOLR-1586.Mattmann.112409.geopointandgeohash.patch.txt, SOLR-1586.Mattmann.112409.geopointandgeohash.patch.txt, SOLR-1586.Mattmann.112509.geopointandgeohash.patch.txt, SOLR-1586.Mattmann.120709.geohashonly.patch.txt Per SOLR-773, create field types that hide the details of creating tiers, geohash and lat/lon fields. Fields should take in lat/lon points in a single form, as in: <field name="foo">lat lon</field> -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-1632) Distributed IDF
Distributed IDF --- Key: SOLR-1632 URL: https://issues.apache.org/jira/browse/SOLR-1632 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.5 Reporter: Andrzej Bialecki Distributed IDF is a valuable enhancement for distributed search across non-uniform shards. This issue tracks the proposed implementation of an API to support this functionality in Solr. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1632) Distributed IDF
[ https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated SOLR-1632: Attachment: distrib.patch Initial implementation. This supports the current global IDF (i.e. none ;) ), and an exact version of global IDF that requires one additional request per query to obtain per-shard stats. The design should already be flexible enough to implement LRU caching of docFreqs, and ultimately to implement other methods for global IDF calculation (e.g. based on estimation or re-ranking). Distributed IDF --- Key: SOLR-1632 URL: https://issues.apache.org/jira/browse/SOLR-1632 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.5 Reporter: Andrzej Bialecki Attachments: distrib.patch Distributed IDF is a valuable enhancement for distributed search across non-uniform shards. This issue tracks the proposed implementation of an API to support this functionality in Solr. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
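The "exact" variant described above amounts to summing per-shard term statistics before scoring. A minimal sketch of that merge, assuming Lucene's classic idf formula (1 + ln(numDocs / (docFreq + 1))); the actual API in distrib.patch may look quite different:

```java
public class GlobalIdf {
    // Merge per-shard stats collected in the extra request, then apply
    // the classic Lucene idf formula: 1 + ln(numDocs / (docFreq + 1)).
    public static double exactGlobalIdf(long[] shardDocFreqs, long[] shardNumDocs) {
        long docFreq = 0, numDocs = 0;
        for (long df : shardDocFreqs) docFreq += df; // docs containing the term, all shards
        for (long n : shardNumDocs)  numDocs += n;   // total docs, all shards
        return 1.0 + Math.log((double) numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        // two non-uniform shards: the term is rare on one, common on the other
        System.out.println(exactGlobalIdf(new long[]{4, 16}, new long[]{100, 900}));
    }
}
```

Without this merge, each shard scores with its own local docFreq, which is exactly the non-uniform-shard skew the issue is about.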
[jira] Commented: (SOLR-1621) Allow current single core deployments to be specified by solr.xml
[ https://issues.apache.org/jira/browse/SOLR-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12787303#action_12787303 ] Noble Paul commented on SOLR-1621: -- bq. if they cause too much grief, we always have the option to remove. Do we really have a use case for ALIAS? If there is no compelling enough use case we should consider removing it. There is a lot of code which is there just because of this alias feature Allow current single core deployments to be specified by solr.xml - Key: SOLR-1621 URL: https://issues.apache.org/jira/browse/SOLR-1621 Project: Solr Issue Type: New Feature Affects Versions: 1.5 Reporter: Noble Paul Fix For: 1.5 Attachments: SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch supporting two different modes of deployments is turning out to be hard. This leads to duplication of code. Moreover, there is a lot of confusion about where to put common configuration. See the mail thread http://markmail.org/message/3m3rqvp2ckausjnf -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (SOLR-1621) Allow current single core deployments to be specified by solr.xml
[ https://issues.apache.org/jira/browse/SOLR-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12787303#action_12787303 ] Noble Paul edited comment on SOLR-1621 at 12/8/09 4:29 AM: --- bq. if they cause too much grief, we always have the option to remove. Do we really have a use case for ALIAS? If there is no compelling enough use case we should consider removing it. There is a lot of code which is there just because of this alias feature was (Author: noble.paul): bq. if they cause too much grief, we always have the option to remove. Do we really have a use case for ALIAS? If there is no compelling enough use case we should consider removing it. There is a lot of code which is there just because of this alias feature Allow current single core deployments to be specified by solr.xml - Key: SOLR-1621 URL: https://issues.apache.org/jira/browse/SOLR-1621 Project: Solr Issue Type: New Feature Affects Versions: 1.5 Reporter: Noble Paul Fix For: 1.5 Attachments: SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch supporting two different modes of deployments is turning out to be hard. This leads to duplication of code. Moreover, there is a lot of confusion about where to put common configuration. See the mail thread http://markmail.org/message/3m3rqvp2ckausjnf -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1621) Allow current single core deployments to be specified by solr.xml
[ https://issues.apache.org/jira/browse/SOLR-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-1621: - Attachment: SOLR-1621.patch It works now even after the alias command. I still think we should remove the 'alias' command. It is a fancy feature which adds too much complexity to the ref counting of cores Allow current single core deployments to be specified by solr.xml - Key: SOLR-1621 URL: https://issues.apache.org/jira/browse/SOLR-1621 Project: Solr Issue Type: New Feature Affects Versions: 1.5 Reporter: Noble Paul Fix For: 1.5 Attachments: SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch supporting two different modes of deployments is turning out to be hard. This leads to duplication of code. Moreover, there is a lot of confusion about where to put common configuration. See the mail thread http://markmail.org/message/3m3rqvp2ckausjnf -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: Best practices in Solr Schema Design
Sri, please post your questions to solr-user list. This list is for Solr's internal development discussions only. On Tue, Dec 8, 2009 at 2:35 AM, insaneyogi3008 insaney...@gmail.com wrote: Hello everybody, I am trying to port data from a PostGRESql database instance onto Solr , I am getting flummoxed in coming up with an efficient schema design . Could somebody give me general pointers in how I can go about achieving this ? With Regards Sri -- View this message in context: http://old.nabble.com/Best-practices-in-Solr-Schema-Design-tp26684128p26684128.html Sent from the Solr - Dev mailing list archive at Nabble.com. -- Regards, Shalin Shekhar Mangar.
[jira] Updated: (SOLR-1583) Create DataSources that return InputStream
[ https://issues.apache.org/jira/browse/SOLR-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-1583: - Fix Version/s: 1.5 Create DataSources that return InputStream -- Key: SOLR-1583 URL: https://issues.apache.org/jira/browse/SOLR-1583 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Noble Paul Assignee: Noble Paul Priority: Minor Fix For: 1.5 Attachments: SOLR-1583.patch Tika integration means the source has to be binary, that is, the DataSource must be of type DataSource<InputStream>. All the DataSource<Reader> implementations should have a binary counterpart:
* BinURLDataSource<InputStream>
* BinContentStreamDataSource<InputStream>
* BinFileDataSource<InputStream>
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1583) Create DataSources that return InputStream
[ https://issues.apache.org/jira/browse/SOLR-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul resolved SOLR-1583. -- Resolution: Fixed committed r888277 Create DataSources that return InputStream -- Key: SOLR-1583 URL: https://issues.apache.org/jira/browse/SOLR-1583 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Noble Paul Assignee: Noble Paul Priority: Minor Attachments: SOLR-1583.patch Tika integration means the source has to be binary, that is, the DataSource must be of type DataSource<InputStream>. All the DataSource<Reader> implementations should have a binary counterpart:
* BinURLDataSource<InputStream>
* BinContentStreamDataSource<InputStream>
* BinFileDataSource<InputStream>
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1358) Integration of Tika and DataImportHandler
[ https://issues.apache.org/jira/browse/SOLR-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-1358: - Summary: Integration of Tika and DataImportHandler (was: Integration of Solr Cell and DataImportHandler) Integration of Tika and DataImportHandler - Key: SOLR-1358 URL: https://issues.apache.org/jira/browse/SOLR-1358 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Sascha Szott At the moment, it's impossible to configure Solr such that it builds up documents by using data that comes from both pdf documents and database table columns. Currently, to accomplish this task, it's up to the user to add some preprocessing that converts pdf files into plain text files. Therefore, I would like to see an integration of Solr Cell into DIH that makes this preprocessing obsolete. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (SOLR-1358) Integration of Tika and DataImportHandler
[ https://issues.apache.org/jira/browse/SOLR-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul reassigned SOLR-1358: Assignee: Noble Paul Integration of Tika and DataImportHandler - Key: SOLR-1358 URL: https://issues.apache.org/jira/browse/SOLR-1358 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Sascha Szott Assignee: Noble Paul At the moment, it's impossible to configure Solr such that it builds up documents by using data that comes from both pdf documents and database table columns. Currently, to accomplish this task, it's up to the user to add some preprocessing that converts pdf files into plain text files. Therefore, I would like to see an integration of Solr Cell into DIH that makes this preprocessing obsolete. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (SOLR-1358) Integration of Tika and DataImportHandler
[ https://issues.apache.org/jira/browse/SOLR-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12750855#action_12750855 ] Noble Paul edited comment on SOLR-1358 at 12/8/09 6:36 AM: --- Let us provide a new TikaEntityProcessor
{code:xml}
<dataConfig>
  <!-- use any of type DataSource<InputStream> -->
  <dataSource type="BinURLDataSource"/>
  <document>
    <entity processor="TikaEntityProcessor" tikaConfig="tikaconfig.xml" url="${some.var.goes.here}">
    </entity>
  </document>
</dataConfig>
{code}
This most likely would need a BinUrlDataSource/BinContentStreamDataSource because Tika uses binary inputs. My suggestion is that TikaEntityProcessor live in the extraction contrib so that managing dependencies is easier. But we will have to make extraction have a compile-time dependency on DIH. Grant, what do you think?
was (Author: noble.paul): Let us provide a new TikaEntityProcessor
{code:xml}
<entity processor="TikaEntityProcessor" tikaConfig="tikaconfig.xml" url="${some.var.goes.here}">
</entity>
{code}
This most likely would need a BinUrlDataSource/BinContentStreamDataSource because Tika uses binary inputs. My suggestion is that TikaEntityProcessor live in the extraction contrib so that managing dependencies is easier. But we will have to make extraction have a compile-time dependency on DIH. Grant, what do you think?
Integration of Tika and DataImportHandler - Key: SOLR-1358 URL: https://issues.apache.org/jira/browse/SOLR-1358 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Sascha Szott Assignee: Noble Paul At the moment, it's impossible to configure Solr such that it builds up documents by using data that comes from both pdf documents and database table columns. Currently, to accomplish this task, it's up to the user to add some preprocessing that converts pdf files into plain text files. Therefore, I would like to see an integration of Solr Cell into DIH that makes this preprocessing obsolete.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1629) Return to admin page link on registry.jsp goes to wrong page
[ https://issues.apache.org/jira/browse/SOLR-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-1629. - Resolution: Fixed Fix Version/s: 1.5 Committed revision 888281. Thanks Michael! Return to admin page link on registry.jsp goes to wrong page -- Key: SOLR-1629 URL: https://issues.apache.org/jira/browse/SOLR-1629 Project: Solr Issue Type: Bug Components: web gui Reporter: Michael Ryan Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 1.5 The Return to admin page link on admin/registry.jsp links to the current page. http://svn.apache.org/viewvc/lucene/solr/trunk/src/webapp/web/admin/registry.xsl?revision=815587&view=markup Change <a href="">Return to Admin Page</a> to <a href=".">Return to Admin Page</a>. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1620) log message created null misleading
[ https://issues.apache.org/jira/browse/SOLR-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-1620. - Resolution: Fixed Fix Version/s: 1.5 Committed revision 888282. Thanks KuroSaka! log message created null misleading - Key: SOLR-1620 URL: https://issues.apache.org/jira/browse/SOLR-1620 Project: Solr Issue Type: Bug Affects Versions: 1.4 Reporter: KuroSaka TeruHiko Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 1.5 Original Estimate: 0.5h Remaining Estimate: 0.5h Solr logs a message like this:
{noformat}
INFO: created null: org.apache.solr.analysis.LowerCaseFilterFactory
{noformat}
This sounds like the TokenFilter or Tokenizer was not created and a serious error occurred. But it merely means the component is not named; null is printed because the local variable name has the value null. This is misleading. If the text field type component is not named, it should just print blank, rather than the word null. I would suggest that a line in src/java/org/apache/solr/util/plugin/AbstractPluginLoader.java be changed to:
{noformat}
log.info("created" + ((name != null) ? (" " + name) : "") + ": " + plugin.getClass().getName());
{noformat}
from
{noformat}
log.info("created " + name + ": " + plugin.getClass().getName());
{noformat}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
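The effect of the suggested one-line change can be checked in isolation. A small sketch with the two formats pulled out as helper methods (the method names here are illustrative, not from AbstractPluginLoader):

```java
public class PluginLogMessage {
    // Current format: prints the literal word "null" for unnamed components.
    static String before(String name, String className) {
        return "created " + name + ": " + className;
    }

    // Suggested fix: include the name only when one was given.
    static String after(String name, String className) {
        return "created" + ((name != null) ? (" " + name) : "") + ": " + className;
    }

    public static void main(String[] args) {
        String cls = "org.apache.solr.analysis.LowerCaseFilterFactory";
        System.out.println(before(null, cls)); // the misleading "created null: ..." message
        System.out.println(after(null, cls));  // clean "created: ..." message
        System.out.println(after("text", cls)); // named components still show their name
    }
}
```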