Build failed in Hudson: Solr-trunk #996
See http://hudson.zones.apache.org/hudson/job/Solr-trunk/996/

[...truncated 2229 lines...]
    [junit] Running org.apache.solr.client.solrj.SolrExceptionTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.516 sec
    [junit] Running org.apache.solr.client.solrj.SolrQueryTest
    [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.395 sec
    [junit] Running org.apache.solr.client.solrj.TestBatchUpdate
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 21.814 sec
    [junit] Running org.apache.solr.client.solrj.TestLBHttpSolrServer
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 15.991 sec
    [junit] Running org.apache.solr.client.solrj.beans.TestDocumentObjectBinder
    [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.694 sec
    [junit] Running org.apache.solr.client.solrj.embedded.JettyWebappTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 9.206 sec
    [junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeBinaryJettyTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 10.656 sec
    [junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeEmbeddedTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 4.085 sec
    [junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeJettyTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 11.553 sec
    [junit] Running org.apache.solr.client.solrj.embedded.MergeIndexesEmbeddedTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 7.447 sec
    [junit] Running org.apache.solr.client.solrj.embedded.MultiCoreEmbeddedTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 4.03 sec
    [junit] Running org.apache.solr.client.solrj.embedded.MultiCoreExampleJettyTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 6.146 sec
    [junit] Running org.apache.solr.client.solrj.embedded.SolrExampleEmbeddedTest
    [junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 21.038 sec
    [junit] Running org.apache.solr.client.solrj.embedded.SolrExampleJettyTest
    [junit] Tests run: 10, Failures: 0, Errors: 0, Time elapsed: 38.007 sec
    [junit] Running org.apache.solr.client.solrj.embedded.SolrExampleStreamingTest
    [junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 35.399 sec
    [junit] Running org.apache.solr.client.solrj.embedded.TestSolrProperties
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 8.237 sec
    [junit] Running org.apache.solr.client.solrj.request.TestUpdateRequestCodec
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.417 sec
    [junit] Running org.apache.solr.client.solrj.response.AnlysisResponseBaseTest
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.484 sec
    [junit] Running org.apache.solr.client.solrj.response.DocumentAnalysisResponseTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.392 sec
    [junit] Running org.apache.solr.client.solrj.response.FieldAnalysisResponseTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.358 sec
    [junit] Running org.apache.solr.client.solrj.response.QueryResponseTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.58 sec
    [junit] Running org.apache.solr.client.solrj.response.TestSpellCheckResponse
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 8.16 sec
    [junit] Running org.apache.solr.client.solrj.util.ClientUtilsTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.506 sec
    [junit] Running org.apache.solr.common.SolrDocumentTest
    [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.458 sec
    [junit] Running org.apache.solr.common.params.ModifiableSolrParamsTest
    [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.481 sec
    [junit] Running org.apache.solr.common.params.SolrParamTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.447 sec
    [junit] Running org.apache.solr.common.util.ContentStreamTest
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.554 sec
    [junit] Running org.apache.solr.common.util.DOMUtilTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.467 sec
    [junit] Running org.apache.solr.common.util.FileUtilsTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.61 sec
    [junit] Running org.apache.solr.common.util.IteratorChainTest
    [junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 0.405 sec
    [junit] Running org.apache.solr.common.util.NamedListTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.597 sec
    [junit] Running org.apache.solr.common.util.TestFastInputStream
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.392 sec
    [junit] Running org.apache.solr.common.util.TestHash
    [junit] Tests run: 2, Failures: 0, Errors: 0,
Re: minor nit....
I've just updated to the latest 1.4 head and found that my log is only showing lines like:

INFO: {add=[x0, x1, x2, x3, x4, x5, x6, x7, ... (8 added)]} 0 13627

Debugging LogUpdateProcessorFactory, adds.size() is always 8, while numAdds is a number in the hundreds, depending on the added list of documents. I suspect that numAdds is the number I'm looking for in my logs. Could we change LogUpdateProcessorFactory to:

@@ -162,10 +162,10 @@
     // if id lists were truncated, show how many more there were
     if (adds != null && numAdds > maxNumToLog) {
-      adds.add("... (" + adds.size() + " added)");
+      adds.add("... (" + numAdds + " added)");
     }
     if (deletes != null && numDeletes > maxNumToLog) {
-      deletes.add("... (" + deletes.size() + " removed)");
+      deletes.add("... (" + numDeletes + " removed)");
     }
     long elapsed = rsp.getEndTime() - req.getStartTime();
     log.info("" + toLog + " 0 " + (elapsed));

Thanks,
Thijs

On 22-10-2009 5:59, Yonik Seeley wrote:
> On Wed, Oct 21, 2009 at 11:44 PM, Ryan McKinley <ryan...@gmail.com> wrote:
>> I'm looking through a bunch of logs that have:
>> UpdateRequestProcessor - {add=[aa, bb, cc, dd, ee, ff, gg, hh, ... (142 more)]}
>> Would it be more reasonable to say: "150 total" rather than make you count the previous 8?
>
> Yep, I did reconsider that at some point... just never got to the threshold of doing something about it :-)
>
> -Yonik
> http://www.lucidimagination.com
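The mismatch Thijs describes comes from truncation: the id list only ever holds the first maxNumToLog entries, while the running counter keeps counting. A minimal, self-contained sketch of that behavior (hypothetical class and method names, not the real LogUpdateProcessorFactory):

```java
import java.util.ArrayList;
import java.util.List;

public class TruncationSketch {
    static final int MAX_NUM_TO_LOG = 8;

    // Mimics the buggy report: the id list is capped, so its size is
    // at most MAX_NUM_TO_LOG no matter how many documents were added.
    static int loggedCountBuggy(int numAdds) {
        List<String> adds = new ArrayList<>();
        for (int i = 0; i < numAdds && adds.size() < MAX_NUM_TO_LOG; i++) {
            adds.add("x" + i);  // ids beyond the cap are never stored
        }
        return adds.size();     // what the log reported before the fix
    }

    // Mimics the fixed report: the running counter keeps the true total,
    // which is what the proposed patch logs instead of adds.size().
    static int loggedCountFixed(int numAdds) {
        return numAdds;
    }

    public static void main(String[] args) {
        System.out.println(loggedCountBuggy(142));  // capped at 8
        System.out.println(loggedCountFixed(142));  // true total: 142
    }
}
```

For fewer than maxNumToLog documents the two reports agree, which is why the bug only shows up on larger batches.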
[jira] Created: (SOLR-1627) Variableresolver should be fetched just in time
Variableresolver should be fetched just in time
-----------------------------------------------

                 Key: SOLR-1627
                 URL: https://issues.apache.org/jira/browse/SOLR-1627
             Project: Solr
          Issue Type: Improvement
          Components: contrib - DataImportHandler
            Reporter: Noble Paul
            Priority: Minor
             Fix For: 1.5


The VariableResolver instance may vary from time to time after SOLR-1352, so get it just in time. For most cases, use Context#resolve() and Context#replaceTokens().

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
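The point of the issue is that a component should not hold on to the VariableResolver it saw up front, because the resolver can be replaced during an import (SOLR-1352); it should be fetched from the Context at each use. A simplified sketch of the difference, using stand-in interfaces rather than the real DataImportHandler API (only the Context#resolve()/replaceTokens() names come from the issue text):

```java
public class JitResolverSketch {
    // Stand-in for DIH's VariableResolver.
    interface VariableResolver { String resolve(String name); }

    // Stand-in for DIH's Context; the real one exposes resolve()/replaceTokens().
    static class Context {
        private VariableResolver current;
        void setResolver(VariableResolver r) { current = r; }
        VariableResolver getResolver()       { return current; }
        // Just-in-time: the resolver is looked up on every call.
        String resolve(String name)          { return current.resolve(name); }
    }

    // Returns {stale value from a cached resolver, fresh value via the context}.
    static String[] demo() {
        Context ctx = new Context();
        ctx.setResolver(n -> "old-" + n);
        VariableResolver cached = ctx.getResolver(); // fetched once, up front
        ctx.setResolver(n -> "new-" + n);            // resolver replaced mid-import
        return new String[] { cached.resolve("x"), ctx.resolve("x") };
    }

    public static void main(String[] args) {
        String[] r = demo();
        System.out.println(r[0] + " vs " + r[1]);  // old-x vs new-x
    }
}
```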
[jira] Updated: (SOLR-1627) Variableresolver should be fetched just in time
     [ https://issues.apache.org/jira/browse/SOLR-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul updated SOLR-1627:
-----------------------------
    Attachment: SOLR-1627.patch

> Variableresolver should be fetched just in time
> -----------------------------------------------
>
>                 Key: SOLR-1627
>                 URL: https://issues.apache.org/jira/browse/SOLR-1627
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Minor
>             Fix For: 1.5
>         Attachments: SOLR-1627.patch
>
> The VariableResolver instance may vary from time to time after SOLR-1352, so get it just in time. For most cases, use Context#resolve() and Context#replaceTokens().

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Assigned: (SOLR-1627) Variableresolver should be fetched just in time
     [ https://issues.apache.org/jira/browse/SOLR-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul reassigned SOLR-1627:
--------------------------------
    Assignee: Noble Paul

> Variableresolver should be fetched just in time
> -----------------------------------------------
>
>                 Key: SOLR-1627
>                 URL: https://issues.apache.org/jira/browse/SOLR-1627
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Minor
>             Fix For: 1.5
>         Attachments: SOLR-1627.patch
>
> The VariableResolver instance may vary from time to time after SOLR-1352, so get it just in time. For most cases, use Context#resolve() and Context#replaceTokens().

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Re: [jira] Issue Comment Edited: (SOLR-236) Field collapsing
Hey there,

I have been testing the last patch, and I think either I am missing something or the way the collapsed documents are shown when collapsing is adjacent can sometimes be confusing. I am using the patch replacing queryComponent with collapseComponent (not using both at the same time):

<searchComponent name="query" class="org.apache.solr.handler.component.CollapseComponent"/>

What I have noticed is: imagine you get these results in the search:

doc1: id:001 collapseField:ccc
doc2: id:002 collapseField:aaa
doc3: id:003 collapseField:ccc
doc4: id:004 collapseField:bbb

And in the collapse_counts you get:

<int name="collapseCount">1</int>
<str name="fieldValue">ccc</str>
<result name="collapsedDocs" numFound="1" start="0">
  <doc>
    <long name="id">008</long>
    <str name="content">aaa aaa</str>
    <str name="col">ccc</str>
  </doc>
</result>

Now, how can I know the head document of doc 008? Both 001 and 003 could be it... wouldn't it make sense to connect the uniqueField with the collapsed documents in some way? Adding something to collapse_counts like:

<int name="collapseCount">1</int>
<str name="fieldValue">ccc</str>
<str name="uniqueFieldId">003</str>

I currently have hacked FieldValueCountCollapseCollectorFactory to return:

<str name="fieldValue">ccc#003</str>

but this response looks dirty... As I said, maybe I am misunderstanding something and this can already be known in some way. In that case, can someone tell me how?

Thanks in advance

JIRA j...@apache.org wrote:
> [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783484#action_12783484 ]
>
> Martijn van Groningen edited comment on SOLR-236 at 11/29/09 9:56 PM:
> ----------------------------------------------------------------------
>
> I have attached a new patch that has the following changes:
> # Added caching for the field collapse functionality. Check the [solr wiki|http://wiki.apache.org/solr/FieldCollapsing] for how to configure field-collapsing with caching.
> # Removed the collapse.max parameter (collapse.threshold must be used instead). It was deprecated for a long time.
>
> was (Author: martijn):
> I have attached a new patch that has the following changes:
> # Added caching for the field collapse functionality. Check the [solr wiki|http://wiki.apache.org/solr/FieldCollapsing] for how to configure the field-collapsing with caching.
> # Removed the collapse.max parameter (collapse.threshold must be used instead). It was deprecated for a long time.
>
> Field collapsing
> ----------------
>
>                 Key: SOLR-236
>                 URL: https://issues.apache.org/jira/browse/SOLR-236
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Emmanuel Keller
>             Fix For: 1.5
>         Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch
>
> This patch includes a new feature called "Field collapsing". It is used in order to collapse a group of results with a similar value for a given field to a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site are collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also "Duplicate detection":
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
>
> The implementation adds 3 new query parameters (SolrParams):
> - "collapse.field" to choose the field used to group results
> - "collapse.type": "normal" (default value) or "adjacent"
> - "collapse.max" to select how many continuous results are allowed before collapsing
>
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
>
> Two patches:
> - field_collapsing.patch for the current development version
> - field_collapsing_1.1.0.patch for Solr-1.1.0
>
> P.S.: Feedback and misspelling corrections are welcome ;-)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
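The ambiguity Marc raises above comes from the difference between the two collapse.type modes: with adjacent collapsing, the same field value can head several groups, so a group labeled only by field value does not identify its head. A rough, self-contained sketch of the grouping rule (my own simplification, not the patch's actual code), where each string is a document's collapse-field value and the returned list holds the group heads:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CollapseSketch {
    // adjacent == true : only consecutive hits with the same value collapse.
    // adjacent == false: all hits with the same value collapse into the first.
    static List<String> collapse(List<String> values, boolean adjacent) {
        List<String> heads = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        String prev = null;
        for (String v : values) {
            // A hit survives as a group head if it differs from its neighbor
            // (adjacent mode) or if its value has never been seen (normal mode).
            boolean keep = adjacent ? !v.equals(prev) : seen.add(v);
            if (keep) heads.add(v);
            prev = v;
        }
        return heads;
    }

    public static void main(String[] args) {
        List<String> hits = List.of("ccc", "aaa", "ccc", "bbb");
        System.out.println(collapse(hits, true));   // adjacent: all four survive
        System.out.println(collapse(hits, false));  // normal: second "ccc" collapses
    }
}
```

With Marc's example (ccc, aaa, ccc, bbb), adjacent collapsing leaves both "ccc" documents as heads, which is exactly why a collapsed document reported only under the value "ccc" is ambiguous.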
[jira] Updated: (SOLR-1627) Variableresolver should be fetched just in time
     [ https://issues.apache.org/jira/browse/SOLR-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul updated SOLR-1627:
-----------------------------
    Attachment: SOLR-1627.patch

> Variableresolver should be fetched just in time
> -----------------------------------------------
>
>                 Key: SOLR-1627
>                 URL: https://issues.apache.org/jira/browse/SOLR-1627
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Minor
>             Fix For: 1.5
>         Attachments: SOLR-1627.patch, SOLR-1627.patch
>
> The VariableResolver instance may vary from time to time after SOLR-1352, so get it just in time. For most cases, use Context#resolve() and Context#replaceTokens().

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Re: [jira] Issue Comment Edited: (SOLR-236) Field collapsing
Hi Marc,

I'm not sure if I follow you completely, but the example you gave is not complete; I'm missing a few tags in it. Let's assume the following response, which the latest patches produce:

<lst name="collapse_counts">
  <str name="field">cat</str>
  <lst name="results">
    <lst name="009">
      <str name="fieldValue">hard</str>
      <int name="collapseCount">1</int>
      <result name="collapsedDocs" numFound="1" start="0">
        <doc>
          <long name="id">008</long>
          <str name="content">aaa aaa</str>
          <str name="col">ccc</str>
        </doc>
      </result>
    </lst>
    ...
  </lst>
</lst>

The results list contains collapse groups. The names of the child elements are the collapse head ids. Everything that falls under a collapse head belongs to that collapse group, so adding the head document id to the field value is unnecessary. In the above example, the document with id 009 is the head of the document with id 008, and the document with id 009 should be displayed in the search result. From what you have said, it seems that you properly configured the patch.

Martijn

2009/12/7 Marc Sturlese <marc.sturl...@gmail.com>:
> Hey there,
> I have been testing the last patch, and I think either I am missing something or the way the collapsed documents are shown when collapsing is adjacent can sometimes be confusing. I am using the patch replacing queryComponent with collapseComponent (not using both at the same time):
>
> <searchComponent name="query" class="org.apache.solr.handler.component.CollapseComponent"/>
>
> What I have noticed is: imagine you get these results in the search:
>
> doc1: id:001 collapseField:ccc
> doc2: id:002 collapseField:aaa
> doc3: id:003 collapseField:ccc
> doc4: id:004 collapseField:bbb
>
> And in the collapse_counts you get:
>
> <int name="collapseCount">1</int>
> <str name="fieldValue">ccc</str>
> <result name="collapsedDocs" numFound="1" start="0">
>   <doc>
>     <long name="id">008</long>
>     <str name="content">aaa aaa</str>
>     <str name="col">ccc</str>
>   </doc>
> </result>
>
> Now, how can I know the head document of doc 008? Both 001 and 003 could be it... wouldn't it make sense to connect the uniqueField with the collapsed documents in some way? Adding something to collapse_counts like:
>
> <int name="collapseCount">1</int>
> <str name="fieldValue">ccc</str>
> <str name="uniqueFieldId">003</str>
>
> I currently have hacked FieldValueCountCollapseCollectorFactory to return:
>
> <str name="fieldValue">ccc#003</str>
>
> but this response looks dirty... As I said, maybe I am misunderstanding something and this can already be known in some way. In that case, can someone tell me how?
> Thanks in advance
>
> JIRA j...@apache.org wrote:
>> [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783484#action_12783484 ]
>>
>> Martijn van Groningen edited comment on SOLR-236 at 11/29/09 9:56 PM:
>> ----------------------------------------------------------------------
>>
>> I have attached a new patch that has the following changes:
>> # Added caching for the field collapse functionality. Check the [solr wiki|http://wiki.apache.org/solr/FieldCollapsing] for how to configure field-collapsing with caching.
>> # Removed the collapse.max parameter (collapse.threshold must be used instead). It was deprecated for a long time.
>>
>> was (Author: martijn):
>> I have attached a new patch that has the following changes:
>> # Added caching for the field collapse functionality. Check the [solr wiki|http://wiki.apache.org/solr/FieldCollapsing] for how to configure the field-collapsing with caching.
>> # Removed the collapse.max parameter (collapse.threshold must be used instead). It was deprecated for a long time.
>>
>> Field collapsing
>> ----------------
>>
>>                 Key: SOLR-236
>>                 URL: https://issues.apache.org/jira/browse/SOLR-236
>>             Project: Solr
>>          Issue Type: New Feature
>>          Components: search
>>    Affects Versions: 1.3
>>            Reporter: Emmanuel Keller
>>             Fix For: 1.5
>>         Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch,
[jira] Created: (SOLR-1628) log contains incorrect number of adds and deletes
log contains incorrect number of adds and deletes
-------------------------------------------------

                 Key: SOLR-1628
                 URL: https://issues.apache.org/jira/browse/SOLR-1628
             Project: Solr
          Issue Type: Bug
    Affects Versions: 1.4
            Reporter: Yonik Seeley
             Fix For: 1.5


LogUpdateProcessorFactory logs the wrong number of deletes/adds if more than 8.
http://search.lucidimagination.com/search/document/f75c6a5a58e205a4/minor_nit

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Re: [jira] Issue Comment Edited: (SOLR-236) Field collapsing
Yes, it should look similar to that. What is the exact request you send to Solr? Also, to check whether the patch works correctly, can you run: ant clean test? There are a number of tests that test the field collapse functionality.

Martijn

2009/12/7 Marc Sturlese <marc.sturl...@gmail.com>:
>> <lst name="collapse_counts">
>>   <str name="field">cat</str>
>>   <lst name="results">
>>     <lst name="009">
>>       <str name="fieldValue">hard</str>
>>       <int name="collapseCount">1</int>
>>       <result name="collapsedDocs" numFound="1" start="0">
>>         <doc>
>>           <long name="id">008</long>
>>           <str name="content">aaa aaa</str>
>>           <str name="col">ccc</str>
>>         </doc>
>>       </result>
>>     </lst>
>>     ...
>>   </lst>
>> </lst>
>
> I see, looks like I am applying the patch wrongly somehow. This is the complete collapse_counts response I am getting:
>
> <lst name="collapse_counts">
>   <str name="field">col</str>
>   <lst name="results">
>     <lst>
>       <int name="collapseCount">1</int>
>       <int name="collapseCount">1</int>
>       <int name="collapseCount">1</int>
>       <str name="fieldValue">bbb</str>
>       <str name="fieldValue">ccc</str>
>       <str name="fieldValue">xxx</str>
>       <result name="collapsedDocs" numFound="1" start="0">
>         <doc>
>           <long name="id">2</long>
>           <str name="content">aaa aaa</str>
>           <str name="col">bbb</str>
>         </doc>
>       </result>
>       <result name="collapsedDocs" numFound="1" start="0">
>         <doc>
>           <long name="id">8</long>
>           <str name="content">aaa aaa aaa sd</str>
>           <str name="col">ccc</str>
>         </doc>
>       </result>
>       <result name="collapsedDocs" numFound="4" start="0">
>         <doc>
>           <long name="id">12</long>
>           <str name="content">aaa aaa aaa v</str>
>           <str name="col">xxx</str>
>         </doc>
>       </result>
>     </lst>
>   </lst>
> </lst>
>
> As you can see, I am getting a lst tag with no name. As I understood what you told me, I should be getting as many lst tags as collapsed groups, and the name attribute of each lst should be the unique field value.
>
> So, if the patch was applied correctly, the response should look like:
>
> <lst name="collapse_counts">
>   <str name="field">col</str>
>   <lst name="results">
>     <lst name="354"> (the head value of the collapsed group)
>       <int name="collapseCount">1</int>
>       <str name="fieldValue">bbb</str>
>       <result name="collapsedDocs" numFound="1" start="0">
>         <doc>
>           <long name="id">2</long>
>           <str name="content">aaa aaa</str>
>           <str name="col">bbb</str>
>         </doc>
>       </result>
>     </lst>
>     <lst name="654">
>       <int name="collapseCount">1</int>
>       <str name="fieldValue">ccc</str>
>       <result name="collapsedDocs" numFound="1" start="0">
>         <doc>
>           <long name="id">8</long>
>           <str name="content">aaa aaa aaa sd</str>
>           <str name="col">ccc</str>
>         </doc>
>       </result>
>     </lst>
>     <lst name="654">
>       <int name="collapseCount">1</int>
>       <str name="fieldValue">xxx</str>
>       <result name="collapsedDocs" numFound="4" start="0">
>         <doc>
>           <long name="id">12</long>
>           <str name="content">aaa aaa aaa v</str>
>           <str name="col">xxx</str>
>         </doc>
>       </result>
>     </lst>
>   </lst>
> </lst>
>
> Is this the way the response looks when you use the patch?
> Thanks in advance
>
> Martijn v Groningen wrote:
>> Hi Marc,
>> I'm not sure if I follow you completely, but the example you gave is not complete; I'm missing a few tags in it. Let's assume the following response, which the latest patches produce:
>>
>> <lst name="collapse_counts">
>>   <str name="field">cat</str>
>>   <lst name="results">
>>     <lst name="009">
>>       <str name="fieldValue">hard</str>
>>       <int name="collapseCount">1</int>
>>       <result name="collapsedDocs" numFound="1" start="0">
>>         <doc>
>>           <long name="id">008</long>
>>           <str name="content">aaa aaa</str>
>>           <str name="col">ccc</str>
>>         </doc>
>>       </result>
>>     </lst>
>>     ...
>>   </lst>
>> </lst>
>>
>> The results list contains collapse groups. The names of the child elements are the collapse head ids. Everything that falls under a collapse head belongs to that collapse group, so adding the head document id to the field value is unnecessary. In the above example, the document with id 009 is the head of the document with id 008, and the document with id 009 should be displayed in the search result. From what you have said, it seems that you properly configured the patch.
>>
>> Martijn
>>
>> 2009/12/7 Marc Sturlese <marc.sturl...@gmail.com>:
>>> Hey there,
>>> I have been testing the last patch, and I think either I am missing something or the way the collapsed documents are shown when collapsing is adjacent can sometimes be confusing. I am using the patch replacing queryComponent with collapseComponent (not using both at the same time):
>>>
>>> <searchComponent name="query" class="org.apache.solr.handler.component.CollapseComponent"/>
>>>
>>> What I have noticed is: imagine you get these results in the search:
>>>
>>> doc1: id:001 collapseField:ccc
>>> doc2: id:002 collapseField:aaa
>>> doc3: id:003 collapseField:ccc
>>> doc4: id:004 collapseField:bbb
>>>
>>> And in the collapse_counts you get:
>>>
>>> <int name="collapseCount">1</int>
[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-1131:
----------------------------------
    Attachment: SOLR-1131.patch

OK, here's my take on this. I took Yonik's patch and merged it with a patch I had in the works. It's not done, but all tests pass, including the new one I added (PolyFieldTest). Yonik's move to put getFieldQuery in FieldType was just the key to answering the question of how to generate queries given a FieldType.

Notes:
1. I changed the Geo examples to be CoordinateFieldType (representing an abstract coordinate system) and then PointFieldType, which represents a point in an n-dimensional space (default 2D). I think from this we could easily add things like PolygonFieldType, etc., which would allow us to create more sophisticated shapes and do things like intersections. For instance, imagine asking: does this point lie within this shape? I think that might be expressible as a RangeQuery.
2. I'm not sure I care for the name of the new abstract FieldType that is a base class of CoordinateFieldType, called DelegatingFieldType.
3. I'm not sure yet on the properties of the generated fields. Right now I'm delegating the handling to the sub-FieldType, except I'm overriding to turn off storage, which I think is pretty cool (could even work as copy-field-like functionality).
4. I'm not thrilled about creating a SchemaField every time in the createFields protected helper method, but SchemaField is final and doesn't have a setName method (which makes sense).

Questions for Yonik on his patch:
1. Why is TextField overriding getFieldQuery when it isn't called, except possibly via the FieldQParserPlugin?
2. I'm not sure I understand the getDistance and getBoundingBox methods on the GeoFieldType. It seems like that precludes one from picking a specific distance calculation (for instance, sometimes you may want a faster approximation and other times a slower, exact one).

Needs:
1. Write up changes.txt
2. More tests, including performance testing
3. Patch doesn't support dynamic fields yet, but it should

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
> In a few special cases, it makes sense for a single field (the concept) to be indexed as a set of Fields (lucene Field). Consider SOLR-773. The concept "point" may be best indexed in a variety of ways:
> * geohash (single lucene field)
> * lat field, lon field (two double fields)
> * cartesian tiers (a series of fields with tokens to say if it exists within that region)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
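To make the "one field, many lucene fields" idea concrete: a poly-field type expands a single schema-level value into several underlying indexed fields. A toy sketch under assumed naming (the `_0_d`/`_1_d` suffixes and the `createFields` shape here are illustrative, not the patch's actual API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PointFieldSketch {
    // Splits a "lat,lon" value into the underlying field name/value pairs,
    // the way a poly-field type would emit multiple lucene Fields for one
    // schema field. Names of the generated sub-fields are hypothetical.
    static Map<String, String> createFields(String name, String value) {
        String[] parts = value.split(",");
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put(name + "_0_d", parts[0].trim()); // latitude sub-field
        fields.put(name + "_1_d", parts[1].trim()); // longitude sub-field
        return fields;
    }

    public static void main(String[] args) {
        // One logical "home" point becomes two indexed double fields.
        System.out.println(createFields("home", "44.32,-93.14"));
    }
}
```

A query against the logical field would then be rewritten by the FieldType into queries over the generated sub-fields, which is why moving getFieldQuery onto FieldType matters here.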
[jira] Resolved: (SOLR-1628) log contains incorrect number of adds and deletes
     [ https://issues.apache.org/jira/browse/SOLR-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley resolved SOLR-1628.
--------------------------------
    Resolution: Fixed

committed fix.

> log contains incorrect number of adds and deletes
> -------------------------------------------------
>
>                 Key: SOLR-1628
>                 URL: https://issues.apache.org/jira/browse/SOLR-1628
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 1.4
>            Reporter: Yonik Seeley
>             Fix For: 1.5
>
> LogUpdateProcessorFactory logs the wrong number of deletes/adds if more than 8.
> http://search.lucidimagination.com/search/document/f75c6a5a58e205a4/minor_nit

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786951#action_12786951 ]

Chris A. Mattmann commented on SOLR-1131:
-----------------------------------------

Patch is looking good! I'm poring through it right now -- I'll try and test this as part of the work I'm doing on SOLR-1586 -- maybe even update that issue if I get a sec today :)

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
> In a few special cases, it makes sense for a single field (the concept) to be indexed as a set of Fields (lucene Field). Consider SOLR-773. The concept "point" may be best indexed in a variety of ways:
> * geohash (single lucene field)
> * lat field, lon field (two double fields)
> * cartesian tiers (a series of fields with tokens to say if it exists within that region)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Re: [jira] Issue Comment Edited: (SOLR-236) Field collapsing
The request I am sending is:

http://localhost:8983/solr/select/?q=aaa&version=2.2&start=0&rows=20&indent=on&collapse.field=col&collapse.includeCollapsedDocs.fl=*&collapse.type=adjacent&collapse.info.doc=true&collapse.info.count=true

I search for 'aaa' in the content field. All the documents in the result contain that string in the content field.

Martijn v Groningen wrote:
> Yes, it should look similar to that. What is the exact request you send to Solr? Also, to check whether the patch works correctly, can you run: ant clean test? There are a number of tests that test the field collapse functionality.
>
> Martijn
>
> 2009/12/7 Marc Sturlese <marc.sturl...@gmail.com>:
>>> <lst name="collapse_counts">
>>>   <str name="field">cat</str>
>>>   <lst name="results">
>>>     <lst name="009">
>>>       <str name="fieldValue">hard</str>
>>>       <int name="collapseCount">1</int>
>>>       <result name="collapsedDocs" numFound="1" start="0">
>>>         <doc>
>>>           <long name="id">008</long>
>>>           <str name="content">aaa aaa</str>
>>>           <str name="col">ccc</str>
>>>         </doc>
>>>       </result>
>>>     </lst>
>>>     ...
>>>   </lst>
>>> </lst>
>>
>> I see, looks like I am applying the patch wrongly somehow. This is the complete collapse_counts response I am getting:
>>
>> <lst name="collapse_counts">
>>   <str name="field">col</str>
>>   <lst name="results">
>>     <lst>
>>       <int name="collapseCount">1</int>
>>       <int name="collapseCount">1</int>
>>       <int name="collapseCount">1</int>
>>       <str name="fieldValue">bbb</str>
>>       <str name="fieldValue">ccc</str>
>>       <str name="fieldValue">xxx</str>
>>       <result name="collapsedDocs" numFound="1" start="0">
>>         <doc>
>>           <long name="id">2</long>
>>           <str name="content">aaa aaa</str>
>>           <str name="col">bbb</str>
>>         </doc>
>>       </result>
>>       <result name="collapsedDocs" numFound="1" start="0">
>>         <doc>
>>           <long name="id">8</long>
>>           <str name="content">aaa aaa aaa sd</str>
>>           <str name="col">ccc</str>
>>         </doc>
>>       </result>
>>       <result name="collapsedDocs" numFound="4" start="0">
>>         <doc>
>>           <long name="id">12</long>
>>           <str name="content">aaa aaa aaa v</str>
>>           <str name="col">xxx</str>
>>         </doc>
>>       </result>
>>     </lst>
>>   </lst>
>> </lst>
>>
>> As you can see, I am getting a lst tag with no name. As I understood what you told me, I should be getting as many lst tags as collapsed groups, and the name attribute of each lst should be the unique field value.
>>
>> So, if the patch was applied correctly, the response should look like:
>>
>> <lst name="collapse_counts">
>>   <str name="field">col</str>
>>   <lst name="results">
>>     <lst name="354"> (the head value of the collapsed group)
>>       <int name="collapseCount">1</int>
>>       <str name="fieldValue">bbb</str>
>>       <result name="collapsedDocs" numFound="1" start="0">
>>         <doc>
>>           <long name="id">2</long>
>>           <str name="content">aaa aaa</str>
>>           <str name="col">bbb</str>
>>         </doc>
>>       </result>
>>     </lst>
>>     <lst name="654">
>>       <int name="collapseCount">1</int>
>>       <str name="fieldValue">ccc</str>
>>       <result name="collapsedDocs" numFound="1" start="0">
>>         <doc>
>>           <long name="id">8</long>
>>           <str name="content">aaa aaa aaa sd</str>
>>           <str name="col">ccc</str>
>>         </doc>
>>       </result>
>>     </lst>
>>     <lst name="654">
>>       <int name="collapseCount">1</int>
>>       <str name="fieldValue">xxx</str>
>>       <result name="collapsedDocs" numFound="4" start="0">
>>         <doc>
>>           <long name="id">12</long>
>>           <str name="content">aaa aaa aaa v</str>
>>           <str name="col">xxx</str>
>>         </doc>
>>       </result>
>>     </lst>
>>   </lst>
>> </lst>
>>
>> Is this the way the response looks when you use the patch?
>> Thanks in advance
>>
>> Martijn v Groningen wrote:
>>> Hi Marc,
>>> I'm not sure if I follow you completely, but the example you gave is not complete; I'm missing a few tags in it. Let's assume the following response, which the latest patches produce:
>>>
>>> <lst name="collapse_counts">
>>>   <str name="field">cat</str>
>>>   <lst name="results">
>>>     <lst name="009">
>>>       <str name="fieldValue">hard</str>
>>>       <int name="collapseCount">1</int>
>>>       <result name="collapsedDocs" numFound="1" start="0">
>>>         <doc>
>>>           <long name="id">008</long>
>>>           <str name="content">aaa aaa</str>
>>>           <str name="col">ccc</str>
>>>         </doc>
>>>       </result>
>>>     </lst>
>>>     ...
>>>   </lst>
>>> </lst>
>>>
>>> The results list contains collapse groups. The names of the child elements are the collapse head ids. Everything that falls under a collapse head belongs to that collapse group, so adding the head document id to the field value is unnecessary. In the above example, the document with id 009 is the head of the document with id 008, and the document with id 009 should be displayed in the search result. From what you have said, it seems that you properly configured the patch.
>>>
>>> Martijn
>>>
>>> 2009/12/7 Marc Sturlese <marc.sturl...@gmail.com>:
>>>> Hey there,
>>>> I have been testing the last patch, and I think either I am missing something or the way the collapsed documents are shown when collapsing is adjacent can sometimes be confusing. I am using the patch replacing queryComponent with collapseComponent (not using both at same
Re: [jira] Issue Comment Edited: (SOLR-236) Field collapsing
The last two parameters are not necessary, since they both default to true. Could you run the field collapse tests successfully?
[jira] Updated: (SOLR-1621) Allow current single core deployments to be specified by solr.xml
[ https://issues.apache.org/jira/browse/SOLR-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-1621: - Attachment: SOLR-1621.patch the index pages are fixed Allow current single core deployments to be specified by solr.xml - Key: SOLR-1621 URL: https://issues.apache.org/jira/browse/SOLR-1621 Project: Solr Issue Type: New Feature Affects Versions: 1.5 Reporter: Noble Paul Fix For: 1.5 Attachments: SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch supporting two different modes of deployments is turning out to be hard. This leads to duplication of code. Moreover there is a lot of confusion on where do we put common configuration. See the mail thread http://markmail.org/message/3m3rqvp2ckausjnf -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1553) extended dismax query parser
[ https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787021#action_12787021 ] Hoss Man commented on SOLR-1553:

Thoughts while reading the code...
* the code is kind of hard to read ... there's a serious dearth of comments
* reads very kludgy, clearly a hacked-up version of DisMax ... probably want to refactor some helper functions (that can then be documented)
* the clause.field and getFieldName functionality is dangerous for people migrating from dismax to edismax (users guessing field names can query on fields the solr admin doesn't want them to query on) ... we need an option to turn that off.
** one really nice thing about the field query support though: it looks like it would be really easy to add support for arbitrary field name aliasing with something like f.someFieldAlias.qf=realFieldA^3+realFieldB^4
** perhaps getFieldName should only work for fields explicitly enumerated in a param?
* why is TO listed as an operator when building up the phrase boost fields? (line 296) ... if range queries are supported, then shouldn't the upper/lower bounds also be stripped out of the clauses list?
** accepting range queries also seems like something that people should be able to disable
* apparently pf was changed to iteratively build boosting phrase queries for every 'pair' of words, and pf3 is a new param to build boosting phrase queries for every 'triple' of words in the input. While this certainly seems useful, it's not back-compatible ... why not restore 'pf' to its original purpose, and add pf2 for the pairs?
* what is the motivation for ExtendedSolrQueryParser.makeDismax? ... I see that the boost queries built from the pf and pf3 fields are put in BooleanQueries instead of DisjunctionMaxQueries ... but why? (if the user searches for a phrase that's common in many fields of one document, that document is going to get a huge score boost regardless of the tie value, which kind of defeats the point of what the dismax parser is trying to do)
* we should remove the extremely legacy /* legacy logic */ for dealing with bq ... almost no one should care about that, we really don't need to carry it forward in a new parser.
* there are a lot of empty catch blocks that seem like they should at least log a warning or debug message.
* ExtendedAnalyzer feels like a really big hack ... I'm not certain, but I don't think it works correctly if a CharFilter is declared.
* we need to document all these new params (pf3, lowercaseOperators, boost,

Thoughts while testing it out on some really hairy edge cases that break the old dismax parser...
* this is really cool
* this is really freaking cool.
* still has a problem with search strings like foo and foo || ... I suspect it would be an easy fix to recognize these just like AND/OR are recognized and escaped.
* once we fix some of the issues mentioned above, we should absolutely register this using the name dismax by default, and register the old one as oldDismax with a note in CHANGES.txt telling people to use defType=oldDismax if they really need it.

extended dismax query parser Key: SOLR-1553 URL: https://issues.apache.org/jira/browse/SOLR-1553 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley Fix For: 1.5 Attachments: SOLR-1553.patch An improved user-facing query parser based on dismax -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
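For readers following the pf/pf2/pf3 discussion above: the 'pair' and 'triple' boost phrases amount to sliding a fixed-size window over the query terms. A small illustrative sketch of that windowing idea (a hypothetical helper, not the parser's actual code):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the word-window idea behind pf2/pf3: for n query terms,
// emit every run of `size` adjacent terms as a candidate boost phrase.
public class PhraseWindows {
    static List<String> windows(String[] terms, int size) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + size <= terms.length; i++) {
            StringBuilder sb = new StringBuilder();
            for (int j = 0; j < size; j++) {
                if (j > 0) sb.append(' ');
                sb.append(terms[i + j]);
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String[] q = {"wi", "fi", "router"};
        System.out.println(windows(q, 2)); // pairs, the pf2 idea
        System.out.println(windows(q, 3)); // triples, the pf3 idea
    }
}
```

Each window would then be turned into a phrase query against the configured boost fields; the comment's back-compatibility concern is only about which parameter name carries the pair windows.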
[jira] Commented: (SOLR-1607) use a proper key other than IndexReader for ExternalFileField and QueryElevationComponent to work properly when reopenReaders is set to true
[ https://issues.apache.org/jira/browse/SOLR-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787024#action_12787024 ] Hoss Man commented on SOLR-1607: I'm not too familiar with the internals of ExternalFileField and QueryElevationComponent, but if the caching is already broken because of reopen, then now is probably a good time to try to gut their one-off caches and replace them with uses of SolrCache -- that way we can have regenerators for them to autowarm on newSearcher. (ExternalFileField would probably be pretty hard to make work like this, however, because of the way schema.xml resources are isolated from the SolrCore) use a proper key other than IndexReader for ExternalFileField and QueryElevationComponent to work properly when reopenReaders is set to true Key: SOLR-1607 URL: https://issues.apache.org/jira/browse/SOLR-1607 Project: Solr Issue Type: Bug Components: search Affects Versions: 1.4 Reporter: Koji Sekiguchi Assignee: Koji Sekiguchi Priority: Minor Fix For: 1.5 As the reopenReaders feature was introduced in 1.4, it prevents reloading of the external_[fieldname] and elevate.xml files in dataDir when a commit is submitted. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
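For context on the SolrCache suggestion above: user caches declared in solrconfig.xml can name a regenerator class that is used to autowarm entries on newSearcher. A sketch of such a declaration (the cache name and regenerator class here are hypothetical illustrations, not existing Solr classes):

```xml
<!-- Hypothetical user cache for elevation data; the regenerator class
     is an illustration of the autowarming hook, not a shipped class. -->
<cache name="elevationCache"
       class="solr.LRUCache"
       size="512"
       initialSize="128"
       autowarmCount="128"
       regenerator="com.example.ElevationRegenerator"/>
```

The point of the comment is that moving the one-off caches onto this mechanism would make their warm-up behavior consistent with Solr's other caches after a reopen.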
Re: [jira] Issue Comment Edited: (SOLR-236) Field collapsing
Yes, I can reproduce the same situation here. I will update the patch asap and add it to Jira. Martijn

2009/12/7 Marc Sturlese marc.sturl...@gmail.com: Hey! Got it working! The problem was that my uniqueField is indexed as long, and that is not supported by the patch. The value is obtained in the getCollapseGroupResult function in AbstractCollapseCollector.java as:

    String schemaId = searcher.doc(docId).get(uniqueIdFieldname);

To support long, int, slong, sint, float, sfloat, ... it should be obtained by doing something like:

    FieldType idFieldType = searcher.getSchema().getFieldType(uniqueIdFieldname);
    String schemaId = "";
    Fieldable name_field = null;
    try {
        name_field = searcher.doc(docId).getFieldable(uniqueIdFieldname);
    } catch (IOException ex) {
        // deal with exception
    }
    if (name_field != null) {
        schemaId = idFieldType.storedToReadable(name_field);
    }
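The essence of Marc's fix above is that a numeric unique key's stored form is not necessarily its readable form, so decoding must be delegated to the field's type (which is what FieldType.storedToReadable does in Solr). A toy, self-contained model of that delegation; the types below are stand-ins for illustration, not Solr's classes:

```java
// Toy model of per-type stored-to-readable conversion. Solr's sortable
// numeric types keep an encoded stored form; asking the field's type to
// decode it is the delegation Marc's fix introduces.
interface ToyFieldType {
    String storedToReadable(String stored);
}

public class UniqueKeyDemo {
    // A plain string field stores the readable form directly.
    static final ToyFieldType STRING = stored -> stored;
    // A toy "encoded long" type that stores values behind a marker prefix.
    static final ToyFieldType ENCODED_LONG = stored -> stored.substring(2);

    static String readableId(ToyFieldType type, String storedValue) {
        // Always go through the type, never assume stored == readable.
        return type.storedToReadable(storedValue);
    }

    public static void main(String[] args) {
        System.out.println(readableId(STRING, "doc-42"));
        System.out.println(readableId(ENCODED_LONG, "L:12"));
    }
}
```

Reading the stored value directly, as the original line did, only happens to work when the unique key is a plain string type.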
[jira] Commented: (SOLR-1621) Allow current single core deployments to be specified by solr.xml
[ https://issues.apache.org/jira/browse/SOLR-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787031#action_12787031 ] Mark Miller commented on SOLR-1621: --- bq. the index pages are fixed
How are they fixed? I think one problem with how you are handling aliases is that it only works with aliases defined in solr.xml? If you use the request handler to create an alias (with the alias command), I still don't think that works properly. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1624) Highlighter bug with MultiValued field + TermPositions optimization
[ https://issues.apache.org/jira/browse/SOLR-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley resolved SOLR-1624. Resolution: Fixed Fix Version/s: 1.5 Committed. Thanks Chris! Highlighter bug with MultiValued field + TermPositions optimization --- Key: SOLR-1624 URL: https://issues.apache.org/jira/browse/SOLR-1624 Project: Solr Issue Type: Bug Components: highlighter Affects Versions: 1.4 Reporter: Chris Harris Fix For: 1.5 Attachments: SOLR-1624.patch When TermPositions are stored, then DefaultSolrHighlighter.doHighlighting(DocList docs, Query query, SolrQueryRequest req, String[] defaultFields) currently initializes tstream only for the first value of a multi-valued field. (Subsequent times through the loop reinitialization is preempted by tots being non-null.) This means that the 2nd/3rd/etc. values are not considered for highlighting purposes, resulting in missed highlights. I'm attaching a patch with a test case to demonstrate the problem (testTermVecMultiValuedHighlight2), as well as a proposed fix. All highlighter tests pass with this applied. The patch should apply cleanly against the latest trunk. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
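Schematically, the bug described above is a reuse guard keyed on state that survives the loop: because tots stays non-null after the first value, the token stream is never rebuilt for later values of a multi-valued field. A simplified, self-contained model of the broken loop versus the fix (names are illustrative, not the actual DefaultSolrHighlighter code):

```java
import java.util.ArrayList;
import java.util.List;

public class MultiValueGuardDemo {
    // Broken: `cached` survives across values, so values after the first
    // are never processed, mirroring tstream only being built once.
    static List<String> highlightBroken(String[] values) {
        List<String> out = new ArrayList<>();
        Object cached = null;
        for (String v : values) {
            if (cached == null) {
                cached = new Object();        // "build the token stream"
                out.add("<em>" + v + "</em>"); // only the first value is seen
            }
        }
        return out;
    }

    // Fixed: rebuild per value so every value is considered for highlighting.
    static List<String> highlightFixed(String[] values) {
        List<String> out = new ArrayList<>();
        for (String v : values) {
            out.add("<em>" + v + "</em>");
        }
        return out;
    }

    public static void main(String[] args) {
        String[] vals = {"first value", "second value"};
        System.out.println(highlightBroken(vals).size()); // 1: a missed highlight
        System.out.println(highlightFixed(vals).size());  // 2
    }
}
```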
[jira] Commented: (SOLR-139) Support updateable/modifiable documents
[ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787072#action_12787072 ] Mark Diggory commented on SOLR-139: --- I notice this is a very long-lived issue and that it is marked for 1.5. Are there outstanding issues or problems with its usage if I apply it to my 1.4 source? Support updateable/modifiable documents --- Key: SOLR-139 URL: https://issues.apache.org/jira/browse/SOLR-139 Project: Solr Issue Type: New Feature Components: update Reporter: Ryan McKinley Assignee: Ryan McKinley Fix For: 1.5 Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch It would be nice to be able to update some fields on a document without having to insert the entire document. Given the way lucene is structured, (for now) one can only modify stored fields. While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers. for background, see: http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293 -- This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1608) Make it easy to write distributed search test cases
[ https://issues.apache.org/jira/browse/SOLR-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-1608: Attachment: SOLR-1608.patch Removed an extra log statement I had added for debugging. I'll commit this shortly. Make it easy to write distributed search test cases --- Key: SOLR-1608 URL: https://issues.apache.org/jira/browse/SOLR-1608 Project: Solr Issue Type: Improvement Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 1.5 Attachments: SOLR-1608.patch, SOLR-1608.patch, SOLR-1608.patch Extract base class from TestDistributedSearch to make it easier for people to write test cases for distributed components. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1608) Make it easy to write distributed search test cases
[ https://issues.apache.org/jira/browse/SOLR-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-1608. - Resolution: Fixed Committed revision 888115. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-785) Distributed SpellCheckComponent
[ https://issues.apache.org/jira/browse/SOLR-785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-785: --- Attachment: SOLR-785.patch Updating for SOLR-1608 commit. Distributed SpellCheckComponent --- Key: SOLR-785 URL: https://issues.apache.org/jira/browse/SOLR-785 Project: Solr Issue Type: Improvement Components: spellchecker Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Fix For: 1.5 Attachments: SOLR-785.patch, SOLR-785.patch, SOLR-785.patch, SOLR-785.patch, spelling-shard.patch Enhance the SpellCheckComponent to run in a distributed (sharded) environment. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (SOLR-1629) Return to admin page link on registry.jsp goes to wrong page
[ https://issues.apache.org/jira/browse/SOLR-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar reassigned SOLR-1629: --- Assignee: Shalin Shekhar Mangar Return to admin page link on registry.jsp goes to wrong page -- Key: SOLR-1629 URL: https://issues.apache.org/jira/browse/SOLR-1629 Project: Solr Issue Type: Bug Components: web gui Reporter: Michael Ryan Assignee: Shalin Shekhar Mangar Priority: Minor The Return to admin page link on admin/registry.jsp links to the current page. http://svn.apache.org/viewvc/lucene/solr/trunk/src/webapp/web/admin/registry.xsl?revision=815587&view=markup Change <a href="">Return to Admin Page</a> to <a href=".">Return to Admin Page</a>. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (SOLR-1620) log message created null misleading
[ https://issues.apache.org/jira/browse/SOLR-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar reassigned SOLR-1620: --- Assignee: Shalin Shekhar Mangar log message "created null" misleading - Key: SOLR-1620 URL: https://issues.apache.org/jira/browse/SOLR-1620 Project: Solr Issue Type: Bug Affects Versions: 1.4 Reporter: KuroSaka TeruHiko Assignee: Shalin Shekhar Mangar Priority: Minor Original Estimate: 0.5h Remaining Estimate: 0.5h Solr logs a message like this: {noformat} INFO: created null: org.apache.solr.analysis.LowerCaseFilterFactory {noformat} This sounds as if the TokenFilter or Tokenizer was not created and a serious error occurred. But it merely means the component is not named; null is printed because the local variable name has the value null. This is misleading. If the text field type is not named, it should just print a blank rather than the word null. I would suggest that a line in src/java/org/apache/solr/util/plugin/AbstractPluginLoader.java be changed to: {noformat} log.info("created" + ((name != null) ? (" " + name) : "") + ": " + plugin.getClass().getName()); {noformat} from {noformat} log.info("created " + name + ": " + plugin.getClass().getName()); {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
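The suggested change is easy to check in isolation; the following standalone snippet mirrors the two formatting expressions above (a demonstration of the proposed behavior, not the committed code):

```java
public class LogMessageDemo {
    // Old format: a null name is concatenated verbatim into the message.
    static String oldMsg(String name, String className) {
        return "created " + name + ": " + className;
    }

    // Suggested format: drop the name (and its leading space) when absent.
    static String newMsg(String name, String className) {
        return "created" + ((name != null) ? (" " + name) : "") + ": " + className;
    }

    public static void main(String[] args) {
        String cls = "org.apache.solr.analysis.LowerCaseFilterFactory";
        System.out.println(oldMsg(null, cls));   // "created null: ..." (misleading)
        System.out.println(newMsg(null, cls));   // "created: ..."
        System.out.println(newMsg("text", cls)); // "created text: ..."
    }
}
```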
[jira] Resolved: (SOLR-1616) JSON Response for Facets not properly formatted
[ https://issues.apache.org/jira/browse/SOLR-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-1616. - Resolution: Won't Fix Closing this per Yonik's comment. JSON Response for Facets not properly formatted --- Key: SOLR-1616 URL: https://issues.apache.org/jira/browse/SOLR-1616 Project: Solr Issue Type: Bug Affects Versions: 1.4 Reporter: Lou Sacco When making a Solr search call with facets turned on, I notice that the facets JSON string is not properly formatted using wt=json. I would expect a bracketed array around each record rather than running them all together. This is very hard to read with ExtJS, as its JsonReader reads each element as its own record when the paired records are meant to be together. Here's an example of the output I get:

{code}
"facet_counts": {
  "facet_queries": {},
  "facet_fields": {
    "deviceName": ["x2", 6, "dd22", 12, "f12", 1],
    "devicePrgMgr": ["alberto", 80, "anando", 24, "artus", 101],
    "portfolioName": ["zztop", 32],
    "chipsetName": ["fat", 3, "thin", 2],
{code}

As an example, I would expect chipsetName to be formatted so that the JsonReader can read each as a record:

{code}
"chipsetName": [["fat", 3], ["thin", 2]],
{code}

See [here|http://json.org/fatfree.html] for details on JSON arrays. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
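Since the server-side rendering stays as-is per the resolution above, a client can regroup the flat name/count list into explicit pairs itself. A small sketch of that regrouping (helper names are illustrative):

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class FacetPairs {
    // Regroup Solr's flat [name1, count1, name2, count2, ...] facet list
    // into explicit (name, count) pairs for consumers like ExtJS.
    static List<Map.Entry<String, Integer>> toPairs(List<Object> flat) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (int i = 0; i + 1 < flat.size(); i += 2) {
            pairs.add(new AbstractMap.SimpleEntry<>(
                (String) flat.get(i), (Integer) flat.get(i + 1)));
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Object> flat = java.util.Arrays.asList("fat", 3, "thin", 2);
        System.out.println(toPairs(flat)); // [fat=3, thin=2]
    }
}
```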
Best practices in Solr Schema Design
Hello everybody, I am trying to port data from a PostgreSQL database instance into Solr, and I am getting flummoxed coming up with an efficient schema design. Could somebody give me general pointers on how I can go about achieving this? With Regards Sri -- View this message in context: http://old.nabble.com/Best-practices-in-Solr-Schema-Design-tp26684128p26684128.html Sent from the Solr - Dev mailing list archive at Nabble.com.
[jira] Resolved: (SOLR-343) Constraining date facets by facet.mincount
[ https://issues.apache.org/jira/browse/SOLR-343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-343. --- Resolution: Fixed Fix Version/s: 1.5 Assignee: Hoss Man Patch looks good, test looks good. thanks guys! Constraining date facets by facet.mincount -- Key: SOLR-343 URL: https://issues.apache.org/jira/browse/SOLR-343 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.2 Environment: Solr 1.2+ Reporter: Raiko Eckstein Assignee: Hoss Man Priority: Minor Fix For: 1.5 Attachments: DateFacetsMincountPatch.patch, SOLR-343.patch It would be helpful to allow the facet.mincount parameter to work with date facets, i.e. constraining the results so that it would be possible to filter out date ranges in the results where no documents occur from the server-side. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: Build Solr index using Hadoop MapReduce
Build Solr index using Hadoop MapReduce http://issues.apache.org/jira/browse/SOLR-1045 Ning Li-3 wrote: SOLR-1045 it is. More details will be available in that issue. Marc, you can check out Hadoop contrib/index, which builds a Lucene index using Hadoop MapReduce. However, it does not handle duplicate detection. Cheers, Ning On Mon, Mar 2, 2009 at 4:25 PM, Marc Sturlese marc.sturl...@gmail.com wrote: I am doing some research about creating a lucene/solr index using hadoop, but there's not so much info around; it would be great to see some code!!! (I am experiencing problems especially in duplication detection) Thanks Shalin Shekhar Mangar wrote: On Mon, Mar 2, 2009 at 11:24 PM, Ning Li ning.li...@gmail.com wrote: Hi, I wonder if there is interest in a contrib module that builds a Solr index using Hadoop MapReduce? Absolutely! It is different from the Solr support in Nutch. The Solr support in Nutch sends a document to a Solr server in a reduce task. Here, I aim at building/updating the Solr index within map/reduce tasks. Also, it achieves better parallelism when the number of map tasks is greater than the number of reduce tasks, which is usually the case. I worked out a very simple initial version. But I want to check if there is any interest before proceeding. If so, I'll open a Jira issue. +1 Please do. It'd be great to see this in Solr. -- Regards, Shalin Shekhar Mangar. -- View this message in context: http://www.nabble.com/Build-Solr-index-using-Hadoop-MapReduce-tp22293172p22296832.html Sent from the Solr - Dev mailing list archive at Nabble.com. -- View this message in context: http://old.nabble.com/Build-Solr-index-using-Hadoop-MapReduce-tp22293172p26684154.html Sent from the Solr - Dev mailing list archive at Nabble.com.
[jira] Commented: (SOLR-1586) Create Spatial Point FieldTypes
[ https://issues.apache.org/jira/browse/SOLR-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787113#action_12787113 ] Grant Ingersoll commented on SOLR-1586: --- FYI, see SOLR-1131 for an implementation of a Point Field Type. Create Spatial Point FieldTypes --- Key: SOLR-1586 URL: https://issues.apache.org/jira/browse/SOLR-1586 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Fix For: 1.5 Attachments: examplegeopointdoc.patch.txt, SOLR-1586.Mattmann.112209.geopointonly.patch.txt, SOLR-1586.Mattmann.112209.geopointonly.patch.txt, SOLR-1586.Mattmann.112409.geopointandgeohash.patch.txt, SOLR-1586.Mattmann.112409.geopointandgeohash.patch.txt, SOLR-1586.Mattmann.112509.geopointandgeohash.patch.txt Per SOLR-773, create field types that hide the details of creating tiers, geohash and lat/lon fields. Fields should take in lat/lon points in a single form, as in: <field name="foo">lat lon</field> -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1586) Create Spatial Point FieldTypes
[ https://issues.apache.org/jira/browse/SOLR-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787115#action_12787115 ] Grant Ingersoll commented on SOLR-1586: --- bq. we should have the ability to output those fields as georss per ryan's suggestion Ryan can correct me if I am putting words in his mouth, but I don't think he literally meant we needed to use those exact tags. I think he just meant the format of the actual values. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1586) Create Spatial Point FieldTypes
[ https://issues.apache.org/jira/browse/SOLR-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787121#action_12787121 ] Chris A. Mattmann commented on SOLR-1586: - Hey Grant: bq. Ryan can correct me if I am putting words in his mouth, but I don't think he literally meant we needed to use those exact tags. I think he just meant the format of the actual values. Ah no worries -- I think it would be a nice feature to actually output using those exact tags. That's the point of a standard, right? With the tags comes namespacing and all that good stuff, which I believe to be important. Also, since XmlWriter is even more flexible per SOLR-1592, I see no reason not to use those tags in the output. Cheers, Chris
[jira] Commented: (SOLR-1586) Create Spatial Point FieldTypes
[ https://issues.apache.org/jira/browse/SOLR-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787150#action_12787150 ] Chris A. Mattmann commented on SOLR-1586: - bq. FYI, see SOLR-1131 for an implementation of a Point Field Type. Sure, I'll take a look at it and try to bring this patch up to speed w.r.t. that. Independently though, the geohash implementation I put up should be good to go right now. Please take a look and let me know if you are +1 to commit. I included an example doc to test it out with. Cheers, Chris
[jira] Commented: (SOLR-433) MultiCore and SpellChecker replication
[ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12787155#action_12787155 ] Jason Rutherglen commented on SOLR-433: --- Are the existing patches for multiple cores or only for spellchecking? MultiCore and SpellChecker replication -- Key: SOLR-433 URL: https://issues.apache.org/jira/browse/SOLR-433 Project: Solr Issue Type: Improvement Components: replication (scripts), spellchecker Affects Versions: 1.3 Reporter: Otis Gospodnetic Fix For: 1.5 Attachments: RunExecutableListener.patch, SOLR-433-r698590.patch, SOLR-433.patch, SOLR-433.patch, SOLR-433.patch, SOLR-433.patch, solr-433.patch, SOLR-433_unified.patch, spellindexfix.patch With MultiCore functionality coming along, it looks like we'll need to be able to: A) snapshot each core's index directory, and B) replicate any and all cores' complete data directories, not just their index directories. Pulled from the spellchecker and multi-core index replication thread - http://markmail.org/message/pj2rjzegifd6zm7m Otis: I think that makes sense - distribute everything for a given core, not just its index. And the spellchecker could then also have its data dir (and only index/ underneath really) and be replicated in the same fashion. Right? Ryan: Yes, that was my thought. If an arbitrary directory could be distributed, then you could have /path/to/dist/index/... /path/to/dist/spelling-index/... /path/to/dist/foo and that would all get put into a snapshot. This would also let you put multiple cores within a single distribution: /path/to/dist/core0/index/... /path/to/dist/core0/spelling-index/... /path/to/dist/core0/foo /path/to/dist/core1/index/... /path/to/dist/core1/spelling-index/... /path/to/dist/core1/foo -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
RE: SOLR-1131 - Multiple Fields per Field Type
: <fieldType name="latlon" type="LatLonFieldType" pattern="location__*"/>
: <fieldType name="latlon_home" type="LatLonFieldType" pattern="location_home_*"/>
: <fieldType name="latlon_work" type="LatLonFieldType" pattern="location_home_*"/>
:
: <field name="location" type="latlon"/>
: <field name="location_home" type="latlon_home"/>
: <field name="location_work" type="latlon_work"/>

I'm not really understanding the value of an approach like that. For starters, what Lucene field names would ultimately be created in those examples? And if I also added...

<field name="other_location" type="latlon"/>
<dynamicField name="*_dynamic_location" type="latlon"/>

...then what field names would be created under the covers?

: I think it makes more sense to define the heterogeneity at the fieldType level because:
:
: (a) it's a bit more consistent with the existing solr schema examples, where the difference between many of the field types (e.g., ints and tints, which are both solr.TrieIntField's, date and tdate, both instances of solr.TrieDateField, with different configuration, etc.)
:
: (b) isolation of change: fieldType defs will change less often than field defs, where names and indexed/stored/etc. debugging are likely to occur more frequently

...this just feels wrong to me ... I can't really explain why. It seems like you are suggesting that every <field/> declaration would need a one-to-one correspondence with a unique <fieldType/> declaration in order to prevent field name collisions, which sounds sketchy enough ... but I'm also not fond of the idea that a person editing the schema can't just look at the <field/> and <dynamicField/> names to ensure that they understand what underlying fields are being created (so they don't inadvertently add a new one that collides) ... now they also have to look at the pattern attribute of every <fieldType/> that is a poly field. Letting <dynamicField/> drive everything just seems a *lot* simpler ... both as far as implementation, and as far as maintaining the schema.
: I don't think the above hybrid approach will lead to anything other than
: confusion, as you indicated above. Let's stick to the pattern defs at
: the fieldType level, and then let the fieldType handle the internal
: dynamicity with e.g., a dynamicField, and then notify the schema user

From the standpoint of reading a schema.xml file, the approach you're describing of a pattern attribute on <fieldType/> declarations actually seems more confusing than the strawman suggestion I made of a pattern attribute on <field/> ... even without understanding what concrete fields you are suggesting would be created with a configuration like that, it still increases the number of places you have to look to see what field names are getting created. -Hoss
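For reference, the existing `<dynamicField/>` mechanism that the thread keeps coming back to is declared directly in schema.xml. A minimal sketch (the glob pattern and `tdouble` type here are illustrative choices following the stock example schema, not part of the proposal being debated):

```xml
<!-- Any incoming field whose name matches the glob is created on the fly
     with the declared type; e.g. home_0_latLon and home_1_latLon would
     both become indexed tdouble fields without separate declarations. -->
<dynamicField name="*_latLon" type="tdouble" indexed="true" stored="true"/>
```

This is the "look at the declared names to see what underlying fields exist" property Hoss argues for: every Lucene field name is derivable from the `name` globs alone.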
Re: SOLR-1131 - Multiple Fields per Field Type
: I'm not sure if you worry about it. But I'd argue it isn't natural
: anyway. You would do the following instead, which is how any address
: book I've ever seen works:
: <field name="home" type="LatLonFT"/>
: <field name="work" type="LatLonFT"/>

...the home vs work distinction was arbitrary. The point is: what if I want to support an arbitrary number of distinct values in a PolyField? ... With your approach any attempt to search for people near X would require me to search for "work near X" or "home near X" ... which is analogous to one of the main purposes of multivalued fields: so I don't have to uniquely name every Field instance. I might have a thousand unique (but unnamed) locations that I want to associate with a document, and I want to search for documents with a location near X ... likewise I might have thousands of unique polygons associated with a document and I want to search for documents where one or more polygons overlap with an input polygon (ie: island nations overlapping with the flight path of an airplane). The question is: how can/would PolyFields deal with input like this? ... We've discussed cardinality in the number of fields produced by a single input value, but we haven't really discussed cardinality in the number of input values.

: So, maybe the FT can explicitly prohibit multivalued? But, I suppose
: you could do the position thing, too. This could be achieved through a
: new SpanQuery pretty easily: SpanPositionQuery that takes in a term and
: a specific position. Trivial to write, I think, just not sure if it is
: generally useful. Although, I must say I've been noodling around with

The problem is how do you let the PolyField specify the position when indexing? The last API I saw fleshed out in this discussion didn't give the PolyField any information about how many input values were in any given doc, it just allowed PolyFields to be String=>Field[] black boxes (as opposed to the String=>Field[] black box FieldTypes must currently be). We can't assume even basic lastPosition+1 type logic for these polyfields, because different input values might produce Field arrays containing different quantities of fields, with different names. If a CartesianPolyFieldType can get away with only using the grid_level1 and grid_level2 fields for one input value, but other input values require using grid_level1, grid_level2, and grid_level3, then simple position increments aren't enough if a document has multiple values (some of which need 2 different Field names, and others that need 3) -Hoss
Re: SOLR-1131 - Multiple Fields per Field Type
On Dec 7, 2009, at 5:59 PM, Chris Hostetter wrote:

: <fieldType name="latlon" type="LatLonFieldType" pattern="location__*"/>
: <fieldType name="latlon_home" type="LatLonFieldType" pattern="location_home_*"/>
: <fieldType name="latlon_work" type="LatLonFieldType" pattern="location_home_*"/>
:
: <field name="location" type="latlon"/>
: <field name="location_home" type="latlon_home"/>
: <field name="location_work" type="latlon_work"/>

I'm not really understanding the value of an approach like that. For starters, what Lucene field names would ultimately be created in those examples? And if I also added...

Have a look at the patch I put up today. I think it is going to work quite well, but that could be jet-lag induced delirium at this point. For a field type:

<fieldType name="point" type="solr.PointType" dimension="2" subFieldType="double"/>

and a field declared as:

<field name="home" type="point" indexed="true" stored="true"/>

And a new document of:

<doc>
  <field name="point">39.0 -79.434</field>
</doc>

There are three fields created:

home -- Contains the stored value
home___0 -- Contains 39.0 indexed as a double (as in the double FieldType, not just a double-precision value)
home___1 -- Contains -79.434 as a double

<field name="other_location" type="latlon"/>
<dynamicField name="*_dynamic_location" type="latlon"/>

...then what field names would be created under the covers?

: I think it makes more sense to define the heterogeneity at the fieldType level because:
:
: (a) it's a bit more consistent with the existing solr schema examples, where the difference between many of the field types (e.g., ints and tints, which are both solr.TrieIntField's, date and tdate, both instances of solr.TrieDateField, with different configuration, etc.)
:
: (b) isolation of change: fieldType defs will change less often than field defs, where names and indexed/stored/etc. debugging are likely to occur more frequently

...this just feels wrong to me ... I can't really explain why. It seems like you are suggesting that every <field/> declaration would need a one-to-one correspondence with a unique <fieldType/> declaration in order to prevent field name collisions, which sounds sketchy enough ... but I'm also not fond of the idea that a person editing the schema can't just look at the <field/> and <dynamicField/> names to ensure that they understand what underlying fields are being created (so they don't inadvertently add a new one that collides) ... now they also have to look at the pattern attribute of every <fieldType/> that is a poly field. Letting <dynamicField/> drive everything just seems a *lot* simpler ... both as far as implementation, and as far as maintaining the schema.

I don't agree. It requires more configuration and more knowledge by the end user, and doesn't hide the details.
Re: SOLR-1131 - Multiple Fields per Field Type
On Dec 7, 2009, at 6:13 PM, Chris Hostetter wrote:

: I'm not sure if you worry about it. But I'd argue it isn't natural
: anyway. You would do the following instead, which is how any address
: book I've ever seen works:
: <field name="home" type="LatLonFT"/>
: <field name="work" type="LatLonFT"/>

...the home vs work distinction was arbitrary. The point is: what if I want to support an arbitrary number of distinct values in a PolyField?

This is the beauty of Yonik's addition of getFieldQuery() to the FieldType. The FieldType will be aware of the arbitrariness. Furthermore, it can reflect on the index itself via IndexReader.getFieldNames() to determine the number of Fields that actually exist if it has to. However, my guess is that in practice, in most situations, the FieldType author/user will have the info it needs. Still, I think we can also evolve if we need to.

... with your approach any attempt to search for people near X would require me to search for "work near X" or "home near X" ... which is analogous to one of the main purposes of multivalued fields: so I don't have to uniquely name every Field instance.

Sure, but would you really ever model multiple locations like that in the same field? I don't think in practice that you would, so I think it is a bit of a red herring. Perhaps there is a different use case that better demonstrates it?

I might have a thousand unique (but unnamed) locations that I want to associate with a document, and I want to search for documents with a location near X ... likewise I might have thousands of unique polygons associated with a document and I want to search for documents where one or more polygons overlap with an input polygon (ie: island nations overlapping with the flight path of an airplane).

I don't think this implementation precludes that. The FunctionQueries only operating on a single-valued field does, however. Setting that aside, we could write a Query that does what you want, I think.

The question is: how can/would PolyFields deal with input like this? ... We've discussed cardinality in the number of fields produced by a single input value, but we haven't really discussed cardinality in the number of input values.

I'm not sure that it does, but I don't know that it needs to just yet. This might be where an R-Tree implementation comes in handy, but I'll leave it to the geo-experts to discuss more. I also am not sure how the PolyField case is any different than the dynamic field case. Either way, something needs to know the names of the fields that were created.

: So, maybe the FT can explicitly prohibit multivalued? But, I suppose
: you could do the position thing, too. This could be achieved through a
: new SpanQuery pretty easily: SpanPositionQuery that takes in a term and
: a specific position. Trivial to write, I think, just not sure if it is
: generally useful. Although, I must say I've been noodling around with

The problem is how do you let the PolyField specify the position when indexing? The last API I saw fleshed out in this discussion didn't give the PolyField any information about how many input values were in any given doc, it just allowed PolyFields to be String=>Field[] black boxes (as opposed to the String=>Field[] black box FieldTypes must currently be). We can't assume even basic lastPosition+1 type logic for these polyfields, because different input values might produce Field arrays containing different quantities of fields, with different names. If a CartesianPolyFieldType can get away with only using the grid_level1 and grid_level2 fields for one input value, but other input values require using grid_level1, grid_level2, and grid_level3, then simple position increments aren't enough if a document has multiple values (some of which need 2 different Field names, and others that need 3)

That's not how the Cartesian Field stuff works, but I think I see what you are getting at, and I would say I'm going to explicitly punt on that right now. Ultimately, I think when such a case comes up, the FieldType needs to be configured to be able to determine this information. -Grant
Re: Solr Cell revamped as an UpdateProcessor?
On Dec 7, 2009, at 3:51 PM, Chris Hostetter wrote: As someone with very little knowledge of Solr Cell and/or Tika, I find myself wondering if ExtractingRequestHandler would make more sense as an ExtractingUpdateProcessor -- where it could be configured to take either binary fields (or string fields containing URLs) out of the Documents, parse them with Tika, and add the various XPath-matching hunks of text back into the document as new fields. Then ExtractingRequestHandler just becomes a handler that slurps up its ContentStreams and adds them as binary data fields, and adds the other literal params as fields. Wouldn't that make things like SOLR-1358, and using Tika with URLs/filepaths in XML and CSV based updates, fairly trivial? It probably could, but I am not sure how it works in a processor chain. However, I'm not sure I understand how they work all that much either. I also plan on adding, BTW, a SolrJ client for Tika that does the extraction on the client. In many cases, the ExtrReqHandler is really only designed for lighter weight extraction cases, as one would simply not want to send that much rich content over the wire.
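For context, update processors are wired up as a chain in solrconfig.xml. A sketch of what Hoss's idea might look like: the `ExtractingUpdateProcessorFactory` shown here is hypothetical (it does not exist; it is the processor being proposed), while the chain element and the Log/Run factories are the stock ones:

```xml
<updateRequestProcessorChain name="extract">
  <!-- Hypothetical: would pull binary fields (or URL-valued string fields)
       out of each incoming document, run them through Tika, and add the
       extracted text back into the document as new fields. -->
  <processor class="solr.ExtractingUpdateProcessorFactory"/>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The appeal of this arrangement is that any update path (XML, CSV, SolrJ) routed through the chain would get extraction for free, rather than only requests sent to the ExtractingRequestHandler endpoint.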
Inconsistent Search Results for different flavors of same search term
Hello, I was performing a search on different versions of the term "San Jose" on my Solr instance, the differing versions being: "san jose" (all lowercase), "San jose" (one uppercase), "San Jose" (capitalized first letters), "SAN JOSE" (all caps). Each of these phrases returns a different number of hits in the response. For example: "san jose" returns <result name="response" numFound="0" start="0">, "San jose" returns <result name="response" numFound="4" start="0">, "San Jose" returns <result name="response" numFound="16" start="0">, and "SAN JOSE" returns <result name="response" numFound="853" start="0">. How do I make my search not case sensitive? -- View this message in context: http://old.nabble.com/Inconsistent-Search-Results-for-different-flavors-of-same-search-term-tp26686294p26686294.html Sent from the Solr - Dev mailing list archive at Nabble.com.
[jira] Commented: (SOLR-1586) Create Spatial Point FieldTypes
[ https://issues.apache.org/jira/browse/SOLR-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787198#action_12787198 ] Grant Ingersoll commented on SOLR-1586: --- Can you put up a patch containing just the geohash stuff?
[jira] Updated: (SOLR-1277) Implement a Solr specific naming service (using Zookeeper)
[ https://issues.apache.org/jira/browse/SOLR-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-1277: -- Attachment: SOLR-1277.patch Inching forward as we try and nail down the layout. * moves the configs to /solr/configs/collection1 in the tests * which config to load is discovered from /solr/collections/collection1/config=collection1 * system property for the name of the collection to work with * consolidated zookeeper host and solr path sys properties into one, i.e. localhost:2181/solr I still expect everything in this patch to be very fluid and change as we move forward - but it's something to give us a base to play with. We should probably start a ZooKeeper branch since this issue is likely to get quite large and hopefully have many contributors - that model has worked quite well with the flexible indexing issue in Lucene, and I have gotten quite handy at quick merging from my practice there ;) Implement a Solr specific naming service (using Zookeeper) -- Key: SOLR-1277 URL: https://issues.apache.org/jira/browse/SOLR-1277 Project: Solr Issue Type: New Feature Affects Versions: 1.4 Reporter: Jason Rutherglen Assignee: Grant Ingersoll Priority: Minor Fix For: 1.5 Attachments: log4j-1.2.15.jar, SOLR-1277.patch, SOLR-1277.patch, SOLR-1277.patch, SOLR-1277.patch, zookeeper-3.2.1.jar Original Estimate: 672h Remaining Estimate: 672h The goal is to give Solr server clusters self-healing attributes where if a server fails, indexing and searching don't stop and all of the partitions remain searchable. For configuration, the ability to centrally deploy a new configuration without servers going offline. We can start with basic failover and start from there? Features: * Automatic failover (i.e. when a server fails, clients stop trying to index to or search it) * Centralized configuration management (i.e. 
new solrconfig.xml or schema.xml propagates to a live Solr cluster) * Optionally allow shards of a partition to be moved to another server (i.e. if a server gets hot, move the hot segments out to cooler servers). Ideally we'd have a way to detect hot segments and move them seamlessly. With NRT this becomes somewhat more difficult but not impossible? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1606) Integrate Near Realtime
[ https://issues.apache.org/jira/browse/SOLR-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787206#action_12787206 ] Jason Rutherglen commented on SOLR-1606: Koji, Looks like a change to trunk is causing the error; also, when I step through it passes, when I run without stepping it fails... Integrate Near Realtime Key: SOLR-1606 URL: https://issues.apache.org/jira/browse/SOLR-1606 Project: Solr Issue Type: Improvement Components: update Affects Versions: 1.4 Reporter: Jason Rutherglen Priority: Minor Fix For: 1.5 Attachments: SOLR-1606.patch We'll integrate IndexWriter.getReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1606) Integrate Near Realtime
[ https://issues.apache.org/jira/browse/SOLR-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787208#action_12787208 ] Mark Miller commented on SOLR-1606: --- Don't we need a new command, like update_realtime (bad name, I know) or something? Else you will be doing a full commit every time you get the new reader?
[jira] Issue Comment Edited: (SOLR-1606) Integrate Near Realtime
[ https://issues.apache.org/jira/browse/SOLR-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787208#action_12787208 ] Mark Miller edited comment on SOLR-1606 at 12/8/09 12:13 AM: - Don't we need a new command, like update_realtime (bad name, I know) or something? Else you will be doing a full commit every time you get the new reader? *edit* I see - you skip the commit - I think we should make a new command though, shouldn't we? Still allow a standard commit, but a new command that kicks in the realtime refresh? was (Author: markrmil...@gmail.com): Don't we need a new command, like update_realtime (bad name, I know) or something? Else you will be doing a full commit every time you get the new reader?
[jira] Created: (SOLR-1631) NPE's reported from QueryComponent.mergeIds
NPE's reported from QueryComponent.mergeIds --- Key: SOLR-1631 URL: https://issues.apache.org/jira/browse/SOLR-1631 Project: Solr Issue Type: Bug Components: search Reporter: Hoss Man Multiple reports of QueryComponent.mergeIds occasionally throwing NPE... http://markmail.org/message/aqzaaphbuow4sa5o http://old.nabble.com/NullPointerException-thrown-during-updates-to-index-to26613309.html#a26613309 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1606) Integrate Near Realtime
[ https://issues.apache.org/jira/browse/SOLR-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787221#action_12787221 ] Jason Rutherglen commented on SOLR-1606: bq. Don't we need a new command, like update_realtime We could, however it'd work the same as commit? Meaning afterwards, all pending changes (including deletes) are available? The commit command is fairly overloaded as is. Are you thinking in terms of replication?
Re: Inconsistent Search Results for different flavors of same search term
First, this is the developer's list; I think this question would be better suited to the user's list. You get searches to be case insensitive by indexing and searching with an analyzer that, say, lowercases. If you post on the user's list, please include the analyzer definitions for the fields in question *and* your query. From your email, I can't tell if, for instance, you're even searching against the same field for both terms. I.e., if you're searching something like title:san jose, then san would go against the title field while jose would go against the default search field... If you want to be really thorough, also post the results of your query with debugQuery=on. The schema browser in your Solr admin page might help, and Luke can be used to examine what's actually in your index. Best Erick On Mon, Dec 7, 2009 at 6:36 PM, insaneyogi3008 insaney...@gmail.com wrote: How do I make my search not case sensitive?
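A minimal case-insensitive field type along the lines Erick describes, assuming the stock tokenizer and filter factories (the type name `text_ci` is just a placeholder; adapt it to your schema):

```xml
<fieldType name="text_ci" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Lowercases tokens at both index and query time, so
         "San Jose", "san jose", and "SAN JOSE" all match the
         same indexed terms. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because the same analyzer is applied at index and query time, the four capitalizations in the original question would all produce the terms `san` and `jose` and return the same result count.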
Re: Inconsistent Search Results for different flavors of same search term
Look at http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters. But before you make changes, get familiar with the analysis section of the admin interface: http://localhost:8983/solr/admin/analysis.jsp?highlight=on Of course, adjust the path for your server. This will let you see what the analyzers are doing at index and query time, and is VERY helpful in understanding the analysis process. Tom On Mon, Dec 7, 2009 at 3:36 PM, insaneyogi3008 insaney...@gmail.com wrote: How do I make my search not case sensitive?
Re: Inconsistent Search Results for different flavors of same search term
I resolved this kind of situation by (a) converting values to lower case in DIH while indexing, and (b) converting free-text keywords to lowercase in the client code before sending them to Solr. pradeep. --- On Mon, 12/7/09, insaneyogi3008 insaney...@gmail.com wrote: From: insaneyogi3008 insaney...@gmail.com Subject: Inconsistent Search Results for different flavors of same search term To: solr-dev@lucene.apache.org Date: Monday, December 7, 2009, 3:36 PM How do I make my search not case sensitive?
Re: SOLR-1131 - Multiple Fields per Field Type
Hi Hoss, : fieldType name=latlon type=LatLonFieldType pattern=location__* / : fieldType name=latlon_home type=LatLonFieldType pattern=location_home_*/ : fieldType name=latlon_work type=LatLonFieldType pattern=location_home_*/ : : field name=location type=latlon/ : field name=location_home type=latlon_home/ : field name=location_work type=latlon_work/ I'm not really understanding the value of an approach like that. for starters, what Lucene field names would ultimately be created in those examples? The first field would be named location__location. The second field would be named location_home_location_home. The third field would be named location_work_location_work. And if i also added... field name=other_location type=latlon/ dynamicField name=*_dynamic_location type=latlon/ ...then what field names would be created under the covers? In general, it would be FieldType#getPattern().stripOffEndRegexStarStuff() + Field#getName(). : I think it makes more sense to define the heterogeneity at the fieldType level because: : : (a) it's a bit more consistent with the existing solr schema examples, : where the difference between many of the field types (e.g., ints and : tints, which are both solr.TrieIntField's, date and tdate, both : instances of solr.TrieDateField, with different configuration, etc.) : : (b) isolation of change: fieldType defs will change less often than : field defs, where names and indexed/stored/etc. debugging are likely : to occur more frequently ...this just feels wrong to me ... i can't really explain why. It seems like you are suggesting thatt every field/ declaration would need a one to one corrispondence with a unique fieldType/ declaration in order to prevent field name collisions, which sounds sketchy enough ... 
but i'm also not fond of the idea that a person editing the schema can't just look at the <field/> and <dynamicField/> names to ensure that they understand what underlying fields are being created (so they don't inadvertently add a new one that collides) ... now they also have to look at the pattern attribute of every <fieldType/> that is a poly field.
Well, if this feels wrong to you then I think the schema.xml file that ships with Solr should feel wrong as well, because it uses the exact same pattern for defining field type variations. That is, differences between FieldType representations for ints and tints are not stored as variations on the SchemaField definition itself; they are stored as variations on the FieldTypes (e.g., a different precisionStep in the case of int [0] versus that of tint [8]). Based on what you are proposing, why isn't precisionStep an attribute on field, rather than fieldType, in those examples?
letting <dynamicField/> drive everything just seems a *lot* simpler ... both as far as implementation, and as far as maintaining the schema.
Possibly. It's also a lot less traceable. It's implicit versus explicit, which I'm not sure leads to simplicity in the end.
: I don't think the above hybrid approach will lead to anything other than
: confusion, as you indicated above. Let's stick to the pattern defs at
: the fieldType level, and then let the fieldType handle the internal
: dynamicity with e.g., a dynamicField, and then notify the schema user
From the standpoint of reading a schema.xml file, the approach you're describing of a pattern attribute on <fieldType/> declarations actually seems more confusing than the strawman suggestion i made of a pattern attribute on <field/> ... even without understanding what concrete fields you are suggesting would be created with a configuration like that, it still increases the number of places you have to look to see what field names are getting created.
How so? In actuality, it reduces it.
Instead of having pattern definitions on fields (of which there is a greater chance of having more), you have them on field types?
Cheers,
Chris
++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory, Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++
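The naming rule debated above ("FieldType#getPattern().stripOffEndRegexStarStuff() + Field#getName()") can be made concrete with a small sketch. Note that stripOffEndRegexStarStuff is not a real Solr method — it was shorthand in the thread — so everything below is a hypothetical illustration of the proposed derivation, not existing API:

```java
public class PolyFieldNames {
    // Hypothetical: drop the trailing "*" glob from the fieldType's pattern,
    // then append the field's declared name, as described in the thread.
    static String underlyingName(String pattern, String fieldName) {
        String prefix = pattern.endsWith("*")
                ? pattern.substring(0, pattern.length() - 1)
                : pattern;
        return prefix + fieldName;
    }

    public static void main(String[] args) {
        // reproduces the names Chris gives for the three example fields
        System.out.println(underlyingName("location__*", "location"));
        System.out.println(underlyingName("location_home_*", "location_home"));
        System.out.println(underlyingName("location_work_*", "location_work"));
    }
}
```

This also shows the collision hazard Hoss raises: two fields of the same type (e.g. location and other_location, both latlon) share one pattern prefix, so a reader must check every poly fieldType's pattern to predict the underlying Lucene field names.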
[jira] Updated: (SOLR-1586) Create Spatial Point FieldTypes
[ https://issues.apache.org/jira/browse/SOLR-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated SOLR-1586: Attachment: SOLR-1586.Mattmann.120709.geohashonly.patch.txt updated patch containing only the geohash goodies. Create Spatial Point FieldTypes --- Key: SOLR-1586 URL: https://issues.apache.org/jira/browse/SOLR-1586 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Fix For: 1.5 Attachments: examplegeopointdoc.patch.txt, SOLR-1586.Mattmann.112209.geopointonly.patch.txt, SOLR-1586.Mattmann.112209.geopointonly.patch.txt, SOLR-1586.Mattmann.112409.geopointandgeohash.patch.txt, SOLR-1586.Mattmann.112409.geopointandgeohash.patch.txt, SOLR-1586.Mattmann.112509.geopointandgeohash.patch.txt, SOLR-1586.Mattmann.120709.geohashonly.patch.txt Per SOLR-773, create field types that hide the details of creating tiers, geohash and lat/lon fields. Fields should take in lat/lon points in a single form, as in: <field name="foo">lat lon</field> -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-1632) Distributed IDF
Distributed IDF --- Key: SOLR-1632 URL: https://issues.apache.org/jira/browse/SOLR-1632 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.5 Reporter: Andrzej Bialecki Distributed IDF is a valuable enhancement for distributed search across non-uniform shards. This issue tracks the proposed implementation of an API to support this functionality in Solr. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1632) Distributed IDF
[ https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated SOLR-1632: Attachment: distrib.patch Initial implementation. This supports the current global IDF (i.e. none ;) ), and an exact version of global IDF that requires one additional request per query to obtain per-shard stats. The design should already be flexible enough to implement LRU caching of docFreqs, and ultimately to implement other methods for global IDF calculation (e.g. based on estimation or re-ranking). Distributed IDF --- Key: SOLR-1632 URL: https://issues.apache.org/jira/browse/SOLR-1632 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.5 Reporter: Andrzej Bialecki Attachments: distrib.patch Distributed IDF is a valuable enhancement for distributed search across non-uniform shards. This issue tracks the proposed implementation of an API to support this functionality in Solr. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
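The "exact" variant described above amounts to summing per-shard term statistics before scoring. A minimal sketch of that merge, assuming Lucene's classic idf formula (1 + ln(numDocs / (docFreq + 1))); the actual API in distrib.patch may look quite different:

```java
public class GlobalIdf {
    // Merge per-shard stats collected in the extra request, then apply
    // the classic Lucene idf formula: 1 + ln(numDocs / (docFreq + 1)).
    public static double exactGlobalIdf(long[] shardDocFreqs, long[] shardNumDocs) {
        long docFreq = 0, numDocs = 0;
        for (long df : shardDocFreqs) docFreq += df; // docs containing the term, all shards
        for (long n : shardNumDocs)  numDocs += n;   // total docs, all shards
        return 1.0 + Math.log((double) numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        // two non-uniform shards: the term is rare on one, common on the other
        System.out.println(exactGlobalIdf(new long[]{4, 16}, new long[]{100, 900}));
    }
}
```

Without this merge, each shard scores with its own local docFreq, which is exactly the non-uniform-shard skew the issue is about.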
[jira] Commented: (SOLR-1621) Allow current single core deployments to be specified by solr.xml
[ https://issues.apache.org/jira/browse/SOLR-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12787303#action_12787303 ] Noble Paul commented on SOLR-1621: -- bq. if they cause too much grief, we always have the option to remove. Do we really have a use case for ALIAS? If there is no compelling enough use case we should consider removing it. There is a lot of code which is there just because of this alias feature Allow current single core deployments to be specified by solr.xml - Key: SOLR-1621 URL: https://issues.apache.org/jira/browse/SOLR-1621 Project: Solr Issue Type: New Feature Affects Versions: 1.5 Reporter: Noble Paul Fix For: 1.5 Attachments: SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch supporting two different modes of deployments is turning out to be hard. This leads to duplication of code. Moreover, there is a lot of confusion about where to put common configuration. See the mail thread http://markmail.org/message/3m3rqvp2ckausjnf -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (SOLR-1621) Allow current single core deployments to be specified by solr.xml
[ https://issues.apache.org/jira/browse/SOLR-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12787303#action_12787303 ] Noble Paul edited comment on SOLR-1621 at 12/8/09 4:29 AM: --- bq. if they cause too much grief, we always have the option to remove. Do we really have a use case for ALIAS? If there is no compelling enough use case we should consider removing it. There is a lot of code which is there just because of this alias feature was (Author: noble.paul): bq. if they cause too much grief, we always have the option to remove. Do we really have a use case for ALIAS? If there is no compelling enough use case we should consider removing it. There is a lot of code which is there just because of this alias feature Allow current single core deployments to be specified by solr.xml - Key: SOLR-1621 URL: https://issues.apache.org/jira/browse/SOLR-1621 Project: Solr Issue Type: New Feature Affects Versions: 1.5 Reporter: Noble Paul Fix For: 1.5 Attachments: SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch supporting two different modes of deployments is turning out to be hard. This leads to duplication of code. Moreover, there is a lot of confusion about where to put common configuration. See the mail thread http://markmail.org/message/3m3rqvp2ckausjnf -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1621) Allow current single core deployments to be specified by solr.xml
[ https://issues.apache.org/jira/browse/SOLR-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-1621: - Attachment: SOLR-1621.patch It works now even after the alias command. I still think we should remove the 'alias' command. It is a fancy feature which adds too much complexity to the ref counting of cores Allow current single core deployments to be specified by solr.xml - Key: SOLR-1621 URL: https://issues.apache.org/jira/browse/SOLR-1621 Project: Solr Issue Type: New Feature Affects Versions: 1.5 Reporter: Noble Paul Fix For: 1.5 Attachments: SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch supporting two different modes of deployments is turning out to be hard. This leads to duplication of code. Moreover, there is a lot of confusion about where to put common configuration. See the mail thread http://markmail.org/message/3m3rqvp2ckausjnf -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: Best practices in Solr Schema Design
Sri, please post your questions to solr-user list. This list is for Solr's internal development discussions only. On Tue, Dec 8, 2009 at 2:35 AM, insaneyogi3008 insaney...@gmail.com wrote: Hello everybody, I am trying to port data from a PostGRESql database instance onto Solr , I am getting flummoxed in coming up with an efficient schema design . Could somebody give me general pointers in how I can go about achieving this ? With Regards Sri -- View this message in context: http://old.nabble.com/Best-practices-in-Solr-Schema-Design-tp26684128p26684128.html Sent from the Solr - Dev mailing list archive at Nabble.com. -- Regards, Shalin Shekhar Mangar.
[jira] Updated: (SOLR-1583) Create DataSources that return InputStream
[ https://issues.apache.org/jira/browse/SOLR-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-1583: - Fix Version/s: 1.5 Create DataSources that return InputStream -- Key: SOLR-1583 URL: https://issues.apache.org/jira/browse/SOLR-1583 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Noble Paul Assignee: Noble Paul Priority: Minor Fix For: 1.5 Attachments: SOLR-1583.patch Tika integration means the source has to be binary, that is, the DataSource must be of type DataSource<InputStream>. All the DataSource<Reader> implementations should have a binary counterpart:
* BinURLDataSource<InputStream>
* BinContentStreamDataSource<InputStream>
* BinFileDataSource<InputStream>
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1583) Create DataSources that return InputStream
[ https://issues.apache.org/jira/browse/SOLR-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul resolved SOLR-1583. -- Resolution: Fixed committed r888277 Create DataSources that return InputStream -- Key: SOLR-1583 URL: https://issues.apache.org/jira/browse/SOLR-1583 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Noble Paul Assignee: Noble Paul Priority: Minor Attachments: SOLR-1583.patch Tika integration means the source has to be binary, that is, the DataSource must be of type DataSource<InputStream>. All the DataSource<Reader> implementations should have a binary counterpart:
* BinURLDataSource<InputStream>
* BinContentStreamDataSource<InputStream>
* BinFileDataSource<InputStream>
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1358) Integration of Tika and DataImportHandler
[ https://issues.apache.org/jira/browse/SOLR-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-1358: - Summary: Integration of Tika and DataImportHandler (was: Integration of Solr Cell and DataImportHandler) Integration of Tika and DataImportHandler - Key: SOLR-1358 URL: https://issues.apache.org/jira/browse/SOLR-1358 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Sascha Szott At the moment, it's impossible to configure Solr such that it builds up documents by using data that comes from both pdf documents and database table columns. Currently, to accomplish this task, it's up to the user to add some preprocessing that converts pdf files into plain text files. Therefore, I would like to see an integration of Solr Cell into DIH that makes this preprocessing obsolete. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (SOLR-1358) Integration of Tika and DataImportHandler
[ https://issues.apache.org/jira/browse/SOLR-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul reassigned SOLR-1358: Assignee: Noble Paul Integration of Tika and DataImportHandler - Key: SOLR-1358 URL: https://issues.apache.org/jira/browse/SOLR-1358 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Sascha Szott Assignee: Noble Paul At the moment, it's impossible to configure Solr such that it builds up documents by using data that comes from both pdf documents and database table columns. Currently, to accomplish this task, it's up to the user to add some preprocessing that converts pdf files into plain text files. Therefore, I would like to see an integration of Solr Cell into DIH that makes this preprocessing obsolete. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (SOLR-1358) Integration of Tika and DataImportHandler
[ https://issues.apache.org/jira/browse/SOLR-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12750855#action_12750855 ] Noble Paul edited comment on SOLR-1358 at 12/8/09 6:36 AM: --- Let us provide a new TikaEntityProcessor
{code:xml}
<dataConfig>
  <!-- use any of type DataSource<InputStream> -->
  <dataSource type="BinURLDataSource"/>
  <document>
    <entity processor="TikaEntityProcessor" tikaConfig="tikaconfig.xml" url="${some.var.goes.here}">
    </entity>
  </document>
</dataConfig>
{code}
This most likely would need a BinUrlDataSource/BinContentStreamDataSource because Tika uses binary inputs. My suggestion is that TikaEntityProcessor live in the extraction contrib so that managing dependencies is easier. But we will have to make extraction have a compile-time dependency on DIH. Grant, what do you think?
was (Author: noble.paul): Let us provide a new TikaEntityProcessor
{code:xml}
<entity processor="TikaEntityProcessor" tikaConfig="tikaconfig.xml" url="${some.var.goes.here}">
</entity>
{code}
This most likely would need a BinUrlDataSource/BinContentStreamDataSource because Tika uses binary inputs. My suggestion is that TikaEntityProcessor live in the extraction contrib so that managing dependencies is easier. But we will have to make extraction have a compile-time dependency on DIH. Grant, what do you think?
Integration of Tika and DataImportHandler - Key: SOLR-1358 URL: https://issues.apache.org/jira/browse/SOLR-1358 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Sascha Szott Assignee: Noble Paul At the moment, it's impossible to configure Solr such that it builds up documents by using data that comes from both pdf documents and database table columns. Currently, to accomplish this task, it's up to the user to add some preprocessing that converts pdf files into plain text files. Therefore, I would like to see an integration of Solr Cell into DIH that makes this preprocessing obsolete.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1629) Return to admin page link on registry.jsp goes to wrong page
[ https://issues.apache.org/jira/browse/SOLR-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-1629. - Resolution: Fixed Fix Version/s: 1.5 Committed revision 888281. Thanks Michael! Return to admin page link on registry.jsp goes to wrong page -- Key: SOLR-1629 URL: https://issues.apache.org/jira/browse/SOLR-1629 Project: Solr Issue Type: Bug Components: web gui Reporter: Michael Ryan Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 1.5 The Return to admin page link on admin/registry.jsp links to the current page. http://svn.apache.org/viewvc/lucene/solr/trunk/src/webapp/web/admin/registry.xsl?revision=815587&view=markup Change <a href="">Return to Admin Page</a> to <a href=".">Return to Admin Page</a>. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1620) log message created null misleading
[ https://issues.apache.org/jira/browse/SOLR-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-1620. - Resolution: Fixed Fix Version/s: 1.5 Committed revision 888282. Thanks KuroSaka! log message created null misleading - Key: SOLR-1620 URL: https://issues.apache.org/jira/browse/SOLR-1620 Project: Solr Issue Type: Bug Affects Versions: 1.4 Reporter: KuroSaka TeruHiko Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 1.5 Original Estimate: 0.5h Remaining Estimate: 0.5h Solr logs a message like this:
{noformat}
INFO: created null: org.apache.solr.analysis.LowerCaseFilterFactory
{noformat}
This sounds like the TokenFilter or Tokenizer was not created and a serious error occurred. But it merely means the component is not named; null is printed because the local variable name has the value null. This is misleading. If the text field type component is not named, it should just print blank, rather than the word null. I would suggest that a line in src/java/org/apache/solr/util/plugin/AbstractPluginLoader.java be changed to:
{noformat}
log.info("created" + ((name != null) ? (" " + name) : "") + ": " + plugin.getClass().getName());
{noformat}
from
{noformat}
log.info("created " + name + ": " + plugin.getClass().getName());
{noformat}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
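The effect of the suggested one-line change can be checked in isolation. A small sketch with the two formats pulled out as helper methods (the method names here are illustrative, not from AbstractPluginLoader):

```java
public class PluginLogMessage {
    // Current format: prints the literal word "null" for unnamed components.
    static String before(String name, String className) {
        return "created " + name + ": " + className;
    }

    // Suggested fix: include the name only when one was given.
    static String after(String name, String className) {
        return "created" + ((name != null) ? (" " + name) : "") + ": " + className;
    }

    public static void main(String[] args) {
        String cls = "org.apache.solr.analysis.LowerCaseFilterFactory";
        System.out.println(before(null, cls)); // the misleading "created null: ..." message
        System.out.println(after(null, cls));  // clean "created: ..." message
        System.out.println(after("text", cls)); // named components still show their name
    }
}
```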