[jira] Updated: (SOLR-494) LukeRequestHandler/Ajax-based schema explorer

2008-03-20 Thread Greg Ludington (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Ludington updated SOLR-494:


Attachment: multicoreupdate.patch

In a multicore setting, these changes cause the LukeRequestHandler to throw an 
Exception on the core-identifying field, because there was no null check for 
sfield in the relevant new line of LukeRequestHandler.  This patch adds that 
check, and also updates the javascript.
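
As a rough illustration of the kind of guard described above (not the patch 
itself), here is a minimal sketch.  The map-based schema lookup, the class name 
and the "core_id" field name are all hypothetical stand-ins, not Solr's real 
IndexSchema API; only the variable name sfield comes from the comment above.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a schema lookup; not Solr's real IndexSchema API.
class FieldInfoSketch {
    // Maps field name -> field type name; a field seen in the index may be absent here.
    static final Map<String, String> SCHEMA_FIELDS = new HashMap<>();
    static {
        SCHEMA_FIELDS.put("id", "string");
        SCHEMA_FIELDS.put("name", "text");
    }

    // Builds the per-field info, adding schema details only when the lookup succeeds.
    static Map<String, Object> describe(String fieldName) {
        Map<String, Object> info = new LinkedHashMap<>();
        info.put("name", fieldName);
        String sfield = SCHEMA_FIELDS.get(fieldName); // null when the schema does not define the field
        if (sfield != null) {
            info.put("type", sfield);
        }
        return info;
    }

    public static void main(String[] args) {
        System.out.println(describe("name"));
        System.out.println(describe("core_id")); // hypothetical field that is not in the schema
    }
}
```

Without the null check, any code that dereferences the lookup result would fail 
on the field that the schema does not define.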

> LukeRequestHandler/Ajax-based schema explorer
> -
>
> Key: SOLR-494
> URL: https://issues.apache.org/jira/browse/SOLR-494
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
> Environment: N/A
>    Reporter: Greg Ludington
>Priority: Minor
> Fix For: 1.3
>
> Attachments: Field View.jpg, jsonschemabrowser.patch, 
> multicoreupdate.patch
>
>
> This patch submits a schema browsing tool based on making Ajax calls to 
> LukeRequestHandler.  It is in progress, but far enough along to generate 
> discussion and see if people find it useful/perhaps incorporate some 
> feedback.  It is similar to the XSLT-based schema browser in SOLR-75, in that 
> it provides cross-referenced exploring of the major schema components 
> (fields/field types/dynamic fields).  Since LukeRequestHandler provides more 
> information, this version can provide more information than could the XSLT 
> version, including statistics and more information about dynamic fields.  
> Also, since it hits LukeRequestHandler, it probably also has much different 
> performance than just transforming schema.xml.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Reopened: (SOLR-494) LukeRequestHandler/Ajax-based schema explorer

2008-03-20 Thread Greg Ludington (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Ludington reopened SOLR-494:
-


I finally had occasion to look at this in a multicore setting, and the extra 
core-identifying field caused the schema browser problems, and, more 
importantly, caused an Exception to be thrown in the LukeRequestHandler when 
trying to output schema information for that extra multicore field.  The 
upcoming patch adds the necessary check to LukeRequestHandler, and adjusts the 
javascript in schema.jsp.

> LukeRequestHandler/Ajax-based schema explorer
> -
>
> Key: SOLR-494
> URL: https://issues.apache.org/jira/browse/SOLR-494
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
> Environment: N/A
>    Reporter: Greg Ludington
>Priority: Minor
> Fix For: 1.3
>
> Attachments: Field View.jpg, jsonschemabrowser.patch
>
>
> This patch submits a schema browsing tool based on making Ajax calls to 
> LukeRequestHandler.  It is in progress, but far enough along to generate 
> discussion and see if people find it useful/perhaps incorporate some 
> feedback.  It is similar to the XSLT-based schema browser in SOLR-75, in that 
> it provides cross-referenced exploring of the major schema components 
> (fields/field types/dynamic fields).  Since LukeRequestHandler provides more 
> information, this version can provide more information than could the XSLT 
> version, including statistics and more information about dynamic fields.  
> Also, since it hits LukeRequestHandler, it probably also has much different 
> performance than just transforming schema.xml.




Re: [jira] Commented: (SOLR-494) LukeRequestHandler/Ajax-based schema explorer

2008-03-17 Thread Greg Ludington
It appears the problem is in the addition of a List of Maps to the
LukeRequestHandler output.  This results in XML like the following:


  

  synonyms.txt
  true
  true

org.apache.solr.analysis.SynonymFilterFactory
  
  ...


and that empty <lst> seems to be causing the problems, as solrj wants
a name attribute there.

When I change that List to a NamedList containing
two Map, the XML becomes:

  

  stopwords.txt
  true

org.apache.solr.analysis.StopFilterFactory
  
  ...
 

I can make this change easily enough, and the schema browser will
still work, and the tests will pass.  Is this a combination that solrj
should be able to handle, though, and should there be a separate issue
filed, or did I create a response format combination that should not
be allowed?
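
To make the difference concrete, here is a minimal sketch of why a list of maps 
ends up with unnamed children while a name-to-map container does not.  The 
writer below is a simplified, hypothetical stand-in and not Solr's actual 
response writer; the "filters", "className" and "StopFilterFactory" labels are 
assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical writer; it is not Solr's response writer and does no XML escaping.
class NamedEntriesSketch {
    // Writes one <lst>; the name attribute is left off when name is null.
    static String writeLst(String name, Map<String, String> values) {
        StringBuilder sb = new StringBuilder(name == null ? "<lst>" : "<lst name=\"" + name + "\">");
        for (Map.Entry<String, String> e : values.entrySet()) {
            sb.append("<str name=\"").append(e.getKey()).append("\">")
              .append(e.getValue()).append("</str>");
        }
        return sb.append("</lst>").toString();
    }

    // A plain list carries no names, so each child map is written unnamed.
    static String writeList(String name, List<Map<String, String>> items) {
        StringBuilder sb = new StringBuilder("<arr name=\"" + name + "\">");
        for (Map<String, String> item : items) {
            sb.append(writeLst(null, item));
        }
        return sb.append("</arr>").toString();
    }

    // A name-to-map container lets every child map be written with a name.
    static String writeNamed(String name, Map<String, Map<String, String>> items) {
        StringBuilder sb = new StringBuilder("<lst name=\"" + name + "\">");
        for (Map.Entry<String, Map<String, String>> e : items.entrySet()) {
            sb.append(writeLst(e.getKey(), e.getValue()));
        }
        return sb.append("</lst>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> stop = new LinkedHashMap<>();
        stop.put("className", "org.apache.solr.analysis.StopFilterFactory");

        List<Map<String, String>> asList = new ArrayList<>();
        asList.add(stop);
        System.out.println(writeList("filters", asList));

        Map<String, Map<String, String>> asNamed = new LinkedHashMap<>();
        asNamed.put("StopFilterFactory", stop);
        System.out.println(writeNamed("filters", asNamed));
    }
}
```

A client that expects a name attribute on every child can parse the second form 
but not the first, which matches the behaviour described above.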

On a related note, it also appears that the css from the initial patch
was not added, so the schema browser may be hard to read until it is.


[jira] Commented: (SOLR-494) LukeRequestHandler/Ajax-based schema explorer

2008-03-17 Thread Greg Ludington (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579614#action_12579614
 ] 

Greg Ludington commented on SOLR-494:
-

The LukeRequestHandler output (and IndexSchema) was changed to provide
a bit more information to cross reference fields, field types, and
dynamic fields.  These additions allow the user to browse through the
relationships between fields/types, hopefully to get a more complete
picture of the schema.

1) In the default no-argument LukeRequestHandler output, dynamic
fields are output with a reference to the dynamicField used to
generate them.  In the example schema.xml, the field
"incubationdate_dt" would contain this extra child in the XML
response:

*_dt

2) In the show=schema view, dynamicField definitions are also
output.  In the XML response, this would be:


  random
  I-S--
  
  


3) In that schema view, fields reference the fields they are copied
from, or copied to.  The text field in the example schema would look
like the following:


  
org.apache.solr.schema.SchemaField:cat{type=text_ws,properties=indexed,tokenized,stored,omitNorms,termVectors,multiValued}
  
...and so on for each field


4) In the show=schema view, FieldTypes also output the dynamicFields
that use their definitions.  In the example schema, sdouble has no
fields, and so does not show up.  After this patch, it shows up as
follows, because there is a dynamicField available of that type:


  
*_d
  
  false
  org.apache.solr.schema.SortableDoubleField
  
org.apache.solr.schema.FieldType$DefaultAnalyzer
  
  
org.apache.solr.schema.FieldType$DefaultAnalyzer
  


5) Again in the show=schema view, there is some additional information
about the analyzers.  Each Field is output with its
positionIncrementGap, and each FieldType is output with its tokenizers
and filters.  This FieldType snippet is long, but it appears the solrj
issue is here:


org.apache.solr.analysis.TokenizerChain


org.apache.solr.analysis.WhitespaceTokenizerFactory 





synonyms.txt
false
true

org.apache.solr.analysis.SynonymFilterFactory



stopwords.txt
true

org.apache.solr.analysis.StopFilterFactory



0
1
0
0
1


org.apache.solr.analysis.WordDelimiterFilterFactory 



org.apache.solr.analysis.LowerCaseFilterFactory



protwords.txt


org.apache.solr.analysis.EnglishPorterFilterFactory 




org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory






I had not been looking at the solrj effects yet, but the failure appears
to be in the way filters and analyzers are output in the show=schema view
(or how they are parsed in solrj).  I will try to make some time to
look at this tonight, but I would not be able to look at other client
implementations.
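
For the dynamic-field reference in item 1, here is a minimal sketch of how a 
concrete field name could be traced back to the dynamicField pattern that 
produced it.  It assumes patterns carry a single leading or trailing '*'; the 
class and method names are hypothetical and this is not the code in the patch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not the lookup used by IndexSchema or the patch.
class DynamicFieldSketch {
    // Returns true when the name matches a pattern with one leading or trailing '*'.
    static boolean matches(String pattern, String fieldName) {
        if (pattern.startsWith("*")) {
            return fieldName.endsWith(pattern.substring(1));
        }
        if (pattern.endsWith("*")) {
            return fieldName.startsWith(pattern.substring(0, pattern.length() - 1));
        }
        return pattern.equals(fieldName);
    }

    // Returns the first pattern that matches, or null when none does.
    static String dynamicBase(List<String> patterns, String fieldName) {
        for (String pattern : patterns) {
            if (matches(pattern, fieldName)) {
                return pattern;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> patterns = Arrays.asList("*_d", "*_dt");
        System.out.println(dynamicBase(patterns, "incubationdate_dt"));
    }
}
```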


> LukeRequestHandler/Ajax-based schema explorer
> -
>
> Key: SOLR-494
> URL: https://issues.apache.org/jira/browse/SOLR-494
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
> Environment: N/A
>Reporter: Greg Ludington
>Priority: Minor
> Fix For: 1.3
>
> Attachments: Field View.jpg, jsonschemabrowser.patch
>
>
> This patch submits a schema browsing tool based on making Ajax calls to 
> LukeRequestHandler.  It is in progress, but far enough along to generate 
> discussion and see if people find it useful/perhaps incorporate some 
> feedback.  It is similar to the XSLT-based schema browser in SOLR-75, in that 
> it provides cross-referenced exploring of the major schema components 
> (fields/field types/dynamic fields).  Since LukeRequestHandler provides more 
> information, this version can provide more information than could the XSLT 
> version, including statistics and more information about dynamic fields.  
> Also, since it hits LukeRequestHandler, it probably also has much different 
> performance than just transforming schema.xml.




[jira] Updated: (SOLR-494) LukeRequestHandler/Ajax-based schema explorer

2008-03-04 Thread Greg Ludington (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Ludington updated SOLR-494:


Attachment: Field View.jpg

Screen shot of the basic view of the "text" field from the example schema, as 
viewed in Firefox 2.  Shows links to field type, the fields it is copied from, 
index/query analyzers, as well as the histogram and the top terms form.

> LukeRequestHandler/Ajax-based schema explorer
> -
>
> Key: SOLR-494
> URL: https://issues.apache.org/jira/browse/SOLR-494
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
> Environment: N/A
>Reporter: Greg Ludington
>Priority: Minor
> Fix For: 1.3
>
> Attachments: Field View.jpg, jsonschemabrowser.patch
>
>
> This patch submits a schema browsing tool based on making Ajax calls to 
> LukeRequestHandler.  It is in progress, but far enough along to generate 
> discussion and see if people find it useful/perhaps incorporate some 
> feedback.  It is similar to the XSLT-based schema browser in SOLR-75, in that 
> it provides cross-referenced exploring of the major schema components 
> (fields/field types/dynamic fields).  Since LukeRequestHandler provides more 
> information, this version can provide more information than could the XSLT 
> version, including statistics and more information about dynamic fields.  
> Also, since it hits LukeRequestHandler, it probably also has much different 
> performance than just transforming schema.xml.




[jira] Updated: (SOLR-494) LukeRequestHandler/Ajax-based schema explorer

2008-03-02 Thread Greg Ludington (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Ludington updated SOLR-494:


Attachment: jsonschemabrowser.patch

This patch consists of 5 files:

1) Changes to IndexSchema to expose more information for cross referencing -- 
the source fields for a copyField, as well as the prototypes for each DynamicField

2) Changes to LukeRequestHandler to pass this additional information (copyField 
sources and destinations, as well as analyzer information, and dynamic field 
information.)

3) Changes to solr-admin.css for the new page (adding new styles, not changing 
any existing ones)

4) A javascript-heavy schema.jsp to retrieve this information and present it in 
a browsable form

5) The inclusion of jquery as a foundation for the javascript in schema.jsp

It is the last two parts that could be a concern for committers.  jquery is 
dual-licensed under the GPL and under the MIT license, which I believe is 
ASF-compatible, but I have not checked the contribution checkbox until I know 
for sure.  Similarly, schema.jsp itself is heavily dependent on javascript that 
the project may or may not wish to maintain as versions change.

The page is also not set up to degrade gracefully.  Normally, I would consider 
that a large faux pas, but I am creating this as an internal aid, where 
graceful degradation will not be an issue, so I have not had the itch to redo 
this server-side.  It may be an issue in the larger context of being included 
in Solr, as, while it provides a few more ways to look at the schema than the 
XML/XSL LukeRequestHandler, it will not work across as many clients.  As a 
result, I did not include any direct link to it from any of the stock admin 
jsps, so you would have to hit

(your path)/admin/schema.jsp

directly in order to try it out.  I have tried it in several different browsers 
against my own small (single core) indexes, but I would be interested in 
feedback on how well it works for large indexes or indexes with large numbers 
of field definitions.

> LukeRequestHandler/Ajax-based schema explorer
> -
>
> Key: SOLR-494
> URL: https://issues.apache.org/jira/browse/SOLR-494
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
> Environment: N/A
>    Reporter: Greg Ludington
>Priority: Minor
> Fix For: 1.3
>
> Attachments: jsonschemabrowser.patch
>
>
> This patch submits a schema browsing tool based on making Ajax calls to 
> LukeRequestHandler.  It is in progress, but far enough along to generate 
> discussion and see if people find it useful/perhaps incorporate some 
> feedback.  It is similar to the XSLT-based schema browser in SOLR-75, in that 
> it provides cross-referenced exploring of the major schema components 
> (fields/field types/dynamic fields).  Since LukeRequestHandler provides more 
> information, this version can provide more information than could the XSLT 
> version, including statistics and more information about dynamic fields.  
> Also, since it hits LukeRequestHandler, it probably also has much different 
> performance than just transforming schema.xml.




[jira] Created: (SOLR-494) LukeRequestHandler/Ajax-based schema explorer

2008-03-02 Thread Greg Ludington (JIRA)
LukeRequestHandler/Ajax-based schema explorer
-

 Key: SOLR-494
 URL: https://issues.apache.org/jira/browse/SOLR-494
 Project: Solr
  Issue Type: New Feature
  Components: web gui
 Environment: N/A
Reporter: Greg Ludington
Priority: Minor
 Fix For: 1.3


This patch submits a schema browsing tool based on making Ajax calls to 
LukeRequestHandler.  It is in progress, but far enough along to generate 
discussion and see if people find it useful/perhaps incorporate some feedback.  
It is similar to the XSLT-based schema browser in SOLR-75, in that it provides 
cross-referenced exploring of the major schema components (fields/field 
types/dynamic fields).  Since LukeRequestHandler provides more information, 
this version can provide more information than could the XSLT version, 
including statistics and more information about dynamic fields.  Also, since it 
hits LukeRequestHandler, it probably also has much different performance than 
just transforming schema.xml.




Re: [jira] Commented: (SOLR-258) Date based Facets

2007-07-30 Thread Greg Ludington
I started looking through this, and it looks very nice, though I do
see one slight nit to pick.  I may be reading this incorrectly, but
two parameters in rangeCount appear to be transposed.  In
SimpleFacets.java, the rangeCount method uses:

new ConstantScoreRangeQuery(field,low,high,iHigh,iLow)

but the Lucene javadocs suggest it is actually

new ConstantScoreRangeQuery(field,low,high,iLow,iHigh)

The iLow and iHigh parameters seem to be reversed.
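
To show why the order matters, here is a small self-contained sketch.  It does 
not use Lucene at all; inRange is a hypothetical helper that only mirrors the 
(lower flag, upper flag) ordering of the constructor quoted above.

```java
// Hypothetical range check; it only mirrors the flag ordering, it is not Lucene code.
class RangeFlagsSketch {
    // Flag order follows the javadoc signature: include-lower first, then include-upper.
    static boolean inRange(String value, String low, String high,
                           boolean includeLower, boolean includeUpper) {
        int cmpLow = value.compareTo(low);
        int cmpHigh = value.compareTo(high);
        boolean aboveLow = includeLower ? cmpLow >= 0 : cmpLow > 0;
        boolean belowHigh = includeUpper ? cmpHigh <= 0 : cmpHigh < 0;
        return aboveLow && belowHigh;
    }

    public static void main(String[] args) {
        boolean iLow = true;   // caller wants the lower bound included
        boolean iHigh = false; // caller wants the upper bound excluded

        // Intended order: "a" is inside [a, c), "c" is not.
        System.out.println(inRange("a", "a", "c", iLow, iHigh));
        System.out.println(inRange("c", "a", "c", iLow, iHigh));

        // Transposed flags: the range silently becomes (a, c].
        System.out.println(inRange("a", "a", "c", iHigh, iLow));
        System.out.println(inRange("c", "a", "c", iHigh, iLow));
    }
}
```

When both flags have the same value the transposition is invisible, which may 
be why it is easy to miss.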

Thanks,
Greg


[jira] Updated: (SOLR-181) Support for "Required" field Property

2007-03-03 Thread Greg Ludington (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Ludington updated SOLR-181:


Attachment: solr-181-required-fields.patch

This patch allows a per-field "required" property that can be used to require 
clients to include certain fields to submit to the index.

With this in the schema:
  <field name="name" ... required="true"/>

A document submitted without a name field will throw a SolrException in 
DocumentBuilder.getDoc() resulting in this sort of message being sent back to 
the client:

org.apache.solr.core.SolrException: missing required fields:
name

Issues with the patch as submitted:

1) The rule is enforced, and the Exception thrown, from 
DocumentBuilder.getDoc().  Since this area may change in SOLR-139, this patch 
may need to be adjusted depending on the final result of SOLR-139.
2) Fields with defaultValues are implicitly required, though currently this 
patch does *not* automatically make the uniqueKey field required.  It may make 
sense to do this; however, there seems to be some debate on the mailing lists 
about this, so it is commented out with a //TODO for now.  See SOLR-172.
3) The RequiredFieldsTest case uses its own schema file, as otherwise all other 
tests would have to be retrofitted to add these required fields in their 
submissions, and all future test writers would have to keep this in mind, as 
well.
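
For readers who want the gist without opening the patch, here is a minimal 
sketch of the check.  It is not the code in DocumentBuilder.getDoc(); the class 
and method names are hypothetical, and IllegalStateException stands in for 
SolrException so the example runs without Solr on the classpath.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; IllegalStateException stands in for SolrException.
class RequiredFieldsSketch {
    // Throws when any required field has no value in the submitted document.
    static void checkRequired(Set<String> requiredFields, Map<String, String> doc) {
        List<String> missing = new ArrayList<>();
        for (String name : requiredFields) {
            if (!doc.containsKey(name)) {
                missing.add(name);
            }
        }
        if (!missing.isEmpty()) {
            throw new IllegalStateException("missing required fields: " + String.join(" ", missing));
        }
    }

    public static void main(String[] args) {
        Set<String> required = new LinkedHashSet<>();
        required.add("name");

        Map<String, String> doc = new HashMap<>();
        doc.put("id", "1"); // no "name" value supplied

        try {
            checkRequired(required, doc);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```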

> Support for "Required" field Property
> -
>
> Key: SOLR-181
> URL: https://issues.apache.org/jira/browse/SOLR-181
> Project: Solr
>  Issue Type: Improvement
>  Components: update
>Reporter: Greg Ludington
>Priority: Minor
> Attachments: solr-181-required-fields.patch
>
>
> In certain situations, it can be helpful to require that every document in your 
> index has a value for a given field.  While ideally the indexing client(s) 
> should be responsible enough to add all necessary fields, this patch allows 
> it to be enforced in the Solr schema, by adding a required property to a 
> field entry.  For example, with this in the schema:
> <field name="name" ... required="true"/>
> A request to index a document without a name field will result in this 
> response:
> org.apache.solr.core.SolrException: missing required 
> fields: name 
> (and then, of course, the stack trace)
> 
> The meat of this patch is that DocumentBuilder.getDoc() throws a 
> SolrException if not all required fields have values; this may not work well 
> as is with SOLR-139, Support updateable/modifiable documents, and may have to 
> be changed depending on that issue's final disposition.




[jira] Created: (SOLR-181) Support for "Required" field Property

2007-03-03 Thread Greg Ludington (JIRA)
Support for "Required" field Property
-

 Key: SOLR-181
 URL: https://issues.apache.org/jira/browse/SOLR-181
 Project: Solr
  Issue Type: Improvement
  Components: update
Reporter: Greg Ludington
Priority: Minor


In certain situations, it can be helpful to require that every document in your 
index has a value for a given field.  While ideally the indexing client(s) 
should be responsible enough to add all necessary fields, this patch allows it 
to be enforced in the Solr schema, by adding a required property to a field 
entry.  For example, with this in the schema:

   <field name="name" ... required="true"/>

A request to index a document without a name field will result in this response:

org.apache.solr.core.SolrException: missing required fields: 
name 
(and then, of course, the stack trace)


The meat of this patch is that DocumentBuilder.getDoc() throws a SolrException 
if not all required fields have values; this may not work well as is with 
SOLR-139, Support updateable/modifiable documents, and may have to be changed 
depending on that issue's final disposition.




Re: [jira] Commented: (SOLR-75) XSLT-based "Schema Browser" in admin view

2006-12-11 Thread Greg Ludington

I attached an updated sample to the ticket that has some inheritance
support.  Rather than a patch, it is a zip file that you should be
able to unzip and double-click the schema.xml (in IE or Firefox, at
least) to view the transformed result.  It does show inheritance or
overriding of attributes -- although only the "weight" field has both
-- but it does not expect analyzers or any child node to be overriden,
only attributes.  Let me know if this assumption is correct, as well
as any other thoughts on the sample.

Thanks,
Greg

On 12/11/06, Greg Ludington <[EMAIL PROTECTED]> wrote:

> : What about creating an xml report (using the live index searcher)
> : and transforming that with XSLT to add look&feel?
>
> yeah ... i think you've really got something totally usable as is right
> now, so don't feel like you have to start over.  when i first typed up

I do not have any real preference about starting over -- it would
probably take the same amount of time to figure out the ugly parts of
the XSL as it would to just do it all as straight JSP.  As you both
have suggested, I can generate intermediate XML from the IndexSchema
to avoid doing the really ugly things in XSLT (which, of course, would
also take some time).  However, I was thinking more in terms of
trade-offs on future work:

Pro-XSLT: Using the Config utility methods, very easy for somebody to
swap in their own version in $SOLR/conf/ in order to get their own
look and feel.

Pro-JSP: Easier for maintenance, and more approachable for new
contributors, as JSP tends to be less impenetrable than XSLT.

I should be able to finish it either way; however, I have an emerging
crisis in my day job, so I may not be able to respond for a few days.

Thanks,
Greg



[jira] Updated: (SOLR-75) XSLT-based "Schema Browser" in admin view

2006-12-11 Thread Greg Ludington (JIRA)
 [ http://issues.apache.org/jira/browse/SOLR-75?page=all ]

Greg Ludington updated SOLR-75:
---

Attachment: schemav2sample.zip

An updated sample, including feedback about fieldtype->field inheritance.  It 
presumes that:

a) a field can inherit any attribute from a field type
b) a field cannot override analyzers (or any other child node), only attributes.
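
The two assumptions above amount to a simple attribute merge, sketched here.  
The class name and the sample attribute values are hypothetical; only the 
attribute names indexed and stored come from the schema discussion.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the inheritance rule assumed by the sample.
class FieldInheritanceSketch {
    // Starts from the fieldtype's attributes, then lets the field's own attributes win.
    static Map<String, String> effectiveAttributes(Map<String, String> typeAttrs,
                                                   Map<String, String> fieldAttrs) {
        Map<String, String> merged = new LinkedHashMap<>(typeAttrs);
        merged.putAll(fieldAttrs);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> typeAttrs = new LinkedHashMap<>();
        typeAttrs.put("indexed", "true");
        typeAttrs.put("stored", "true");

        Map<String, String> fieldAttrs = new LinkedHashMap<>();
        fieldAttrs.put("stored", "false"); // overrides the fieldtype's value

        System.out.println(effectiveAttributes(typeAttrs, fieldAttrs));
    }
}
```

Child nodes such as analyzers are deliberately left out of the merge, in line 
with assumption b).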

> XSLT-based "Schema Browser" in admin view
> -
>
> Key: SOLR-75
> URL: http://issues.apache.org/jira/browse/SOLR-75
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
> Environment: any
>Reporter: Greg Ludington
>Priority: Minor
> Attachments: closed.gif, open.gif, schemav2sample.zip, solr75v1.diff
>
>
> The files in this upcoming patch create a simple "schema browser" for the 
> admin tool.  It serves schema.xml along with a stylesheet that in compliant 
> browsers creates a page with a tree widget to show the fieldtypes and fields, 
> as well as their uses and cross references.  This is similar to the 
> schemaxsl.zip originally attached to SOLR-58, but a few features have been 
> added, and the look and feel has been changed to fit in better with the rest 
> of the admin tool.
> Note that it does *not* work against the live IndexSchema -- it merely 
> transforms schema.xml.  There is probably not a significant difference now, 
> but it is worth raising the issue in case there are future administration 
> capabilities in mind (i.e. on 
> http://wiki.apache.org/solr/MakeSolrMoreSelfService ) that might require a 
> schema browser to be talking to the live values.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: [jira] Commented: (SOLR-75) XSLT-based "Schema Browser" in admin view

2006-12-11 Thread Greg Ludington

: What about creating an xml report (using the live index searcher)
: and transforming that with XSLT to add look&feel?

yeah ... i think you've really got something totally usable as is right
now, so don't feel like you have to start over.  when i first typed up


I do not have any real preference about starting over -- it would
probably take the same amount of time to figure out the ugly parts of
the XSL as it would to just do it all as straight JSP.  As you both
have suggested, I can generate intermediate XML from the IndexSchema
to avoid doing the really ugly things in XSLT (which, of course, would
also take some time).  However, I was thinking more in terms of
trade-offs on future work:

Pro-XSLT: Using the Config utility methods, very easy for somebody to
swap in their own version in $SOLR/conf/ in order to get their own
look and feel.

Pro-JSP: Easier for maintenance, and more approachable for new
contributors, as JSP tends to be less impenetrable than XSLT.

I should be able to finish it either way; however, I have an emerging
crisis in my day job, so I may not be able to respond for a few days.

Thanks,
Greg


Re: [jira] Commented: (SOLR-75) XSLT-based "Schema Browser" in admin view

2006-12-11 Thread Greg Ludington

Yes, it makes more sense, but I might go back to the drawing board for
a bit -- this sort of inheritance gets ugly in XSL.  Assuming there
are the appropriate public methods, it might be better to work
directly against the live IndexSchema rather than by XSLT transforming
the schema.xml file. The drawback, of course, is that you cannot swap
in your own XSLT to make your own look and feel, which seems to be the
direction many people want to go with the admin tools.  Any opinions
as to how important that ability is?

-Greg

On 12/8/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:


: What do you mean by inheritng/overriding?  Granted, I am only working
: with the example schema here, but I do not see any similarities
: between the attributes of a field, and the fieldtype.  For fieldtypes,

Doh! ... you're right ... there aren't any examples of what i'm talking
about in the sample schema.xml.

Basically, any "core" boolean attribute that can be set on a <field> can
be set on a <fieldtype>; by default a field inherits all of its
attributes from the <fieldtype> it uses. This is touched on briefly in the
SchemaXml wiki page...

Individual fields can override the various options (indexed, stored,
etc...) that they inherit from their fieldtype.

(it just so happens that the CNET schema we used as a template for the
original example schema.xml didn't exercise this feature)


looking at the code, the only core attribute of a <fieldtype> that can't
be overridden by a <field> is "positionIncrementGap" (and i think that has
more to do with it being a numeric attribute than anything else). If you
define a new custom FieldType with custom attributes, those wouldn't be
overridable by the <field> either.

I'll make a note on the TaskList that we should both document this a
little better (and add some examples to the schema.xml)


  ...does what i was suggesting earlier make more sense now?


-Hoss




Re: [jira] Commented: (SOLR-75) XSLT-based "Schema Browser" in admin view

2006-12-08 Thread Greg Ludington

Taking this off the list for a moment, because I may be a bit obtuse here:


(And, apparently so obtuse I neglected to change the to: field.  Such
things happen late after a launch night.)


Re: [jira] Commented: (SOLR-75) XSLT-based "Schema Browser" in admin view

2006-12-08 Thread Greg Ludington

i think it would definitely be helpful ... showing the inherited
attributes "inline" is really what would make using the schema
browser worthwhile (as opposed to just reading the XML directly) ... it
saves the confusion of looking at the field, then clicking over to the
fieldtype to see what it inherits, then noticing something set on the
fieldtype and trying to remember if the field overrides it so you go back
...etc.


Taking this off the list for a moment, because I may be a bit obtuse here:

What do you mean by inheriting/overriding?  Granted, I am only working
with the example schema here, but I do not see any similarities
between the attributes of a field, and the fieldtype.  For fieldtypes,
I see attributes like name, class, and sortMissingLast (as well as
child analyzers), whereas for fields, I see name, type, indexed,
stored, and multiValued.  I do not see any overlap in attribute names
that suggest some manner of per-field override of fieldtype
definitions.  Am I missing something crucial here, or when you speak
of overriding, do you just mean you want to see the separate fieldtype
attributes on the same screen as each field of that type, for
convenience sake?

Thanks,
Greg


Re: [jira] Commented: (SOLR-75) XSLT-based "Schema Browser" in admin view

2006-12-07 Thread Greg Ludington

yeah .. i wasn't sure if this version was identical or not, it sounds like
you added some stuff, but the key thing i was referring to was that
when showing a "field" we should display both the direct attributes as
well as any attributes it inherits from its "fieldtype"


Currently, there is a "usage" table on the field page that contains a
link from the field to the fieldtype (the usage table on the fieldtype
page links back to all implementing fields) -- does that satisfy the
need, or is it important to display the fieldtype data embedded within
the field screen?

-Greg


[jira] Commented: (SOLR-75) XSLT-based "Schema Browser" in admin view

2006-12-06 Thread Greg Ludington (JIRA)
[ 
http://issues.apache.org/jira/browse/SOLR-75?page=comments#action_12456285 ] 

Greg Ludington commented on SOLR-75:


(Sent in email earlier, but adding it to the JIRA issue proper)

I do not know if you have seen the update, as opposed to the one
originally attached to an earlier JIRA issue, but this one should
include every attribute in a field or fieldtype -- the "attributes"
table should contain every attribute of the node.  Also, I included
(via cut-and-paste) the basic analysis form, so that it shows for each
field (and submits to analysis.jsp) as well.  If these do not fit what
you need, and you do not have time to take a further look, I would be
happy to take suggestions for tweaks.

I thought about doing the transformation server-side as well, but I
stuck client-side because other admin pages rely on client-side
transformation.  I can rework it as a server-side transformation, if
that is preferable.  The only downsides to a server-side approach would
be the extra (likely insignificant) burden on the server, and the size
of the page -- the transformed HTML will be an order of magnitude
larger than the XML.
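
For comparison, a server-side transformation could look roughly like the sketch 
below, using the JDK's javax.xml.transform API.  The inline schema fragment and 
stylesheet are made-up stand-ins for schema.xml and the real XSL, and the class 
name is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch of a server-side XSLT step; inputs are inline stand-ins.
class ServerSideTransformSketch {
    // Applies the stylesheet to the XML on the server and returns the resulting markup.
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<schema><fields><field name=\"id\"/><field name=\"name\"/></fields></schema>";
        String xslt =
              "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"html\"/>"
            + "<xsl:template match=\"/\"><ul><xsl:for-each select=\"schema/fields/field\">"
            + "<li><xsl:value-of select=\"@name\"/></li></xsl:for-each></ul></xsl:template>"
            + "</xsl:stylesheet>";
        String html = transform(xml, xslt);
        // The client now receives the expanded HTML rather than the compact XML.
        System.out.println(html);
        System.out.println("xml chars: " + xml.length() + ", html chars: " + html.length());
    }
}
```

With a stylesheet as rich as the schema browser's, the generated HTML would 
carry the tree markup for every field, which is where the page-size concern 
comes from.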

As for the licensing, I did modify the code from an article, but it is
still largely intact.  I could easily write javascript that is
entirely free of the original article code, and/or contact the
original author for explicit permission.  As for the icons -- I am not
much of a graphic artist.  I could also rewrite the tree to use
characters instead, unless somebody can locate license free icons, or
perhaps the people redoing SOLR-76 could also create new icons of that
size?  (The XSL in this issue shares the base admin.css, so we may
have to rework the XSL to take SOLR-76 into account.)  If it is
permissible, I think it would be better to use the original code and
credit the author, both to give the original author deserved credit
for his idea and to minimize duplication of work on our parts :)

> XSLT-based "Schema Browser" in admin view
> -
>
> Key: SOLR-75
> URL: http://issues.apache.org/jira/browse/SOLR-75
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
>     Environment: any
>Reporter: Greg Ludington
>Priority: Minor
> Attachments: closed.gif, open.gif, solr75v1.diff
>
>
> The files in this upcoming patch create a simple "schema browser" for the 
> admin tool.  It serves schema.xml along with a stylesheet that in compliant 
> browsers creates a page with a tree widget to show the fieldtypes and fields, 
> as well as their uses and cross references.  This is similar to the 
> schemaxsl.zip originally attached to SOLR-58, but a few features have been 
> added, and the look and feel has been changed to fit in better with the rest 
> of the admin tool.
> Note that it does *not* work against the live IndexSchema -- it merely 
> transforms schema.xml.  There is probably not a significant difference now, 
> but it is worth raising the issue in case there are future administration 
> capabilities in mind (i.e. on 
> http://wiki.apache.org/solr/MakeSolrMoreSelfService ) that might require a 
> schema browser to be talking to the live values.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: [jira] Commented: (SOLR-75) XSLT-based "Schema Browser" in admin view

2006-12-06 Thread Greg Ludington

I do not know if you have seen the update, as opposed to the one
originally attached to an earlier JIRA issue, but this one should
include every attribute in a field or fieldtype -- the "attributes"
table should contain every attribute of the node.  Also, I included
(via cut-and-paste) the basic analysis form, so that it shows for each
field (and submits to analysis.jsp) as well.  If these do not fit what
you need, and you do not have time to take a further look, I would be
happy to take suggestions for tweaks.

I thought about doing the transformation server-side as well, but I
stuck client-side because other admin pages rely on client-side
transformation.  I can rework it as a server-side transformation, if
that is preferable.  The only downsides to a server-side approach would
be the extra (likely insignificant) burden on the server, and the size
of the page -- the transformed HTML will be an order of magnitude
larger than the XML.
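
For reference, a minimal sketch of what the server-side variant could
look like, using only the standard javax.xml.transform API.  This is
not code from the patch: the class name, the inline sample schema and
the tiny stylesheet are all made up for illustration, and a real
version would read schema.xml and schema.xsl from the webapp instead.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of Solr or of the attached patch.
final class ServerSideSchemaTransform {

    /** Applies the stylesheet to the XML on the server and returns the output as a String. */
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Surface stylesheet or input problems as an unchecked exception.
            throw new IllegalStateException("XSL transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input; stands in for schema.xml.
        String xml = "<schema><fields><field name=\"id\" type=\"string\"/></fields></schema>";
        // Made-up stylesheet; stands in for schema.xsl.
        String xsl =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"html\"/>"
          + "<xsl:template match=\"/\"><ul><xsl:for-each select=\"schema/fields/field\">"
          + "<li><xsl:value-of select=\"@name\"/>: <xsl:value-of select=\"@type\"/></li>"
          + "</xsl:for-each></ul></xsl:template></xsl:stylesheet>";
        System.out.println(transform(xml, xsl));
    }
}
```

With this shape the browser only ever receives HTML, which is where
the page-size cost mentioned above comes from.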

As for the licensing, I did modify the code from an article, but it is
still largely intact.  I could easily write javascript that is
entirely free of the original article code, and/or contact the
original author for explicit permission.  As for the icons -- I am not
much of a graphic artist.  I could also rewrite the tree to use
characters instead, unless somebody can locate license-free icons, or
perhaps the people redoing SOLR-76 could also create new icons of that
size?  (The XSL in this issue shares the base admin.css, so we may
have to rework the XSL to take SOLR-76 into account.)  If it is
permissible, I think it would be better to use the original code and
credit the author, both to give the original author deserved credit
for his idea and to minimize duplication of work on our parts :)

-Greg

On 12/6/06, Hoss Man (JIRA) <[EMAIL PROTECTED]> wrote:

[ 
http://issues.apache.org/jira/browse/SOLR-75?page=comments#action_12456253 ]

Hoss Man commented on SOLR-75:
--

I really haven't had the time to play with this that i hoped i would (i was 
really hoping to try and tweak it to add some logic to pull all of the 
fieldtype attributes into the field, and add some links from this tool out to 
the analysis page as well) but I just wanted to go on record that i think it's 
really cool.

Greg: if you are interested, one way to avoid the issues with get-files and the 
stylesheet header would be to write a new JSP and/or servlet just for powering 
the schema explorer that applies the transformation on the server side -- it 
should be fairly easy with the XSL Transform utility methods Bertrand added to 
support the XSLTResponseWriter. ... then we don't have to require the files 
have the correct stylesheet declaration, or inject the one we want, or rely on 
the browser to apply it properly.

As for the license issues ... i don't think we can use those images *or* any 
javascript you cut/paste from the article ... but i could be wrong.  if there 
are similar methods you can find that have an Apache compatible license, we 
should definitely be able to use those.

> XSLT-based "Schema Browser" in admin view
> -
>
> Key: SOLR-75
> URL: http://issues.apache.org/jira/browse/SOLR-75
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
> Environment: any
>Reporter: Greg Ludington
>Priority: Minor
> Attachments: closed.gif, open.gif, solr75v1.diff
>
>
> The files in this upcoming patch create a simple "schema browser" for the 
> admin tool.  It serves schema.xml along with a stylesheet that in compliant 
> browsers creates a page with a tree widget to show the fieldtypes and fields, 
> as well as their uses and cross references.  This is similar to the 
> schemaxsl.zip originally attached to SOLR-58, but a few features have been 
> added, and the look and feel has been changed to fit in better with the rest 
> of the admin tool.
> Note that it does *not* work against the live IndexSchema -- it merely 
> transforms schema.xml.  There is probably not a significant difference now, 
> but it is worth raising the issue in case there are future administration 
> capabilities in mind (i.e. on 
> http://wiki.apache.org/solr/MakeSolrMoreSelfService ) that might require a 
> schema browser to be talking to the live values.






[jira] Updated: (SOLR-75) XSLT-based "Schema Browser" in admin view

2006-11-28 Thread Greg Ludington (JIRA)
 [ http://issues.apache.org/jira/browse/SOLR-75?page=all ]

Greg Ludington updated SOLR-75:
---

Attachment: solr75v1.diff
closed.gif
open.gif

XSLT and jsp changes to allow schema.xml to be explored via a DHTML tree 
widget, with some cross referencing of fields and fieldtypes according to their 
usages.

- Changes get-file.jsp so that it serves a content type of text/xml for files 
ending in .xml

- Dynamically inserts an xsl declaration in the requested xml (e.g. schema.xml) 
if an xsl request parameter is present.  (It unfortunately stuffs more code in 
get-file.jsp, but the alternative would be to force users to modify their 
schema.xml to add this declaration, which would be worse for several reasons.)

- Added a schema.xsl to perform the actual transformation, as well as icons to 
represent the open and closed states of the folder.
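
To illustrate the second point, here is a rough sketch of inserting
the stylesheet declaration into the served XML.  It is not the code
from solr75v1.diff; the class and method names are hypothetical, and
it assumes the href is a plain file name that needs no escaping.

```java
// Hypothetical helper, shown only to illustrate the idea; not taken from the patch.
final class StylesheetInjector {

    /**
     * Returns the XML text with an xml-stylesheet processing instruction placed
     * right after the XML declaration, or at the very start if there is none.
     */
    static String injectStylesheet(String xml, String xslHref) {
        String pi = "<?xml-stylesheet type=\"text/xsl\" href=\"" + xslHref + "\"?>\n";
        if (xml.startsWith("<?xml ")) {
            int end = xml.indexOf("?>");
            if (end >= 0) {
                int insertAt = end + 2;
                // Keep the declaration's own trailing newline attached to it.
                if (insertAt < xml.length() && xml.charAt(insertAt) == '\n') {
                    insertAt++;
                }
                return xml.substring(0, insertAt) + pi + xml.substring(insertAt);
            }
        }
        return pi + xml;
    }

    public static void main(String[] args) {
        String served = injectStylesheet("<?xml version=\"1.0\"?>\n<schema/>", "schema.xsl");
        System.out.println(served);
    }
}
```

The declaration has to stay first in the document, which is why the
instruction goes after it rather than at position zero.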

I have not checked the "grant license" option primarily because of those icons. 
 As credited in the XSL, I took the icons as well as adapted the tree 
javascript from this old DevX article:

http://www.devx.com/getHelpOn/Article/11874

There is no license at all on the article or on the code from that article, but 
I do not know ASF's policy on the use of such assets, as I cannot claim to have 
created them.  If these are unsuitable, I have written many versions of these 
sorts of scripts, and I could rewrite that part easily enough, but I am not a 
graphic artist.

Once those issues are cleared, or suitable replacements are found, I would be 
happy to resubmit and grant license to the patch.  In the meantime, I have 
submitted the code so people can take a look at it, and, if you find it useful, 
hopefully to test the XSLT and CSS on more browsers than I have at hand.

> XSLT-based "Schema Browser" in admin view
> -
>
> Key: SOLR-75
> URL: http://issues.apache.org/jira/browse/SOLR-75
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
>     Environment: any
>Reporter: Greg Ludington
>Priority: Minor
> Attachments: closed.gif, open.gif, solr75v1.diff
>
>
> The files in this upcoming patch create a simple "schema browser" for the 
> admin tool.  It serves schema.xml along with a stylesheet that in compliant 
> browsers creates a page with a tree widget to show the fieldtypes and fields, 
> as well as their uses and cross references.  This is similar to the 
> schemaxsl.zip originally attached to SOLR-58, but a few features have been 
> added, and the look and feel has been changed to fit in better with the rest 
> of the admin tool.
> Note that it does *not* work against the live IndexSchema -- it merely 
> transforms schema.xml.  There is probably not a significant difference now, 
> but it is worth raising the issue in case there are future administration 
> capabilities in mind (i.e. on 
> http://wiki.apache.org/solr/MakeSolrMoreSelfService ) that might require a 
> schema browser to be talking to the live values.





[jira] Created: (SOLR-75) XSLT-based "Schema Browser" in admin view

2006-11-28 Thread Greg Ludington (JIRA)
XSLT-based "Schema Browser" in admin view
-

 Key: SOLR-75
 URL: http://issues.apache.org/jira/browse/SOLR-75
 Project: Solr
  Issue Type: New Feature
  Components: web gui
 Environment: any
Reporter: Greg Ludington
Priority: Minor


The files in this upcoming patch create a simple "schema browser" for the admin 
tool.  It serves schema.xml along with a stylesheet that in compliant browsers 
creates a page with a tree widget to show the fieldtypes and fields, as well as 
their uses and cross references.  This is similar to the schemaxsl.zip 
originally attached to SOLR-58, but a few features have been added, and the 
look and feel has been changed to fit in better with the rest of the admin tool.

Note that it does *not* work against the live IndexSchema -- it merely 
transforms schema.xml.  There is probably not a significant difference now, but 
it is worth raising the issue in case there are future administration 
capabilities in mind (i.e. on 
http://wiki.apache.org/solr/MakeSolrMoreSelfService ) that might require a 
schema browser to be talking to the live values.





[jira] Updated: (SOLR-58) Change Admin components to return XML like the rest of the system

2006-11-09 Thread Greg Ludington (JIRA)
 [ http://issues.apache.org/jira/browse/SOLR-58?page=all ]

Greg Ludington updated SOLR-58:
---

Attachment: schemaxsl.zip

This may be slightly OT for this issue, but since the ticket discusses XSLT in 
the browser, and the Self Service wiki page, I thought to put up for comments a 
quick and dirty XSLT-based schema browser I made.  It transforms schema.xml 
into a single page tree view, so you can inspect fields and types, with some 
cross referencing between fieldtypes and fields, and field copyField 
source/dests.  I have only tried it against Firefox 2 and IE7, but it should 
work in all browsers with an XSLT engine, or, failing that, it could be done 
server-side, and the resulting HTML sent to the browser.

Unzipping and opening schema.xml in a browser is enough to see it, but it will 
be missing some styles and images referenced in the admin webapp.  To use it in 
a solr webapp against your own schema, you have to:

a) place schema.xml and the gifs in the /admin/ directory
b) add the xsl directive to your schema.xml


c) modify get-file.jsp to set a content type of text/xml, and not to emit any 
whitespace

I have only tried it with my prototype small-scale schemas, but, if it works 
for people with larger schemas, and fits in with the "Self Service" aims, I 
could take suggestions, clean up the xsl, and submit the a-b-c changes above as 
a patch.

> Change Admin components to return XML like the rest of the system
> -
>
> Key: SOLR-58
> URL: http://issues.apache.org/jira/browse/SOLR-58
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
>Reporter: Otis Gospodnetic
> Assigned To: Otis Gospodnetic
>Priority: Minor
> Attachments: analysis-xml-out.txt, analysis-xml.jsp, logging-xml.jsp, 
> ping-xml-out.txt, ping-xml.jsp, schemaxsl.zip, threaddump-xml-out.txt, 
> threaddump-xml.jsp
>
>
> I need to expose the admin functionality to an external application.  I think 
> returning admin data as XML may be a good and simple first step towards that.
> To do that I think I'll mostly need to modify JSPs (but I haven't had a good 
> look at Admin GUI yet).  From what I saw a few weeks ago when I briefly 
> looked at this, no Java code will need to be modified.  If you have concrete 
> ideas about how this should be done, please comment before I start next week 
> (week of October 23rd 2006).





Re: [jira] Commented: (SOLR-44) Basic Facet Count support

2006-08-31 Thread Greg Ludington

I definitely think we want to support stuff like this out of the box in
the long run, i think it just needs to be based on specifying the Facet
info in a more robust way (ie: XML configuration)



Not to threadjack, but this is actually the path I went down during my
faceting prototype, pushing all the facet configuration into a
facets.xml file, which is parsed at startup and the definitions stored
in a SolrCache.  Here is a snippet:



manu_exact
By Manufacturer
25



Instock

In Stock
inStock:true


Out of Stock
inStock:false




The client then asks for facets by name, e.g.
facet=manu&facet=instock, and gets back output that includes the name,
the count, and the queryString.  The client application cannot,
however, just ask for an arbitrary facetQuery, and I really like that
ability in your patch -- it probably hits 98/2 instead of just 80/20.
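
A rough Java sketch of the "facets by name" idea, for anyone who
wants something concrete to poke at.  Everything here is hypothetical
(class names, fields, and the way the sample values are mapped onto
them); it only shows named definitions being registered once and then
looked up by the names the client sends.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical registry of named facet definitions; not the prototype's actual code.
final class NamedFacets {

    /** One named facet: a field facet (field + limit) or a set of labelled queries. */
    static final class Facet {
        final String name;
        final String displayName;
        final String field;   // null for query-based facets
        final int limit;      // only meaningful for field facets
        final Map<String, String> queries = new LinkedHashMap<>(); // label -> query string

        Facet(String name, String displayName, String field, int limit) {
            this.name = name;
            this.displayName = displayName;
            this.field = field;
            this.limit = limit;
        }

        /** Adds one labelled query to this facet and returns the facet for chaining. */
        Facet query(String label, String queryString) {
            queries.put(label, queryString);
            return this;
        }
    }

    private final Map<String, Facet> byName = new LinkedHashMap<>();

    void register(Facet facet) {
        byName.put(facet.name, facet);
    }

    /** Resolves requested facet names; unknown names are skipped. */
    List<Facet> resolve(String... requestedNames) {
        List<Facet> found = new ArrayList<>();
        for (String requested : requestedNames) {
            Facet facet = byName.get(requested);
            if (facet != null) {
                found.add(facet);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        NamedFacets registry = new NamedFacets();
        registry.register(new Facet("manu", "By Manufacturer", "manu_exact", 25));
        registry.register(new Facet("instock", "Instock", null, 0)
                .query("In Stock", "inStock:true")
                .query("Out of Stock", "inStock:false"));
        for (Facet facet : registry.resolve("manu", "instock")) {
            System.out.println(facet.name + " -> "
                    + (facet.field != null ? facet.field : facet.queries.toString()));
        }
    }
}
```

Because only registered names resolve, a client cannot sneak in an
arbitrary query this way, which is exactly the limitation noted above.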

-Greg


Re: [jira] Commented: (SOLR-44) Basic Facet Count support

2006-08-30 Thread Greg Ludington

init/query params -- but as i type this, it occurs to me that one approach
would be to allow for "per field" usages of the "facet.query" param
to specify queries that would use a SolrQueryParser with the default
field set to the specified field, so that you could do things like...

facet.query=foo:bar
f.price.facet.query=[*+TO+100]
f.price.facet.query=[101+TO+*]
facet.field=cat

...and all of the "f.price.facet.query" counts would be grouped together
separate from the count for "facet.query=foo:bar"

...the hitch here is this isn't really a "per field override" of the
facet.query param ... so the API might confuse some people.  We would also
either need to change SolrParams so that it's possible to get a list of
all set param names matching a pattern, or we'd need another param name
listing which fields we should expect to find f.*.facet.query params for.

Did you have any thoughts on what a grouping API could look like for the
query based facets ?


I was hoping you would not call me out on this :) -- I think we have
been thinking along similar lines, but just about every alternative I
have tried leaves something to be desired, in that you either end up
with a lot of extra String parsing or a very awkward/brittle url
format.  One possibility might be to add a sort of namespace to the
params themselves:

facet.query.byprice=price:[*+TO+100]
facet.query.byprice=price:[101+TO+*]

This is similar to the f. approach, though the format would
be different enough to avoid confusion.  The downside, as you have
suggested, is that you have to add some manner of
getParamStartsWith(...) method, and that might not be too efficient.
The only way I could see to avoid that getParamStartsWith(..) method
would be to pass in separately a list of the groupings you want:

facet.query.groups=byprice
facet.query.byprice=price:[*+TO+100]
facet.query.byprice=price:[101+TO+*]

and then you look for those grouping fields with a getFieldParams(..)
method that could use "facet.query." as its prefix instead of "f." --
but at that point the request URL is getting very complicated.
Alternatively, you could have a simpler URL by putting it all on the
value side, as in:

facet.query=byprice|price:[*+TO+100]
facet.query=byprice|price:[101+TO+*]

and use splitList (like the highlighter) or some similar mechanism to
separate the group and query portions.  The obvious downside here is
making sure not to split incorrectly, and it limits adding additional
attributes later.
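
A small sketch of the value-side variant, assuming the parameter
values have already been URL-decoded.  The class and method names are
made up, and this does not use the highlighter's splitList; it just
splits on the first '|' so the query part is left alone.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser for "group|query" parameter values.
final class GroupedFacetQueries {

    /**
     * Groups values of the form "group|query" by their group name.
     * Values without a '|' land in the unnamed group "".
     */
    static Map<String, List<String>> groupValues(String[] values) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String value : values) {
            int bar = value.indexOf('|');   // split on the FIRST bar only
            String group = bar < 0 ? "" : value.substring(0, bar);
            String query = bar < 0 ? value : value.substring(bar + 1);
            groups.computeIfAbsent(group, k -> new ArrayList<>()).add(query);
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] values = {
            "byprice|price:[* TO 100]",
            "byprice|price:[101 TO *]",
            "foo:bar"
        };
        System.out.println(groupValues(values));
    }
}
```

Note the weakness mentioned above is visible here: an ungrouped query
that happens to contain a '|' would be split in the wrong place.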

Getting back to what you said about the 80/20 rule, you certainly have
hit that sweet spot.  It may be that in just about every use case (or
at least 80% of them :) ) the client can, at worst, extract the field
name from the  name attribute, and use that for grouping.  While
explicit grouping control would be nice, it may be overcomplicating
things, unless somebody has another approach, or ideas that overcome or
minimize the drawbacks above.
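
For comparison, a sketch of the namespaced-parameter option with an
explicit "facet.query.groups" list, which avoids any starts-with scan
over the parameter names.  It works on a plain Map of decoded request
parameters rather than on SolrParams, the class name is hypothetical,
and it assumes multiple groups would be sent as a repeated parameter.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reader for facet.query.groups / facet.query.<group> parameters.
final class NamespacedFacetParams {
    static final String PREFIX = "facet.query.";

    /** Returns group name -> queries, for the groups listed in facet.query.groups only. */
    static Map<String, List<String>> collect(Map<String, String[]> params) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        String[] none = new String[0];
        for (String group : params.getOrDefault(PREFIX + "groups", none)) {
            // Direct lookup per declared group; undeclared names are never scanned.
            grouped.put(group, Arrays.asList(params.getOrDefault(PREFIX + group, none)));
        }
        return grouped;
    }

    public static void main(String[] args) {
        Map<String, String[]> params = new LinkedHashMap<>();
        params.put("facet.query.groups", new String[] {"byprice"});
        params.put("facet.query.byprice",
                   new String[] {"price:[* TO 100]", "price:[101 TO *]"});
        System.out.println(collect(params));
    }
}
```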

-Greg


Re: [jira] Commented: (SOLR-44) Basic Facet Count support

2006-08-30 Thread Greg Ludington

I'd like to get some feedback on the overall approach and params before i 
proceed too much farther.


These comments are probably just confusion since the approach differs
from my home-grown faceting prototype, and my dev box is on a moving
truck right now, so I cannot try the patch, so please bear with me:

1) Should grouping of facets also be parameter-based?   Say, for
instance, I want to have multiple different ways to look into my
result set:

By Price (<$100, $100-$200, $200+)
By Manufacturer (Apple, Dell, HP, SONY)
By Status: (In Stock, Out of Stock)

I assume the first two would be three and four facetQuery params,
respectively, and the third would be a single facetField.  Can the
output format represent these sorts of logical groupings, or should it
be solely the client's responsibility to parse and split?

2) If facets can be returned in such a logical grouping, it might also
be worthwhile to allow an optional sort order for the facets (e.g.
alphabetically, by count, etc).  While the client can certainly sort,
if there is a facet limit the client will not be able to sort on the
full set.

3)  We have one running application (not yet on Solr;  that is my
prototype :) ) where the boundaries on range-based facets are
calculated to achieve equal distribution among a known number of
facets.  (This is like facetField, except for ranges, not terms.)

4) Would it be possible to extend the types of facets?  Because,
admittedly, #3 is an application-specific case, I would not expect
some general-purpose solution to it.  However, when confronted with
such a need, it would be nice to be able to plug in a new facet
implementation type for a given facet without having to change Solr
internals and/or create and maintain nearly exact duplicates of
existing request handlers.  (In my prototype, I had the concept of a
Facet interface, which allows this, but in other respects is far less
flexible than what you have outlined.)
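
To make point 3 concrete, here is a minimal sketch of picking range
boundaries so that each range holds roughly the same number of values.
It is not the code from the running application; the class name is
made up, and it assumes all the values fit in memory as a double[].

```java
import java.util.Arrays;

// Hypothetical helper for equal-distribution range facets.
final class EqualFrequencyRanges {

    /**
     * Returns (buckets - 1) cut points. Range i covers values from cut i-1
     * (inclusive) up to cut i (exclusive), so each range holds about
     * values.length / buckets entries.
     */
    static double[] boundaries(double[] values, int buckets) {
        if (buckets < 1) {
            throw new IllegalArgumentException("buckets must be >= 1");
        }
        if (values.length == 0) {
            return new double[0];
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double[] cuts = new double[buckets - 1];
        for (int i = 1; i < buckets; i++) {
            // Index of the first value that belongs to bucket i.
            int index = (int) ((long) i * sorted.length / buckets);
            cuts[i - 1] = sorted[index];
        }
        return cuts;
    }

    public static void main(String[] args) {
        double[] prices = {60, 10, 40, 20, 50, 30};
        System.out.println(Arrays.toString(boundaries(prices, 3)));
    }
}
```

Ties in the data can still make the ranges uneven, since equal values
always fall on the same side of a cut.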

Thanks,
Greg


Re: [jira] Resolved: (SOLR-39) Searcher's getDocListAndSet methods do not accept flags, can cause NPE when writing output

2006-07-25 Thread Greg Ludington

It may be a somewhat obscure pathway to produce this -- I only came
across it when, in applying faceting, I was using getDocListAndSet to
return both the DocList for output and the DocSet for facet
calculations, without fetching documents in any other way. Scores are
set to null here -- but, as you indicate, they are also set to null if
you getDocListNC, but that does not end up with an error.  I agree
that the underlying issue should also be addressed, as well, but I
have not dug deeply enough into the internals to see the cause yet.

Here is the stack trace, when using a getDocListAndSet method without flags:

09> Started [EMAIL PROTECTED]
Jul 25, 2006 9:32:22 PM org.apache.solr.core.SolrCore execute
INFO: rows=10&explainOther=&start=0&indent=on&q=dell&fl=&qt=dismax&stylesheet=&version=2.1 0 140
Jul 25, 2006 9:32:22 PM org.apache.solr.core.SolrException log
SEVERE: java.lang.NullPointerException
   at org.apache.solr.search.DocSlice$1.score(DocSlice.java:116)
   at org.apache.solr.request.XMLWriter.writeDocList(XMLWriter.java:346)
   at org.apache.solr.request.XMLWriter.writeVal(XMLWriter.java:385)
   at org.apache.solr.request.XMLWriter.writeResponse(XMLWriter.java:106)
   at org.apache.solr.request.XMLResponseWriter.write(XMLResponseWriter.java:29)
   at org.apache.solr.servlet.SolrServlet.doGet(SolrServlet.java:96)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:596)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
   at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:428)
   at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:473)
   at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:568)
   at org.mortbay.http.HttpContext.handle(HttpContext.java:1530)
   at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:633)
   at org.mortbay.http.HttpContext.handle(HttpContext.java:1482)
   at org.mortbay.http.HttpServer.service(HttpServer.java:909)
   at org.mortbay.http.HttpConnection.service(HttpConnection.java:820)
   at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:986)
   at org.mortbay.http.HttpConnection.handle(HttpConnection.java:837)
   at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:245)
   at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
   at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)


Thanks,
Greg

On 7/25/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:


: Thanks Greg, I just committed this.

I'm all in favor of the patch as committed, but the NPE still concerns me
... the OutputWriter should be able to cleanly deal with a DocList that
doesn't contain scores right?

Should we open a separate issue to look into this? ... it seems like it
must be a somewhat obscure code path since I've certainly used Solr
without scores in the past.

Greg: do you by any chance have a stacktrace so we can see exactly where
the NPE was getting thrown from?

Yonik: do you have any idea what code path might cause an NPE?

:
: > Searcher's getDocListAndSet methods do not accept flags, can cause NPE when 
writing output
: > 
--
: >
: > Key: SOLR-39
: > URL: http://issues.apache.org/jira/browse/SOLR-39
: > Project: Solr
: >  Issue Type: Bug
: >  Components: search
: >Reporter: Greg Ludington
: > Assigned To: Yonik Seeley
: >Priority: Minor
: > Attachments: SolrIndexSearcherdocListAndSet.patch
: >
: >
: > SolrIndexSearcher's getDocListAndSet methods do not accept flags, which 
can, in some cases, cause a Null Pointer Exception to be thrown when writing the 
docListAndSet.docList as output.  I came across the issue as I was implementing 
faceting, see 
http://www.nabble.com/Faceted-Browsing-Question-Discussion-tf1968854.html for the 
discussion.
: > The simplest way to reproduce this is to modify DisMaxRequestHandler, by 
changing this:
: >  DocList results = s.getDocList(query, restrictions,
: >  SolrPluginUtils.getSort(req),
: >  req.getStart(), req.getLimit(),
: >  flags);
: >   rsp.add("search-results",results);
: > to
: >   DocListAndSet listAndSet= s.getDocListAndSet(query, restrictions,
: >  SolrPluginUtils.getSort(req),
: >  req.getStart(), req.getLimit());
: >   DocList results = listAndSet.docList;
: >   rsp.add("search-results",results);
: > The root cause appe

[jira] Updated: (SOLR-39) Searcher's getDocListAndSet methods do not accept flags, can cause NPE when writing output

2006-07-25 Thread Greg Ludington (JIRA)
 [ http://issues.apache.org/jira/browse/SOLR-39?page=all ]

Greg Ludington updated SOLR-39:
---

Attachment: SolrIndexSearcherdocListAndSet.patch

> Searcher's getDocListAndSet methods do not accept flags, can cause NPE when 
> writing output
> --
>
> Key: SOLR-39
> URL: http://issues.apache.org/jira/browse/SOLR-39
> Project: Solr
>  Issue Type: Bug
>  Components: search
>    Reporter: Greg Ludington
>Priority: Minor
> Attachments: SolrIndexSearcherdocListAndSet.patch
>
>
> SolrIndexSearcher's getDocListAndSet methods do not accept flags, which can, 
> in some cases, cause a Null Pointer Exception to be thrown when writing the 
> docListAndSet.docList as output.  I came across the issue as I was 
> implementing faceting, see 
> http://www.nabble.com/Faceted-Browsing-Question-Discussion-tf1968854.html for 
> the discussion.
> The simplest way to reproduce this is to modify DisMaxRequestHandler, by 
> changing this:
>  DocList results = s.getDocList(query, restrictions,
>  SolrPluginUtils.getSort(req),
>  req.getStart(), req.getLimit(),
>  flags);
>   rsp.add("search-results",results);
> to
>   DocListAndSet listAndSet= s.getDocListAndSet(query, restrictions,
>  SolrPluginUtils.getSort(req),
>  req.getStart(), req.getLimit());
>   DocList results = listAndSet.docList;
>   rsp.add("search-results",results);
> The root cause appears to be that scores[] is set to null, so when the 
> DocIterator's score() method is called, "return scores[pos-1]" dereferences 
> the null array and throws.  When getDocListAndSet(..) is invoked, it 
> eventually can get down to this private method:
>   private DocSet getDocListAndSetNC(DocListAndSet out, Query query, DocSet 
> filter, Sort lsort, int offset, int len, int flags) throws IOException
> In that method, scores is assigned as follows:
>   scores = (flags&GET_SCORES)!=0 ? new float[nDocsReturned] : null;
> Since getDocListAndSet() does not pass flags (except for the implicit 
> GET_DOCSET), scores is assigned as null, which eventually leads to the 
> NullPointerException if you try to output the docList.  The attached patch 
> does not change the underlying mechanism of how scores is assigned, but works 
> around the issue by adding overloaded getDocListAndSet() methods that take an 
> additional flags parameter.  After applying this patch, you can change the 
> relevant bit in DisMaxRequestHandler to:
>   DocListAndSet listAndSet= s.getDocListAndSet(query, restrictions,
>  SolrPluginUtils.getSort(req),
>  req.getStart(), req.getLimit(), flags);
>   DocList results = listAndSet.docList;
>   rsp.add("search-results",results);
> and you will no longer see the NullPointerException





[jira] Created: (SOLR-39) Searcher's getDocListAndSet methods do not accept flags, can cause NPE when writing output

2006-07-25 Thread Greg Ludington (JIRA)
Searcher's getDocListAndSet methods do not accept flags, can cause NPE when 
writing output
--

 Key: SOLR-39
 URL: http://issues.apache.org/jira/browse/SOLR-39
 Project: Solr
  Issue Type: Bug
  Components: search
Reporter: Greg Ludington
Priority: Minor
 Attachments: SolrIndexSearcherdocListAndSet.patch

SolrIndexSearcher's getDocListAndSet methods do not accept flags, which can, in 
some cases, cause a Null Pointer Exception to be thrown when writing the 
docListAndSet.docList as output.  I came across the issue as I was implementing 
faceting, see 
http://www.nabble.com/Faceted-Browsing-Question-Discussion-tf1968854.html for 
the discussion.

The simplest way to reproduce this is to modify DisMaxRequestHandler, by 
changing this:

 DocList results = s.getDocList(query, restrictions,
 SolrPluginUtils.getSort(req),
 req.getStart(), req.getLimit(),
 flags);
  rsp.add("search-results",results);

to

  DocListAndSet listAndSet= s.getDocListAndSet(query, restrictions,
 SolrPluginUtils.getSort(req),
 req.getStart(), req.getLimit());
  DocList results = listAndSet.docList;
  rsp.add("search-results",results);

The root cause appears to be that scores[] is set to null, so when the 
DocIterator's score() method is called, "return scores[pos-1]" dereferences the 
null array and throws.  When getDocListAndSet(..) is invoked, it eventually can 
get down to this private method:

  private DocSet getDocListAndSetNC(DocListAndSet out, Query query, DocSet 
filter, Sort lsort, int offset, int len, int flags) throws IOException

In that method, scores is assigned as follows:

  scores = (flags&GET_SCORES)!=0 ? new float[nDocsReturned] : null;

Since getDocListAndSet() does not pass flags (except for the implicit 
GET_DOCSET), scores is assigned as null, which eventually leads to the 
NullPointerException if you try to output the docList.  The attached patch 
does not change the underlying mechanism of how scores is assigned, but works 
around the issue by adding overloaded getDocListAndSet() methods that take an 
additional flags parameter.  After applying this patch, you can change the 
relevant bit in DisMaxRequestHandler to:

  DocListAndSet listAndSet= s.getDocListAndSet(query, restrictions,
 SolrPluginUtils.getSort(req),
 req.getStart(), req.getLimit(), flags);
  DocList results = listAndSet.docList;
  rsp.add("search-results",results);

and you will no longer see the NullPointerException
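
The failure mode can be shown in isolation with a small stand-in.
This is not Solr code: the class below only mimics the "allocate
scores only when GET_SCORES is set" pattern quoted above, and the
flag values are made up for illustration.

```java
// Stand-alone illustration; class name and flag values are hypothetical.
final class ScoreFlagsDemo {
    static final int GET_SCORES = 0x01; // hypothetical value
    static final int GET_DOCSET = 0x02; // hypothetical value

    /** Stand-in for a doc list whose scores array exists only if GET_SCORES was requested. */
    static final class Slice {
        final int[] docs;
        final float[] scores;

        Slice(int[] docs, int flags) {
            this.docs = docs;
            // Same shape as the assignment quoted from getDocListAndSetNC.
            this.scores = (flags & GET_SCORES) != 0 ? new float[docs.length] : null;
        }

        float score(int pos) {
            return scores[pos]; // throws NullPointerException when scores == null
        }
    }

    /** Returns true if reading a score fails with an NPE for the given flags. */
    static boolean throwsNpe(int flags) {
        try {
            new Slice(new int[] {1, 2, 3}, flags).score(0);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("GET_DOCSET only:         NPE = " + throwsNpe(GET_DOCSET));
        System.out.println("GET_DOCSET | GET_SCORES: NPE = " + throwsNpe(GET_DOCSET | GET_SCORES));
    }
}
```

Passing the caller's flags through, as the patch does, corresponds to
the second call: the array is allocated and score() can be read.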





Re: Faceted Browsing Question/Discussion

2006-07-21 Thread Greg Ludington

Don't let me discourage you from having a "FacetHandler" interface that
supports generic faceting functionality using different rules (ie:


I don't get discouraged -- I just take advice from smart people when
they offer it :).  I still do have my generic Facet handling
mechanism, because the Facets/FacetItems were always interfaces.  Now,
they end up as Map in a user cache instead of requiring a new breed of
handler,  which lets me accomplish much the same thing without doing
violence to Solr internals.


take a look at SolrIndexSearcher.getDocListAndSet -- it efficiently gets
you both the "paginated" DocList to display for this request and the full
DocSet for doing faceting at the same time, and it can also take in a list
of Filtering queries.  Then you can pass in the resulting DocSet as the
second arg to numDocs.


This perfectly fits my needs, but has one problem.  Since the
requestHandler flags are not passed through, XMLWriter throws a
NullPointerException when it encounters iterator.score() when trying
to write out a results.docList with scores.  Adding the flags argument
to the getDocListAndSet methods in SolrIndexSearcher appears to solve
that problem, i.e.:

 public DocListAndSet getDocListAndSet(Query query, List<Query> filterList,
     Sort lsort, int offset, int len, int flags) throws IOException {
   DocListAndSet ret = new DocListAndSet();
   getDocListC(ret, query, filterList, null, lsort, offset, len, flags |= GET_DOCSET);
   return ret;
 }

as well as adding the flags argument to the other getDocListAndSet
method, so that the requestHandler flags are passed through.  Is there
a performance or other reason that flags are not present in the
getDocListAndSet signatures, as opposed to the getDocList, or should I
submit this as a JIRA issue/patch?

Thanks,
Greg


Re: Faceted Browsing Question/Discussion

2006-07-20 Thread Greg Ludington

After thinking over your comments, I removed the facetHandler
completely, instead loading the Facets into a plain user cache, and
put the output work in a utils class similar to SolrPluginUtils.  It
complicates the Term caching for me slightly, but it allows me to add
a "FacetUtils.doStandardFaceting(req, query, results);" type of call
to any requestHandler without any changes to Solr internals.  Thank
you for pointing me in that general direction.

On your comment about searcher.numDocs() and
valueDocSet.intersectionSize() being essentially equivalent, I found I
could do the job with either, but noted two differences that may be
obvious, but may be worth documenting explicitly for people developing
faceting:

- Since valueDocSet.intersectionSize(otherSet) compares the actual
result set, the requestHandler needs to get a full (or at least
larger) set before limiting it by req.getStart() and req.getLimit(),
or you only calculate the facets against that one page.  Then, after
you calculate the facets, you can use subset() to restrict the range
for output.

- searcher.numDocs(resultQuery, facetQuery) does not require any
subset steps, but, since it uses the Query, not the end DocList, it
does not know about any filter queries applied to the request.  The
facet intersection will therefore be calculated against documents that
are not returned in the base results NamedList.
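
The first difference can be illustrated with a small sketch that uses java.util.BitSet as a stand-in for a DocSet. The class and method names are made up for the example, and paging by document id order is a simplification. It does not model the second difference (filter queries).

```java
import java.util.BitSet;

// Sketch only: BitSet plays the role of a DocSet.
final class FacetPageCount {
    /** Number of documents present in both sets. */
    static int intersectionSize(BitSet a, BitSet b) {
        BitSet copy = (BitSet) a.clone();
        copy.and(b);
        return copy.cardinality();
    }

    /** Keeps only the matches ranked start .. start+limit-1 (in doc id order). */
    static BitSet page(BitSet matches, int start, int limit) {
        BitSet page = new BitSet();
        int rank = 0;
        for (int doc = matches.nextSetBit(0); doc >= 0; doc = matches.nextSetBit(doc + 1)) {
            if (rank >= start && rank < start + limit) {
                page.set(doc);
            }
            rank++;
        }
        return page;
    }

    public static void main(String[] args) {
        BitSet results = new BitSet();
        results.set(0, 10); // ten matching documents, ids 0..9
        BitSet facet = new BitSet();
        facet.set(1);
        facet.set(4);
        facet.set(7);
        facet.set(12);      // matches the facet but not the query
        System.out.println(intersectionSize(facet, results));             // 3
        System.out.println(intersectionSize(facet, page(results, 0, 5))); // 2
    }
}
```

Counting against the five-document page undercounts the facet, which is why the full set has to be intersected first and only then restricted for output.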

Thanks,
Greg


Re: Faceted Browsing Question/Discussion

2006-07-20 Thread Greg Ludington

that's a very good way to do it.  You could also use
SolrIndexSearcher.numDocs -- it is essentially the same thing, but in the
future there may be optimizations that can be done to eliminate the
construction of one DocSet (if the other one already exists)


Thank you for the tip -- I will take a look at it.


: general.  To that end, my current structure defines:
:
: - a  entry in solrconfig.xml, the only current
: implementation of which loads a set of Facet definitions from an xml
: file.
: - each Facet contains an id for lookups and a List of FacetItems (some
: statically configured, some constructed dynamically from available
: Terms, though not backed by any cache yet.)
: - each FacetItem contains a displayName and Query (and associated queryString)

I'm not sure i understand what exactly the "facetHandler" registration
gains you that you couldn't have achieved in a custom requestHandler
(without needing to modify the internals/config parsing and so on) ...
your custom request handler could take in a "FacetHandler" class name as
an init param, or it could have taken in the XML information directly as
a deeply nested set of init params. am i missing something else?


No, you are not missing anything; that may be a less intrusive way to
do this.  It wouldn't be the first time I have out-thought myself :).
I started with the concept of a facetHandler, because my goal was
faceting as a utility different handlers could share, as opposed to
the responsibility of a specific requestHandler.  This allows:

1) The adding/reusing of faceting to any requestHandler with minimal
code.   While, as you point out, I could have done this with a custom
requestHandler, this approach lets any requestHandler add faceting.  I
have it currently inside StandardRequestHandler and
DisMaxRequestHandler.  (Admittedly, these now qualify as custom
request handlers, but the code block is simply "if request has
faceting params, add the facets requested to the output".)

2) One configuration place for faceting.  I have a very mixed content
index, which could contain Reviews, Articles, and Products of
different categories, each of which could potentially have a
separately configured requestHandler to give their specific fields
appropriate default weight.  I would rather not have to specify (or
load) the facet "handling" in every requestHandler definition.

3) Faceting to differ per-request.   The request has to be able to
specify what facets to put in the output, and this required a change
to SolrQueryRequest to add these parameters.  (It seemed more
appropriate to add them explicitly, rather than grabbing them from the
args.)

Of these, #1 is probably the only strong reason to have a concept of a
facetHandler -- #2 and #3 are a matter of preference.

The changes to Solr proper were surprisingly small, which is a
testament to your initial structure, but I did have to add elements to
SolrQueryRequest and SolrCore.  Placing the responsibility for
faceting inside a custom requestHandler may restrict its reusability
in other handlers, but it would eliminate the need for these changes.
Perhaps a more appropriate middle way for me would be to load the
"facetHandler" as a user cache, as opposed to its own named item, and
achieve sharing across requestHandlers that way.  This raises its own
set of problems, but it would be far less invasive.

Thanks,
Greg


Faceted Browsing Question/Discussion

2006-07-19 Thread Greg Ludington

I have implemented faceted browsing in a prototype I have been working
on with Solr, but I would like to ask some more experienced hands
about performance implications.  Currently, I calculate the count of
a given facet as follows:

  DocSet valueDocSet = req.getSearcher().getDocSet(item.getQuery());
  long count = valueDocSet.intersectionSize(results);

Is this the preferred way to obtain such a count, or is there another
way, such as dealing directly with BitSets (something I avoided, since
it appears getBits() is deprecated in the DocSet interface)?
Similarly, since this method is commented as "cache-aware", does that
mean that the item itself does not need to worry about caching its
results, only its terms, since the results will end up in the
queryResultCache?  Or is this assumption incorrect, and should each
facet/item be concerned with caching its results as well?
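
To make the caching question concrete, here is a hypothetical sketch of what a "cache-aware" lookup could look like from the caller's side. It is not Solr's cache code; the class name, the HashMap-backed cache, and the dummy search function are assumptions for illustration.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of a cache-aware lookup keyed by query string.
final class CachingDocSetLookup {
    private final Map<String, BitSet> cache = new HashMap<>();
    private final Function<String, BitSet> compute;
    private int computations = 0;

    CachingDocSetLookup(Function<String, BitSet> compute) {
        this.compute = compute;
    }

    /** Returns the cached set for the query, computing it only on first use. */
    BitSet getDocSet(String queryString) {
        BitSet cached = cache.get(queryString);
        if (cached == null) {
            cached = compute.apply(queryString);
            computations++;
            cache.put(queryString, cached);
        }
        return cached;
    }

    int computations() {
        return computations;
    }

    public static void main(String[] args) {
        CachingDocSetLookup lookup = new CachingDocSetLookup(q -> {
            BitSet docs = new BitSet();
            docs.set(q.length()); // dummy "search": one arbitrary document per query
            return docs;
        });
        lookup.getDocSet("inStock:true");
        lookup.getDocSet("inStock:true");
        System.out.println(lookup.computations()); // 1
    }
}
```

If the searcher behaves this way, a facet item would only need to hold on to its query, and repeated counts for the same item would reuse the stored set.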

Apologies for sending this to solr-dev, and not solr-user, but I
thought this might also segue into a discussion on faceted browsing in
general.  To that end, my current structure defines:

- a  entry in solrconfig.xml, the only current
implementation of which loads a set of Facet definitions from an xml
file.
- each Facet contains an id for lookups and a List of FacetItems (some
statically configured, some constructed dynamically from available
Terms, though not backed by any cache yet.)
- each FacetItem contains a displayName and Query (and associated queryString)
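
The structure in the list above could be sketched roughly as follows. The field names follow the description, but the classes themselves are hypothetical, and the query is kept only in its string form so the example does not depend on Lucene's Query class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical shapes for the Facet / FacetItem structure described above.
final class FacetItem {
    final String displayName;
    final String queryString;

    FacetItem(String displayName, String queryString) {
        this.displayName = displayName;
        this.queryString = queryString;
    }
}

final class Facet {
    final String id;             // used for lookups
    final List<FacetItem> items; // statically configured or built from terms

    Facet(String id, List<FacetItem> items) {
        this.id = id;
        this.items = Collections.unmodifiableList(new ArrayList<>(items));
    }
}

final class FacetModelDemo {
    /** Builds a statically configured facet with two items. */
    static Facet inStockFacet() {
        return new Facet("instock", Arrays.asList(
                new FacetItem("In Stock", "inStock:true"),
                new FacetItem("Out of Stock", "inStock:false")));
    }

    public static void main(String[] args) {
        Facet facet = inStockFacet();
        System.out.println(facet.id + " has " + facet.items.size() + " items");
        // prints "instock has 2 items"
    }
}
```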

With faceting parameters added to the query, a request with these parameters:
&ft=xmlfacets&f=man&f=instock

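Would use the facetHandler "xmlfacets" to add these facet items to the
results, grouped by facet (each item has a query, a count, and a
display name):

  facet    query                            count  displayName
  man      manu_exact:"ASUS Computer Inc."  0      ASUS Computer Inc.
  man      manu_exact:"ATI Technologies"    0      ATI Technologies
  man      manu_exact:"Dell, Inc."          1      Dell, Inc.
  instock  inStock:true                     1      In Stock
  instock  inStock:false                    0      Out of Stock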

The basic handling and output format work for my prototype's purposes,
but I have not delved deeply into caching at this time. Does this
setup seem appropriate, and the abovementioned caching assumption seem
valid, or have I missed something that would help support facets on a
larger scale?

Thanks,
Greg


[jira] Commented: (SOLR-6) Solr admin stylesheet doesn't work well with Internet Explorer

2006-07-07 Thread Greg Ludington (JIRA)
[ http://issues.apache.org/jira/browse/SOLR-6?page=comments#action_12419801 
] 

Greg Ludington commented on SOLR-6:
---

Wonky is the correct word -- if the border-left is one pixel, it shows as 
white.  If it is two pixels, the color shows.  A border-right appears to work 
even with only one pixel.  Maybe IE7 gets it right, but not IE6 :)

If you are going to continue to have two-column analysis tables, and the first 
column is a th, adding this appears to do the trick in both IE6 and Firefox 1.5:

table.analysis th {
  border-right: 1px solid #ff9933;
}



> Solr admin stylesheet doesn't work well with Internet Explorer
> --
>
>  Key: SOLR-6
>  URL: http://issues.apache.org/jira/browse/SOLR-6
>  Project: Solr
> Type: Bug

>   Components: web gui
> Reporter: Yonik Seeley
> Priority: Minor
>  Attachments: iestyles.patch
>
> The admin pages look different on firefox than on IE (6 or 7).
>  - tables in text analysis page span whole browser, regardless of cell size
>  - separators visible in firefox aren't visible in IE




[jira] Updated: (SOLR-6) Solr admin stylesheet doesn't work well with Internet Explorer

2006-07-06 Thread Greg Ludington (JIRA)
 [ http://issues.apache.org/jira/browse/SOLR-6?page=all ]

Greg Ludington updated SOLR-6:
--

Attachment: iestyles.patch

The visual differences occur because IE 6 (I do not have IE7 to test) does not 
seem to apply borders to rows, only to cells, and also does not understand some 
of the more advanced selectors you have used, such as XSLT-ish selection by 
attribute, or by child and adjacent sibling.  I am a new user of Solr, but 
this seemed like a low-risk area where I could pitch in and help.

This proposed patch puts the relevant css information into regular css classes, 
and then adjusts the jsps to match, e.g. instead of 
   tr > td[name="highlight"]:first-child

you have
   td.highlight

The meanings are slightly different, and they apply to a broader range of HTML, 
but it improves the look in IE and did not seem to have any negative impact in 
Firefox 1.5.0.4.  I did not migrate every css2 selector over to a class this 
way, but just those that made a large difference in the way IE6 rendered the 
page without resorting to css hackery.

If I missed any locations, or there are awkward results in this style, feel 
free to comment. I purposely did not make any effort to modify tabular.xsl, as 
this thread indicates it is deprecated:

http://www.nabble.com/stylesheet-issue-tf1721121.html#a4675018

