[ 
https://issues.apache.org/jira/browse/SOLR-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414531#comment-13414531
 ] 

Jan Høydahl commented on SOLR-3619:
-----------------------------------

bq. we are renaming it from example because we are recognizing people use it as 
the default to build on. 
And that's all fine - people need to start somewhere. But if they think that 
adding a few <field>s to schema.xml is all Solr has to offer, they'll build 
crappy search apps - I've seen many of these out there. So calling it 
example (or template or skeleton or whatever) gives people a hint that they 
should not expect it to be sufficient for their needs without some more 
tuning (Solr is not GSA).

{quote}
bq. Today's "collection1" is not very well tuned for PDF/HTML kind of docs
I haven't tried it in a while... can we improve it w/o getting in the way of 
people who don't use solr-cell?
{quote}
When Solr is compared to other search engines, what tends to get tested is 
web/filesystem crawling. So I really think that if we are to include ONE 
main "example" config, it should be geared towards HTML/PDF/DOC indexing, 
either from crawling or from pushing files from the filesystem. That would 
mean that you have a title, a teaser, body, URL/path and various metadata. 
There has been some discussion on the list about improving the user 
experience for this type of input.

Sure, it is harder (much harder) to get excellent results from unstructured 
text than from some nice synthetic structured XML docs, so it would take some 
work to let Solr shine in those comparisons. One needed piece could be an 
improved post.jar (or a feeder wrapper script) which can recursively traverse 
folders and push files matching certain file types, with the correct MIME 
type and a unique ID. That would let people quickly index, say, their home 
folder, and then view the results in Solritas.
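
To make the feeder idea concrete, here is a rough sketch of such a wrapper 
script in Python. It is not an existing tool and not how post.jar works. The 
function names, the extension list and the URL are assumptions for 
illustration: it presumes a local Solr with the solr-cell extracting handler 
reachable at /update/extract, and it uses the absolute file path as the 
unique ID, passed via the handler's literal.id parameter.

```python
import mimetypes
import os
import urllib.parse
import urllib.request

# Assumed defaults for illustration; adjust to your own installation.
SOLR_EXTRACT_URL = "http://localhost:8983/solr/update/extract"
EXTENSIONS = {".pdf", ".html", ".htm", ".doc", ".txt"}


def find_files(root, extensions=EXTENSIONS):
    """Yield (path, mime_type, unique_id) for each matching file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            ext = os.path.splitext(name)[1].lower()
            if ext not in extensions:
                continue
            path = os.path.abspath(os.path.join(dirpath, name))
            mime, _encoding = mimetypes.guess_type(path)
            # The absolute path doubles as the unique ID in this sketch.
            yield path, mime or "application/octet-stream", path


def post_file(path, mime, doc_id, url=SOLR_EXTRACT_URL):
    """POST one file's raw bytes to Solr, passing the ID as literal.id."""
    query = urllib.parse.urlencode({"literal.id": doc_id})
    with open(path, "rb") as handle:
        request = urllib.request.Request(
            url + "?" + query,
            data=handle.read(),
            headers={"Content-Type": mime},
        )
    with urllib.request.urlopen(request) as response:
        return response.status


def feed(root):
    """Walk root and push every matching file to Solr."""
    for path, mime, doc_id in find_files(root):
        post_file(path, mime, doc_id)
```

Calling feed(os.path.expanduser("~")) would push a home folder. A commit is 
still needed afterwards before the documents show up in search results, and 
a real tool would also want error handling and batching.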
                
> Rename 'example' dir to 'server'
> --------------------------------
>
>                 Key: SOLR-3619
>                 URL: https://issues.apache.org/jira/browse/SOLR-3619
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>             Fix For: 4.0, 5.0
>
>         Attachments: SOLR-3619.patch, server-name-layout.png
>
>

