[ 
https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663150#action_12663150
 ] 

Chris Harris commented on SOLR-284:
-----------------------------------

bq. I could, however, see adding a flag to specify whether one wants "silent 
success" or not. I think the use case for content extraction is different than 
the normal XML message path. Often times, these files are quite large and the 
cost of sending them to the system is significant.

For my own use of the handler, I expect fail-on-missing-key would be the more 
helpful policy. I want to be in control of my own key, and if Solr fails as 
soon as I don't provide one, that will help me find the bug in my indexing code 
right away, whereas "silent success" will allow that bug to fester. I'm not 
sure the other policy has significant countervailing advantages. It's true that 
transferring a large file only to get an error message wastes some time, but 
debugging a missing-key bug that went unnoticed has the potential to waste a 
lot more.

My first choice would be for fail-on-missing-key to be the default; my second 
choice would be an easy-to-set flag that enables it. In either case, it would 
be nice not to have to create a custom SolrContentHandler just to get this one 
sanity check.
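
To make the two policies concrete, here is a minimal sketch of the check being 
discussed. It is only an illustration: the class MissingKeyPolicy, the method 
resolveKey, and the failOnMissingKey flag are hypothetical names, the "id" 
field is an assumed key field, and none of this is taken from the SOLR-284 
patch or from Solr's API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of Solr or the SOLR-284 patch.
final class MissingKeyPolicy {

    /**
     * Returns the caller-supplied key for a document.
     *
     * failOnMissingKey is an assumed flag: true means fail fast,
     * false means "silent success" (the missing key goes unreported).
     */
    static String resolveKey(Map<String, String> fields,
                             String keyField,
                             boolean failOnMissingKey) {
        String key = fields.get(keyField);
        if (key != null && !key.trim().isEmpty()) {
            return key;
        }
        if (failOnMissingKey) {
            // Fail before any expensive extraction work is done.
            throw new IllegalArgumentException(
                "Document is missing required key field '" + keyField + "'");
        }
        // "Silent success": no error is raised, so a bug in the
        // indexing client would go unnoticed at this point.
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new HashMap<>();
        fields.put("title", "solr-word.pdf"); // no "id" supplied

        // Lenient policy: nothing is reported.
        System.out.println("lenient: " + resolveKey(fields, "id", false));

        // Strict policy: the missing key is reported immediately.
        try {
            resolveKey(fields, "id", true);
        } catch (IllegalArgumentException e) {
            System.out.println("strict: " + e.getMessage());
        }
    }
}
```

With the strict setting, the indexing client sees the error on the first 
request that lacks a key, which is the behavior argued for above.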

> Parsing Rich Document Types
> ---------------------------
>
>                 Key: SOLR-284
>                 URL: https://issues.apache.org/jira/browse/SOLR-284
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Eric Pugh
>            Assignee: Grant Ingersoll
>             Fix For: 1.4
>
>         Attachments: libs.zip, rich.patch, rich.patch, rich.patch, 
> rich.patch, rich.patch, rich.patch, rich.patch, SOLR-284.patch, 
> SOLR-284.patch, SOLR-284.patch, SOLR-284.patch, SOLR-284.patch, 
> SOLR-284.patch, SOLR-284.patch, solr-word.pdf, source.zip, test-files.zip, 
> test-files.zip, test.zip, un-hardcode-id.diff
>
>
> I have developed a RichDocumentRequestHandler based on the CSVRequestHandler 
> that supports streaming a PDF, Word, PowerPoint, or Excel document into 
> Solr.
> There is a wiki page with information here: 
> http://wiki.apache.org/solr/UpdateRichDocuments
>  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
