[ https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724871#action_12724871 ]
Yonik Seeley commented on SOLR-284:
-----------------------------------
Apologies for not reviewing this sooner after it was committed - but this is
the last/best chance to improve the interface before 1.4 is released (and this
is very important new functionality).
Since the "ext." prefix seems unnecessary, and removing it is already a name
change, we might as well revisit the names themselves. Here are my first
thoughts on it:
{code}
//////// generic type stuff that could be reused by other update handlers
boost.myfield=2.3
literal.myfield=Hello
map.origfield=newfield
uprefix=attr_
// map any unknown fields using a standard prefix... good for
// dynamic field mapping.
//////// more solr cell specific
capture.target_field=div
// does capture + field-map in single step... avoids name clashes
xpath=xpath_expr
// future: could do xpath.targetfield=xpath_expr
extract_only=true // periods aren't word separators, but scoping operators
// in the future, this could be replaced with a generic update operation
// to return the document(s) instead of indexing them.
resource.name=test.pdf
New idea:
nicenames=true // Last-Modified -> last_modified
REMOVED:
ext.ignore.und.fl
// throwing an exception when a field-type doesn't exist is generic
// and not needed. we should never silently ignore.
ext.idx.attr
// do we ever want this to be false? we can ignore all attributes
// with field mappings if we want to
ext.metadata.prefix
// seems like we only want to map unknown fields, not all fields
ext.def.fl
// we can use a standard field name for indexing main content
// and use map to move it if desired. "content"?
{code}
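To make the interaction between map, uprefix and nicenames concrete, here is a rough sketch of how a field name might be resolved under the proposed names. It is only an illustration, not the handler's code: the order of the steps (nicenames first, then map, then uprefix for anything still unknown), the exact normalization rule, and the helper names nice_name and resolve_field are all assumptions, and the "schema" is just a set of known field names.

```python
import re


def nice_name(name):
    """Assumed normalization: lowercase, then replace every character
    that is not a-z or 0-9 with '_' (Last-Modified -> last_modified)."""
    return re.sub(r"[^0-9a-z]", "_", name.lower())


def resolve_field(name, params, schema_fields):
    """Hypothetical resolution order for an extracted field name.

    1. nicenames=true normalizes the raw name.
    2. map.<name>=<target> renames it explicitly.
    3. uprefix is prepended only if the result is still unknown
       to the schema (modelled here as a plain set of names).
    """
    if params.get("nicenames") == "true":
        name = nice_name(name)
    mapped = params.get("map." + name, name)
    if mapped not in schema_fields and "uprefix" in params:
        mapped = params["uprefix"] + mapped
    return mapped


# Parameters taken from the proposal above; the schema is made up.
params = {"nicenames": "true", "map.origfield": "newfield", "uprefix": "attr_"}
schema = {"newfield", "last_modified", "content"}

for raw in ("origfield", "Last-Modified", "X-Parsed-By"):
    print(raw, "->", resolve_field(raw, params, schema))
```

With this ordering, explicitly mapped and already-known fields are left alone and only genuinely unknown names pick up the prefix, which is the behaviour argued for under ext.metadata.prefix below.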
Do people view this as an improvement?
> Parsing Rich Document Types
> ---------------------------
>
> Key: SOLR-284
> URL: https://issues.apache.org/jira/browse/SOLR-284
> Project: Solr
> Issue Type: New Feature
> Components: update
> Reporter: Eric Pugh
> Assignee: Grant Ingersoll
> Fix For: 1.4
>
> Attachments: libs.zip, rich.patch, rich.patch, rich.patch,
> rich.patch, rich.patch, rich.patch, rich.patch, SOLR-284-no-key-gen.patch,
> SOLR-284.patch, SOLR-284.patch, SOLR-284.patch, SOLR-284.patch,
> SOLR-284.patch, SOLR-284.patch, SOLR-284.patch, solr-word.pdf, source.zip,
> test-files.zip, test-files.zip, test.zip, un-hardcode-id.diff
>
>
> I have developed a RichDocumentRequestHandler based on the CSVRequestHandler
> that supports streaming a PDF, Word, PowerPoint, or Excel document into
> Solr.
> There is a wiki page with information here:
> http://wiki.apache.org/solr/UpdateRichDocuments
>