[ 
https://issues.apache.org/jira/browse/CONNECTORS-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970111#comment-15970111
 ] 

Karl Wright commented on CONNECTORS-1410:
-----------------------------------------

[~kamaci] I think having the BODY present twice in the indexed document is 
confusing and unnecessary.  Since we've already broken backwards compatibility 
for this connector, if we're going to index the body as the main document 
content, I think we might as well remove the body from all consideration for 
the metadata.  What do you think?


> Binary Attachment Data as Plain Text at Email Content
> -----------------------------------------------------
>
>                 Key: CONNECTORS-1410
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1410
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Email connector
>    Affects Versions: ManifoldCF 2.6
>            Reporter: Furkan KAMACI
>            Assignee: Furkan KAMACI
>             Fix For: ManifoldCF 2.8
>
>         Attachments: CONNECTORS-1410.patch, CONNECTORS-1410.patch
>
>
> Previously, we were indexing e-mails and its attachments together. We changed 
> this logic with CONNECTORS-1375 as indexing e-mail and its attachments 
> separately.
> However, there is a problem. Content fields of emails which has attachment(s) 
> includes both body and attachments's binary content as plain text.
> As we index attachments separately, we can just index body as content instead 
> of appending email body and all attachments' binary data as plain text.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to