[
https://issues.apache.org/jira/browse/JAMES-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17862480#comment-17862480
]
aleksey edited comment on JAMES-4049 at 7/2/24 3:24 PM:
--------------------------------------------------------
Dear [~btellier], please advise me on the best way to proceed. While working
with Preview, I found that non-breaking spaces and similar invisible characters
are not processed. For example, if you take the
[^Message17199066550193266373.eml]
message and process it with `new Factory(new MessageContentExtractor(), new
JsoupHtmlTextExtractor())`, you get a string similar to 'Complete your
daily lesson, improve your English. NBSP NBSP NBSP NBSP NBSP ... NBSP' in
response. This forces me to remove the unwanted characters manually:
{code:java}
import lombok.NonNull;
import lombok.extern.slf4j.Slf4j;
import org.apache.james.jmap.draft.utils.JsoupHtmlTextExtractor;
import org.apache.james.mime4j.dom.Message;
import org.apache.james.util.html.HtmlTextExtractor;
import org.apache.james.util.mime.MessageContentExtractor;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.util.Optional;
import java.util.function.Predicate;

import static org.apache.james.jmap.api.model.Preview.Factory;

@Slf4j
@Service
public class PreviewServiceImpl implements PreviewService {
    private final Factory factory;

    public PreviewServiceImpl() {
        MessageContentExtractor messageContentExtractor = new MessageContentExtractor();
        HtmlTextExtractor htmlTextExtractor = new JsoupHtmlTextExtractor();
        this.factory = new Factory(messageContentExtractor, htmlTextExtractor);
    }

    @Override
    public Optional<Preview> getPreview(@NonNull Message message) {
        return Optional.of(message)
            .map(this::extractCandidateToPreview)
            .map(this::removeZeroWidthNonJoiner)
            .map(String::strip)
            .filter(Predicate.not(String::isEmpty))
            .map(Preview::new);
    }

    private String extractCandidateToPreview(Message message) {
        try {
            return factory.fromMime4JMessage(message).getValue();
        } catch (IOException e) {
            log.warn("Failed to extract preview from message", e);
            return "";
        }
    }

    /**
     * Removes ZERO WIDTH NON-JOINER characters
     *
     * @param text Text to check
     * @return Checked text
     */
    private String removeZeroWidthNonJoiner(String text) {
        return text.replaceAll("\\u200C", "");
    }
}
{code}
Do you think this is an issue with the Preview class or the
JsoupHtmlTextExtractor class? Based on your response, I will open another issue
for improvement.
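
For reference, the cleanup step can be reproduced without James at all. The sketch below is only an illustration of the workaround, not the behaviour of Preview or JsoupHtmlTextExtractor: the class name `PreviewTextCleaner` and the sample string are hypothetical. It removes ZERO WIDTH NON-JOINER (U+200C), turns NO-BREAK SPACE (U+00A0) into a plain space, and collapses whitespace runs. The NBSP replacement is needed because `String::strip` and the regex class `\s` do not treat U+00A0 as whitespace by default.

```java
import java.util.Optional;

// Hypothetical helper, not part of Apache James.
public class PreviewTextCleaner {
    // NO-BREAK SPACE (U+00A0)
    private static final char NBSP = '\u00A0';
    // ZERO WIDTH NON-JOINER (U+200C)
    private static final String ZWNJ = "\u200C";

    /**
     * Drops ZWNJ, converts NBSP to a regular space, collapses whitespace
     * runs to a single space and strips the ends. Returns an empty Optional
     * when nothing printable is left.
     */
    public static Optional<String> clean(String raw) {
        String cleaned = raw
            .replace(ZWNJ, "")
            .replace(NBSP, ' ')
            .replaceAll("\\s+", " ")
            .strip();
        return cleaned.isEmpty() ? Optional.empty() : Optional.of(cleaned);
    }

    public static void main(String[] args) {
        // Assumed sample resembling the extracted preview text.
        String raw = "Complete your daily lesson.\u00A0\u200C\u00A0\u200C \u00A0";
        System.out.println(clean(raw).orElse("<empty>"));
    }
}
```

Whether this normalisation belongs in Preview or in the HTML text extractor is exactly the open question above; the sketch takes no position on that.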
> Configurable length of the preview value
> ----------------------------------------
>
> Key: JAMES-4049
> URL: https://issues.apache.org/jira/browse/JAMES-4049
> Project: James Server
> Issue Type: Improvement
> Components: data
> Affects Versions: 3.6.0
> Reporter: aleksey
> Priority: Minor
> Attachments: Message17199066550193266373.eml
>
>
> Hello, dear friends! I am using your james-server-data-jmap project (v3.8.1)
> in my work. It would be great if you allowed customization of
> {code:java}
> org.apache.james.jmap.api.model.Preview {code}
> That is, if you replace the constant
> {code:java}
> private static final int MAX_LENGTH = 256; {code}
> with a configurable field.
> Thank you for your work!!
>
>
>
>
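
The request quoted above amounts to turning a compile-time constant into a value supplied at construction time. A minimal sketch of that idea follows, assuming a hypothetical class `ConfigurablePreview` that is not the actual `org.apache.james.jmap.api.model.Preview`; only the default of 256 is taken from the issue text.

```java
// Hypothetical illustration, not the James Preview class.
public final class ConfigurablePreview {
    // Default taken from the constant quoted in the issue.
    public static final int DEFAULT_MAX_LENGTH = 256;

    private final int maxLength;

    public ConfigurablePreview() {
        this(DEFAULT_MAX_LENGTH);
    }

    public ConfigurablePreview(int maxLength) {
        if (maxLength <= 0) {
            throw new IllegalArgumentException("maxLength must be positive");
        }
        this.maxLength = maxLength;
    }

    /**
     * Strips the text and cuts it to at most maxLength UTF-16 code units.
     */
    public String truncate(String text) {
        String stripped = text.strip();
        return stripped.length() <= maxLength
            ? stripped
            : stripped.substring(0, maxLength);
    }

    public int getMaxLength() {
        return maxLength;
    }
}
```

With this shape, `new ConfigurablePreview(64).truncate(body)` would give a shorter preview, while the no-argument constructor keeps the current limit.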
--
This message was sent by Atlassian Jira
(v8.20.10#820010)