[ 
https://issues.apache.org/jira/browse/WHIMSY-404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752194#comment-17752194
 ] 

Sebb commented on WHIMSY-404:
-----------------------------

The most recent case was caused by characters encoded in a scheme other than 
UTF-8.
The bytes were valid in their original encoding, but not valid as UTF-8.

As such, the characters needed replacement, not deletion.

Given that this hardly ever occurs, I am not sure that it is worth the effort 
of doing anything other than replacing the bad characters with the relevant 
marker and flagging up that the agenda needs to be corrected at some point.
It is easy enough to search for this character in the source, and usually the 
correct replacement will be obvious (all of the recent issues were caused by 
accents in names, apart from one spurious symbol).
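
As a rough sketch of that search (the helper name is made up for illustration 
and is not taken from the Whimsy code), something like the following lists the 
line numbers that contain the marker:

```ruby
# Hypothetical helper: report each line of the text that contains the
# Unicode replacement character (U+FFFD), together with its line number.
MARKER = "\uFFFD"

def lines_with_marker(text, marker = MARKER)
  text.each_line.with_index(1)
      .select { |line, _number| line.include?(marker) }
      .map { |line, number| [number, line.chomp] }
end

# Made-up sample text; the second line has a damaged accent.
sample = "Chair: A. Person\nReport by Andr\uFFFD Example\nNo issues.\n"

lines_with_marker(sample).each do |number, line|
  puts "line #{number}: #{line}"
end
```

The output only points at where to look; deciding what the original character 
was is still a manual step.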

There is no way of knowing for sure what the original encoding of each 
character was, so in general it is not possible to automate replacement. Indeed 
if several edits are made outside Whimsy, multiple encodings may be involved. 
If this were a frequent occurrence, it might be worth trying to develop some 
heuristics to automate the process, but I suspect that would be quite a lot of 
effort, and not guaranteed to work.
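
To illustrate why the original encoding cannot be recovered from the bytes 
alone, here is a small sketch. The three candidate encodings are chosen purely 
as examples; nothing in this issue says which encodings were actually involved.

```ruby
# One stray byte, as it might arrive in an agenda edited outside Whimsy.
byte = "\xE9".b

# Candidate encodings picked only for illustration.
candidates = %w[ISO-8859-1 ISO-8859-5 ISO-8859-7]

# Decode the same byte under each candidate and convert the result to UTF-8.
decoded = candidates.map do |name|
  [name, byte.dup.force_encoding(name).encode(Encoding::UTF_8)]
end.to_h

decoded.each { |name, char| puts "#{name}: #{char}" }
```

Each candidate yields a different, perfectly valid character (a Latin, a 
Cyrillic and a Greek letter respectively), so any automatic choice would be a 
guess.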

Also note that the Whimsy agenda app is slated to be retired at some point, so 
non-essential work on it is likely to be wasted effort.

I think the minimum that needs to be done is to replace invalid byte sequences 
with the standard Unicode replacement character, � (U+FFFD), so that edits no 
longer cause a crash.
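
A minimal sketch of that replacement, using Ruby's built-in String#scrub. The 
helper name and the sample input are assumptions for illustration; this is not 
the actual Whimsy code.

```ruby
# Hypothetical helper: make any incoming text safe to treat as UTF-8.
REPLACEMENT = "\uFFFD"

def sanitize_utf8(text)
  # Label the bytes as UTF-8 without transcoding them, then replace each
  # invalid byte sequence with the replacement character.
  text.dup.force_encoding(Encoding::UTF_8).scrub(REPLACEMENT)
end

# "é" stored as a single ISO-8859-1 byte, which is not valid UTF-8.
raw = "Andr\xE9 Example".b
clean = sanitize_utf8(raw)

puts clean.valid_encoding?
puts clean
```

Text that is already valid UTF-8 passes through unchanged, so the helper can be 
applied unconditionally before any operation that would otherwise raise 
ArgumentError.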

It would be sensible to provide some sort of notification when this has 
happened, so that the required manual correction(s) can be made at some point.
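
One possible shape for this, again only a sketch with a made-up helper name: 
count the replacements while scrubbing, and let the caller decide how to 
report them.

```ruby
# Hypothetical helper: scrub the text and report how many invalid byte
# sequences had to be replaced, so the caller can raise a notification.
def scrub_with_report(text)
  utf8 = text.dup.force_encoding(Encoding::UTF_8)
  return [utf8, 0] if utf8.valid_encoding?

  count = 0
  clean = utf8.scrub do |_bad_bytes|
    count += 1
    "\uFFFD" # replacement character for each invalid sequence
  end
  [clean, count]
end

# Made-up sample with two ISO-8859-1 bytes that are invalid as UTF-8.
clean, replaced = scrub_with_report("Jos\xE9 and Ren\xE9e".b)
warn "#{replaced} invalid character(s) replaced; please correct the agenda" if replaced > 0
```

How the warning reaches the editor (a log entry, a message in the UI, an 
email) is left open here.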

> Can't post project report.
> --------------------------
>
>                 Key: WHIMSY-404
>                 URL: https://issues.apache.org/jira/browse/WHIMSY-404
>             Project: Whimsy
>          Issue Type: Bug
>          Components: BoardAgenda
>            Reporter: James Bognar
>            Priority: Major
>
> Trying to save project report gives this error message:
> Exception
> #<ArgumentError: invalid byte sequence in UTF-8>
> Here's the report we're trying to submit.
> https://whimsy.apache.org/board/agenda/2023-08-16/Juneau



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
