[ https://issues.apache.org/jira/browse/CODEC-132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224729#comment-13224729 ]
Thomas Neidhart commented on CODEC-132:
---------------------------------------
@Matthew: thanks for your feedback. I have had some experience with similar
rule-based systems before, and knew that they can become very fragile with
unforeseen input as the number of rules grows (especially generic ones). Anyway,
your code was easy to debug and nice to read!
@Robert: your tests make perfect sense imo, thanks for reporting back.
@Gary: thanks for applying the patch.
> BeiderMorseEncoder OOM issues
> -----------------------------
>
> Key: CODEC-132
> URL: https://issues.apache.org/jira/browse/CODEC-132
> Project: Commons Codec
> Issue Type: Bug
> Affects Versions: 1.6
> Reporter: Robert Muir
> Fix For: 1.6.1
>
> Attachments: CODEC-132.patch, CODEC-132_test.patch
>
>
> In Lucene/Solr, we integrated this encoder into the latest release.
> Our tests use a variety of random strings, and we have recent Jenkins failures
> from some input streams (of length <= 10) using huge amounts of memory
> (e.g. > 64MB), resulting in OOM.
> I've created a test case (length is 30 here) that will OOM with -Xmx256M.
> I haven't dug much into what's causing it, but I suspect there might be a bug
> revolving around certain punctuation characters: we didn't see this happening
> until we beefed up our random string generation to start producing "html-like"
> strings.