1. install opennlp in Debian or a derivative distribution.
2. download the English model at https://www.apache.org/dyn/closer.cgi/opennlp/models/ud-models-1.2/opennlp-en-ud-ewt-sentence-1.2-2.5.0.bin
3. pipe your text through the following code (I use that in vim with the ":'<,'>!" prefix to convert the selected text):
tr \\n " " | java -Dorg.slf4j.simpleLogger.logFile=System.out -cp /usr/share/java/opennlp-tools-2.5.3.jar:/usr/share/java/slf4j-api.jar:/usr/share/java/slf4j-simple.jar opennlp.tools.cmdline.CLI SentenceDetector ~/Downloads/opennlp-en-ud-ewt-sentence-1.2-2.5.0.bin | grep SentenceDetectorTool | sed "s/.*SentenceDetectorTool - \(.*\)/\1/"

Adapting that as a git filter is left to the reader.

On 4/24/25 9:05 AM, Marc Petit-Huguenin wrote:
> On 4/24/25 8:53 AM, Larry Masinter wrote:
>> comparing sources is a separate workflow step. GitHub and git support
>> custom diff filters that are applied, not to edit the source but to make
>> the diff more meaningful.
>> Rather than talking about modifying the source and the work of everyone to
>> maintain some conventions like NSNL, make a git diff filter that produces
>> NSNL for those who want that when examining diffs.
>
> Ah, I have been working on something like this morning. The issue is that
> just using regex does not really work, one needs to use NLP. I think the
> simplest at this point is to write a small wrapper on top of opennlp (at
> least for OLPS). Stay tuned.
>
>> https://LarryMasinter.net <https://larrymasinter.net/>
>> https://interlisp.org
>>
>>
>> On Thu, Apr 24, 2025 at 8:03 AM Ted Lemon <mel...@fugue.com> wrote:
>>
>>> On Thu, Apr 24, 2025, at 6:13 AM, Michael Richardson wrote:
>>>
>>> The question remains, when the RPC edits text, whether XML or kramdown,
>>> ought they do a do-nothing pass where they change to NSNL.
>>> Assume that there is a tool to do this.
>>> If not, ought they at least attempt NSNL for any changes that *they* make.
>>> {I'd really like that part}
>>>
>>>
>>> Good god no. Why would you gratuitously make a change that would affect
>>> all subsequent diffs? Someone did this to the mDNSResponder sources back in
>>> ~2005 and now you can't do git blame on anything prior to that. Please, no
>>> gratuitous formatting changes to the source code.
>>>
>

--
Marc Petit-Huguenin
Email: m...@petit-huguenin.org
Blog: https://medium.com/@petithug
Profile: https://www.linkedin.com/in/petithug
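
One way the git filter mentioned at the top of this message might look is a textconv driver: git runs such a program with a file path as its only argument and diffs the program's standard output instead of the raw file. The sketch below only wraps the pipeline from step 3. The function names (extract_sentences, nsnl_textconv), the script name nsnl-textconv, the NSNL_MODEL and NSNL_CP variables and the "nsnl" driver name are all hypothetical, and the jar and model paths are simply the ones used above, so treat it as a starting point and not a tested tool.

```shell
#!/bin/sh
# nsnl-textconv (hypothetical name): print a prose file one sentence per line,
# for use as a git textconv driver. Assumed setup, not verified on every system:
#   git config diff.nsnl.textconv /path/to/nsnl-textconv
#   echo '*.md diff=nsnl' >> .gitattributes

# Paths copied from the message; override through the environment if they differ.
NSNL_MODEL="${NSNL_MODEL:-$HOME/Downloads/opennlp-en-ud-ewt-sentence-1.2-2.5.0.bin}"
NSNL_CP="${NSNL_CP:-/usr/share/java/opennlp-tools-2.5.3.jar:/usr/share/java/slf4j-api.jar:/usr/share/java/slf4j-simple.jar}"

# Keep only the logger lines that carry a detected sentence and strip
# everything up to and including the "SentenceDetectorTool - " prefix.
extract_sentences() {
    grep SentenceDetectorTool | sed 's/.*SentenceDetectorTool - \(.*\)/\1/'
}

# Join the lines of file $1 into one, run the sentence detector on it,
# and print the sentences it reports.
nsnl_textconv() {
    tr '\n' ' ' < "$1" |
        java -Dorg.slf4j.simpleLogger.logFile=System.out -cp "$NSNL_CP" \
            opennlp.tools.cmdline.CLI SentenceDetector "$NSNL_MODEL" |
        extract_sentences
}

# git passes the path of the file to convert as the first argument.
if [ "$#" -gt 0 ]; then
    nsnl_textconv "$1"
fi
```

Because every newline is turned into a space before detection, paragraph breaks, lists and code blocks are flattened as well, so this is probably only useful on files that are mostly prose.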
_______________________________________________
rfc-interest mailing list -- rfc-interest@rfc-editor.org
To unsubscribe send an email to rfc-interest-le...@rfc-editor.org