1. Install opennlp on Debian or a derivative distribution.
2. Download the English sentence-detection model from
https://www.apache.org/dyn/closer.cgi/opennlp/models/ud-models-1.2/opennlp-en-ud-ewt-sentence-1.2-2.5.0.bin
3. Pipe your text through the following command (I use it in vim with the
":'<,'>!" prefix to convert the selected text; in vim, enter it as a single
line):

tr \\n " " | \
  java -Dorg.slf4j.simpleLogger.logFile=System.out \
  -cp /usr/share/java/opennlp-tools-2.5.3.jar:/usr/share/java/slf4j-api.jar:/usr/share/java/slf4j-simple.jar \
  opennlp.tools.cmdline.CLI SentenceDetector \
  ~/Downloads/opennlp-en-ud-ewt-sentence-1.2-2.5.0.bin | \
  grep SentenceDetectorTool | \
  sed "s/.*SentenceDetectorTool - \(.*\)/\1/"
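
To make the last two stages less opaque: the CLI appears to report each
detected sentence as a log line, which is why the logger is pointed at
System.out and the output is then filtered. grep keeps the lines coming from
SentenceDetectorTool, and sed strips everything up to the " - " separator.
A minimal sketch, where the sample log lines are only an assumption about
what slf4j-simple prints (the exact prefix may differ on your system) and
strip_log_prefix is a name made up for this illustration:

```shell
# Hypothetical sample of the logger output; the real prefix may differ.
sample='[main] INFO opennlp.tools.cmdline.sentdetect.SentenceDetectorTool - First sentence.
[main] INFO opennlp.tools.cmdline.sentdetect.SentenceDetectorTool - Second sentence.
[main] INFO opennlp.tools.cmdline.PerformanceMonitor - some statistics'

# Keep only the SentenceDetectorTool lines, then drop the logger prefix
# so that each remaining line is exactly one sentence.
strip_log_prefix() {
    grep SentenceDetectorTool | sed "s/.*SentenceDetectorTool - \(.*\)/\1/"
}

printf '%s\n' "$sample" | strip_log_prefix
```

With that sample, the two sentence lines come out without their prefix and
the third line is dropped.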

Adapting that as a git filter is left to the reader.
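
For readers who want a starting point, here is one possible sketch using
git's textconv mechanism, which runs a command on each file and diffs the
command's output instead of the file itself. The script name, the driver
name "nsnl", and the *.md pattern are all placeholders, the jar and model
paths are simply the ones from the command above, and I have not checked
this against a real repository:

```shell
#!/bin/sh
# Hypothetical helper, e.g. saved as nsnl-textconv and made executable.
# To enable it in a repository (names and paths are examples only):
#   git config diff.nsnl.textconv /path/to/nsnl-textconv
#   echo '*.md diff=nsnl' >> .gitattributes

# Jar and model locations copied from the command above; adjust as needed.
CP=/usr/share/java/opennlp-tools-2.5.3.jar:/usr/share/java/slf4j-api.jar:/usr/share/java/slf4j-simple.jar
MODEL="$HOME/Downloads/opennlp-en-ud-ewt-sentence-1.2-2.5.0.bin"

# Print the file named by $1 with one sentence per line.
nsnl_textconv() {
    tr \\n " " < "$1" |
        java -Dorg.slf4j.simpleLogger.logFile=System.out -cp "$CP" \
            opennlp.tools.cmdline.CLI SentenceDetector "$MODEL" |
        grep SentenceDetectorTool |
        sed "s/.*SentenceDetectorTool - \(.*\)/\1/"
}

# git passes the path of the file to convert as the only argument.
if [ "$#" -gt 0 ]; then
    nsnl_textconv "$1"
fi
```

Since textconv only changes what "git diff" displays, the files in the
repository stay untouched. As with the original command, all newlines are
turned into spaces first, so paragraph breaks are lost in the converted
view.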

On 4/24/25 9:05 AM, Marc Petit-Huguenin wrote:
> On 4/24/25 8:53 AM, Larry Masinter wrote:
>> Comparing sources is a separate workflow step. GitHub and git support
>> custom diff filters that are applied, not to edit the source but to make
>> the diff more meaningful.
>> Rather than talking about modifying the source and the work of everyone to
>> maintain some conventions like NSNL, make a git diff filter that produces
>> NSNL for those who want that when examining diffs.
> 
> Ah, I have been working on something like this this morning.  The issue is
> that using regexes alone does not really work; one needs to use NLP.  I think
> the simplest approach at this point is to write a small wrapper on top of
> opennlp (at least for OLPS).  Stay tuned.
> 
>> https://LarryMasinter.net <https://larrymasinter.net/>
>> https://interlisp.org
>>
>>
>> On Thu, Apr 24, 2025 at 8:03 AM Ted Lemon <mel...@fugue.com> wrote:
>>
>>> On Thu, Apr 24, 2025, at 6:13 AM, Michael Richardson wrote:
>>>
>>> The question remains, when the RPC edits text, whether XML or kramdown,
>>> ought they do a do-nothing pass where they change to NSNL?
>>> Assume that there is a tool to do this.
>>> If not, ought they at least attempt NSNL for any changes that *they* make?
>>> {I'd really like that part}
>>>
>>>
>>> Good god no. Why would you gratuitously make a change that would affect
>>> all subsequent diffs? Someone did this to the mDNSResponder sources back in
>>> ~2005 and now you can't do git blame on anything prior to that. Please, no
>>> gratuitous formatting changes to the source code.
>>>
> 


-- 
Marc Petit-Huguenin
Email: m...@petit-huguenin.org
Blog: https://medium.com/@petithug
Profile: https://www.linkedin.com/in/petithug

_______________________________________________
rfc-interest mailing list -- rfc-interest@rfc-editor.org
To unsubscribe send an email to rfc-interest-le...@rfc-editor.org