ANN: WordSegment 0.6.1 Released

Grant Jenks Wed, 16 Sep 2015 00:36:13 -0700

Announcing the Release of WordSegment Version 0.6.1

What is WordSegment?
--------------------


WordSegment is an Apache2 licensed module for English word segmentation,
written in pure-Python, and based on a trillion-word corpus. It is based on
code from the chapter “Natural Language Corpus Data” by Peter Norvig in the
book “Beautiful Data” (Segaran and Hammerbacher, 2009). Data files are derived
from the Google Web Trillion Word Corpus. The module has 100% code coverage and
complete documentation.

What's new in 0.6.1?
--------------------

- Exposed TOTAL constant representing the count of all unigrams in the corpus.
  Defaults to 1,024,908,267,229.
- Added documentation on how to use a different corpus:
  http://www.grantjenks.com/docs/wordsegment/using-a-different-corpus.html
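To illustrate why a total unigram count matters when swapping in a different
corpus, here is a minimal, self-contained sketch. It does not use the
WordSegment API: the names build_unigrams, score, segment and SAMPLE_TEXT are
hypothetical, the sample text is invented, and only unigrams are scored. The
unknown-word penalty is modeled loosely on the approach in Norvig's chapter.
The point is only that word probabilities are counts divided by the corpus
total, so the total has to change whenever the counts do.

```python
import math
import re
from collections import Counter
from functools import lru_cache

# Invented sample text standing in for "a different corpus" (assumption).
SAMPLE_TEXT = "this is a test this is only a test a test is a test"


def build_unigrams(text):
    """Count lowercase alphabetic tokens; return (counts, total)."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    total = float(sum(counts.values()))
    return counts, total


def score(word, counts, total):
    """Probability of a word: count / total, with a length penalty if unseen."""
    if word in counts:
        return counts[word] / total
    # Unseen words get a small probability that shrinks with their length.
    return 10.0 / (total * 10 ** len(word))


def segment(text, counts, total, max_word_len=20):
    """Split text into the word sequence with the highest total log-probability."""

    @lru_cache(maxsize=None)
    def best(rest):
        if not rest:
            return 0.0, ()
        candidates = []
        for i in range(1, min(len(rest), max_word_len) + 1):
            head, tail = rest[:i], rest[i:]
            tail_score, tail_words = best(tail)
            head_score = math.log10(score(head, counts, total))
            candidates.append((head_score + tail_score, (head,) + tail_words))
        return max(candidates)

    return list(best(text)[1])


counts, total = build_unigrams(SAMPLE_TEXT)
print(total)
print(segment("thisisatest", counts, total))
```

With a real replacement corpus, the counts and the total would both come from
that corpus; reusing the default total with different counts would skew every
probability by the same wrong factor.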

Links
-----

- Documentation: http://www.grantjenks.com/docs/wordsegment/
- Download: https://pypi.python.org/pypi/wordsegment
- Source: https://github.com/grantjenks/wordsegment
- Issues: https://github.com/grantjenks/wordsegment/issues

This release is backwards-compatible. Please upgrade.
-- 
https://mail.python.org/mailman/listinfo/python-announce-list

        Support the Python Software Foundation:
        http://www.python.org/psf/donations/
