[issue2027] Module containing C implementations of common text algorithms

2010-09-20 Thread Matthew Barnett

Matthew Barnett added the comment:

I've started on a module called 'texttools'. So far it has Levenshtein and 
Porter (both coded in C).

If there's interest I'll put it on PyPI.

Suggestions for other additions?

--
nosy: +mrabarnett

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue2027] Module containing C implementations of common text algorithms

2010-09-20 Thread Benjamin Peterson

Changes by Benjamin Peterson:


--
resolution:  -> rejected
status: pending -> closed




[issue2027] Module containing C implementations of common text algorithms

2010-09-20 Thread Mark Lawrence

Mark Lawrence added the comment:

I'll close this as suggested in msg106281 in a couple of weeks unless someone 
objects.

--
nosy: +BreamoreBoy
status: open -> pending




[issue2027] Module containing C implementations of common text algorithms

2010-05-21 Thread STINNER Victor

STINNER Victor added the comment:

Before having an optimized version of common text algorithms, why not start 
with a Python version? Writing and maintaining C code is harder, and I'm not 
sure that performance is critical for such algorithms.

This issue has no patch: if nobody provides a patch, I will close it because I 
agree with Amaury and Christian (this issue can be solved by a 3rd party 
module: such a module can be written in C).

--
nosy: +haypo




[issue2027] Module containing C implementations of common text algorithms

2009-05-16 Thread Daniel Diniz

Changes by Daniel Diniz:


--
components: +Extension Modules, Unicode -Library (Lib)
priority: normal -> low
stage:  -> test needed
versions: +Python 2.7, Python 3.2 -Python 2.6




[issue2027] Module containing C implementations of common text algorithms

2008-02-08 Thread Georg Brandl

Georg Brandl added the comment:

Even PHP includes Levenshtein... ;)

--
nosy: +georg.brandl




[issue2027] Module containing C implementations of common text algorithms

2008-02-07 Thread Matt Chaput

Matt Chaput added the comment:

The Porter stemming and Levenshtein edit-distance algorithms are not
"fast-moving" nor are they fusion reactors... they've been around
forever, and are simple to implement, but are still useful in various
common scenarios. I'd say this is similar to Python including an
implementation of digest functions such as SHA: it's useful enough, and
compute-intensive enough, to warrant a C implementation. Shipping C
extensions is not an option for everyone; it's especially a pain with
Windows.




[issue2027] Module containing C implementations of common text algorithms

2008-02-07 Thread Christian Heimes

Christian Heimes added the comment:

I agree with Amaury. Python uses the slogan "batteries included" and not
"fusion reactor included". We cannot and never will include every library
that may be useful for some users. Python core's development cycles are
too slow for fast-moving software. Andreas' TXNG3 contains fine
implementations for stemming and Levenshtein.

--
nosy: +tiran
priority:  -> normal
versions: +Python 2.6




[issue2027] Module containing C implementations of common text algorithms

2008-02-07 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc added the comment:

I don't think that this should be part of the core standard library.
Did you look at the TextIndexNG project?
http://opensource.zopyx.com/projects/TextIndexNG3/

--
nosy: +amaury.forgeotdarc




[issue2027] Module containing C implementations of common text algorithms

2008-02-06 Thread Matt Chaput

New submission from Matt Chaput:

Add a module to the standard library containing fast (C) implementations 
of common text/language-related algorithms, beginning specifically with 
Porter (and perhaps other) stemming and Levenshtein (and perhaps other) 
edit distance. Both of these algorithms are useful in multiple domains, 
well known and understood, and have sample implementations all over the 
Web, but are compute-intensive and prohibitively expensive when 
implemented in pure Python.
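
For reference, here is a minimal pure-Python sketch of the Levenshtein edit 
distance discussed in this request. It is not taken from any message in this 
thread, and the function name levenshtein is only an illustrative choice. 
It uses the usual two-row dynamic-programming table, so the running time 
grows with len(a) * len(b), which is roughly the cost being described when 
many pairs of strings have to be compared.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the Levenshtein edit distance between a and b.

    Illustrative sketch only; the name and signature are assumptions.
    """
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the empty prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))

    for i, ch_a in enumerate(a, start=1):
        # Distance between the first i characters of a and the empty string.
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current

    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Every cell of the table is computed in interpreted Python, so a C 
implementation of the same loop would be expected to run considerably 
faster, although no measurements are given in this thread.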

--
components: Library (Lib)
messages: 62134
nosy: mchaput
severity: normal
status: open
title: Module containing C implementations of common text algorithms
type: rfe
