Hello SQLite community,

        I am researching text categorization based on N-grams at my
university. The main goal of the project is to offer an inexpensive
way to categorize texts based on prior training. Categorization need
not be limited to the language of a text; my hypothesis is that it
could also work well for categorizing texts by subject. The project
is at the "alpha" stage and is written in PHP for many reasons, the
most important being that PHP is easier to write and debug than C.

        The previous paragraph is only an introduction to what I want
to propose to the SQLite community; it is *not* an advertisement or
spam.

        I think that SQLite is a good database and very useful for
many tasks. I think its simple and effective architecture is what
made it popular and useful for developers. For these reasons I am
thinking of writing an extension for SQLite that adds a text
categorization module. You may be wondering *why*, or how that could
be useful for you.

        I think I have the answer. Because N-gram based text
categorization works well for *long texts* and is independent of the
human language used, it offers us (developers) a way to give SQLite
the "knowledge" and "power" to categorize text automatically,
provided of course that each possible category has first been trained
with a set of examples. This would be useful for building a system
that categorizes newspaper articles or organizes a library.
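
        To make the idea concrete, here is a minimal sketch, written
in Python only to keep it short (it is not the PHP project code and
not a real SQLite extension). It assumes one common form of N-gram
categorization: each category is reduced to a ranked profile of its
most frequent character N-grams, and a text is assigned to the
category whose profile is closest by rank ("out-of-place") distance.
The names ngram_profile, out_of_place, Categorizer, the SQL function
categorize() and the articles table are all hypothetical; only
sqlite3.create_function() is an existing Python API.

```python
import sqlite3
from collections import Counter

PROFILE_SIZE = 300  # assumed cut-off; also the penalty for an unseen n-gram


def ngram_profile(text, n_max=3, top=PROFILE_SIZE):
    """Map each of the `top` most frequent character n-grams to its rank."""
    counts = Counter()
    for word in text.lower().split():
        padded = "_" + word + "_"  # mark word boundaries
        for n in range(1, n_max + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Most frequent first; ties broken alphabetically so ranks are stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top]
    return {gram: rank for rank, (gram, _count) in enumerate(ranked)}


def out_of_place(doc_profile, cat_profile):
    """Sum of rank differences; n-grams unknown to the category cost the maximum."""
    total = 0
    for gram, rank in doc_profile.items():
        if gram in cat_profile:
            total += abs(rank - cat_profile[gram])
        else:
            total += PROFILE_SIZE
    return total


class Categorizer:
    """Hypothetical trainer/classifier holding one profile per category."""

    def __init__(self):
        self.profiles = {}

    def train(self, category, examples):
        self.profiles[category] = ngram_profile(" ".join(examples))

    def categorize(self, text):
        if not text or not self.profiles:
            return None  # SQL NULL in, SQL NULL out
        doc = ngram_profile(text)
        return min(self.profiles,
                   key=lambda c: out_of_place(doc, self.profiles[c]))


def demo():
    cat = Categorizer()
    cat.train("english", ["the quick brown fox jumps over the lazy dog and "
                          "then the dog sleeps in the house while the rain "
                          "falls on the roof"])
    cat.train("spanish", ["el rápido zorro marrón salta sobre el perro "
                          "perezoso y luego el perro duerme en la casa "
                          "mientras la lluvia cae sobre el techo"])

    db = sqlite3.connect(":memory:")
    # Expose the classifier to SQL as a one-argument scalar function.
    db.create_function("categorize", 1, cat.categorize)
    db.execute("CREATE TABLE articles (body TEXT)")
    db.executemany("INSERT INTO articles (body) VALUES (?)",
                   [("the dog sleeps in the house",),
                    ("el perro duerme en la casa",)])
    return db.execute(
        "SELECT categorize(body) FROM articles ORDER BY rowid").fetchall()


if __name__ == "__main__":
    print(demo())
```

        With something like this in place, categorizing stored texts
becomes an ordinary query such as SELECT categorize(body) FROM
articles. The training texts above are far too short for real use and
only illustrate the mechanism.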

        My experiments are not finished, but first of all I want to
know whether this is a good idea and, of course, whether it would be
useful for the SQLite community.

-- 
Cesar D. Rodas
http://www.cesarodas.com/
Mobile Phone: 595 961 974165
Phone: 595 21 645590
[EMAIL PROTECTED]
[EMAIL PROTECTED]
