Hi,
I'm trying to use PyTables to do fast queries on text corpora. These
are plain text files with millions of sentences. Because the sentences
are not only in English but also in other languages, such as Spanish,
I also need UTF-8 encoding. A simple table description would look
like this:

token: variable-length unicode
sentence: variable-length unicode

The 'token' column should have an index to allow fast queries on it. I
don't even need to modify the table once it's loaded. All I care about
is the querying speed.
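
To make the access pattern concrete, here is a rough plain-Python
sketch (no PyTables involved) of the lookup I would like the indexed
'token' column to make fast. The function name, the whitespace
tokenization and the three sample sentences are just assumptions for
illustration; the real corpus is far too large to hold in a dict like
this.

```python
from collections import defaultdict


def build_token_index(sentences):
    """Map each whitespace-separated token to the sentences containing it.

    Hypothetical helper: splitting on whitespace stands in for whatever
    tokenizer the real corpus would need.
    """
    index = defaultdict(list)
    for sentence in sentences:
        # set() so a sentence is listed once per token, even if the
        # token occurs several times in that sentence.
        for token in set(sentence.split()):
            index[token].append(sentence)
    return index


# Tiny made-up corpus mixing English and Spanish (non-ASCII included).
corpus = [
    "the cat sleeps",
    "el niño duerme",
    "el gato duerme",
]

index = build_token_index(corpus)

# The query I care about: all sentences that contain a given token.
matches = index.get("duerme", [])
```

In table terms, each (token, sentence) pair above would be one row,
and the dict lookup corresponds to a query on the indexed 'token'
column.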

Maybe I'm missing something obvious here, but from the docs it seems
that the only way to use unicode in PyTables is through the
VLUnicodeAtom class. However, if I use it in the description passed to
a createTable call, I get the following error:

TypeError: Passing an incorrect value to a table column. Expected a
Col (or subclass) instance and got: "VLUnicodeAtom()". Please make use
of the Col(), or descendant, constructor to properly initialize
columns.

Could you please provide a skeleton showing how PyTables would work
with such a table?
Thanks!
-- 
 Hector

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
