Hi, I'm trying to use PyTables to run fast queries on text corpora. These are plain-text files with millions of sentences. Because the sentences are not only in English but also in other languages, such as Spanish, I also need UTF-8 encoding. A simple table description would look like this:

    token: variable-length unicode
    sentence: variable-length unicode

The 'token' column should have an index to allow fast queries on it. I don't even need to modify the table once it's loaded; all I care about is query speed.

Maybe I'm missing something obvious here, but from the docs it seems that the only way to use unicode in PyTables is the VLUnicodeAtom class. If I use it in the description passed to createTable, I get the following error:

    TypeError: Passing an incorrect value to a table column. Expected a Col (or subclass) instance and got: "VLUnicodeAtom()". Please make use of the Col(), or descendant, constructor to properly initialize columns.

Could you please provide a skeleton showing how PyTables works with such a table?

Thanks!

-- Hector

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
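
One layout that might fit the description above, given as a rough sketch rather than a definitive answer: Table columns have to be fixed-size Col instances, which is why VLUnicodeAtom is rejected there, so the sketch keeps the sentences in a separate VLArray and stores each token as UTF-8 bytes in a fixed-width StringCol next to the sentence's row number. The 64-byte token width, the node names `tokens` and `sentences`, whitespace tokenization, and the helper names `build_corpus` and `find_sentences` are all assumptions made for illustration. The calls use the PyTables 3.x snake_case names (create_table rather than createTable).

```python
import tables


class TokenRow(tables.IsDescription):
    # Assumed maximum of 64 UTF-8 bytes per token; longer tokens get truncated.
    token = tables.StringCol(64, pos=0)
    # Row number of the owning sentence in the 'sentences' VLArray.
    sent_id = tables.Int64Col(pos=1)


def build_corpus(path, sentences):
    """Write sentences to a VLArray and an indexed token table."""
    with tables.open_file(path, mode="w") as h5:
        sents = h5.create_vlarray(h5.root, "sentences", tables.VLUnicodeAtom())
        table = h5.create_table(h5.root, "tokens", TokenRow)
        row = table.row
        for sent_id, sentence in enumerate(sentences):
            sents.append(sentence)
            # Naive whitespace tokenization, one row per distinct token.
            for token in sorted(set(sentence.split())):
                row["token"] = token.encode("utf-8")
                row["sent_id"] = sent_id
                row.append()
        table.flush()
        # Index the token column once, after loading.
        table.cols.token.create_index()


def find_sentences(path, token):
    """Return the sentences containing the token, in corpus order."""
    key = token.encode("utf-8")
    with tables.open_file(path, mode="r") as h5:
        hits = h5.root.tokens.where("token == key", condvars={"key": key})
        ids = sorted(int(r["sent_id"]) for r in hits)
        return [h5.root.sentences[i] for i in ids]
```

Calling `build_corpus("corpus.h5", sentences)` once and then `find_sentences("corpus.h5", "niño")` would be the intended usage. Storing sentences once in the VLArray avoids repeating each sentence for every token it contains, at the cost of a second lookup per hit.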