Re: [Pytables-users] pytables for text corpora

2012-01-23 Thread Francesc Alted
2012/1/23 Hector
> On Mon, Jan 23, 2012 at 3:01 AM, Francesc Alted wrote:
> > 2012/1/23 Hector
> >> I guess the only reason why I'm thinking of a Table is to be able to
> >> create an index on one of its columns and then be able to do fast
> >> queries. If I could also do fast queries on

Re: [Pytables-users] pytables for text corpora

2012-01-23 Thread Hector
On Mon, Jan 23, 2012 at 3:01 AM, Francesc Alted wrote:
> 2012/1/23 Hector
>> I guess the only reason why I'm thinking of a Table is to be able to
>> create an index on one of its columns and then be able to do fast
>> queries. If I could also do fast queries on a VLArray that would be
>> great

Re: [Pytables-users] pytables for text corpora

2012-01-23 Thread Francesc Alted
2012/1/23 Hector
> I guess the only reason why I'm thinking of a Table is to be able to
> create an index on one of its columns and then be able to do fast
> queries. If I could also do fast queries on a VLArray that would be
> great too! Is this actually possible?

Ok. What if you use a VLArr

Re: [Pytables-users] pytables for text corpora

2012-01-22 Thread Hector
On Mon, Jan 23, 2012 at 2:32 AM, Francesc Alted wrote:
> Unfortunately, Unicode columns in Table objects are not supported.

Oh, I see.

> Storing sentences in a VLArray seems like a more sensible approach. Could you tell
> us why are you after using a Table?
> --
> Francesc Alted

I gues

Re: [Pytables-users] pytables for text corpora

2012-01-22 Thread Francesc Alted
Hola Hector,

2012/1/23 Hector
> Hi,
> I'm trying to use pytables to do fast queries on text corpora. These
> are plain text files with millions of sentences. Because the sentences
> are not only in English but in other languages, like Spanish, I need
> to also use utf-8 encoding. A simple table

[Pytables-users] pytables for text corpora

2012-01-22 Thread Hector
Hi, I'm trying to use pytables to do fast queries on text corpora. These are plain text files with millions of sentences. Because the sentences are not only in English but in other languages, like Spanish, I need to also use utf-8 encoding. A simple table description should look like this: token:
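
A side note on the utf-8 point in the original post: if the tokens were stored as encoded bytes (one possible workaround, not something proposed in the thread), a fixed-width column would have to be sized in bytes rather than characters. Here is a minimal standard-library sketch of that difference. The sample tokens and variable names are made up for illustration, and no PyTables calls are used.

```python
# Hypothetical sample tokens (not taken from the thread); two contain
# non-ASCII Spanish characters.
tokens = ["niño", "corazón", "the"]

# Encode each token to UTF-8 bytes, as one would before writing it into a
# byte-string field.
encoded = [t.encode("utf-8") for t in tokens]

# Length in characters vs. length in bytes: accented letters take two bytes.
char_lengths = [len(t) for t in tokens]
byte_lengths = [len(b) for b in encoded]

# A fixed-width byte field must be at least as wide as the longest encoding.
width = max(byte_lengths)

# Decoding restores the original text exactly.
decoded = [b.decode("utf-8") for b in encoded]

for token, n_chars, n_bytes in zip(tokens, char_lengths, byte_lengths):
    print(f"{token!r}: {n_chars} characters, {n_bytes} bytes")
print("minimum field width in bytes:", width)
```

Sizing a field by character count would truncate "corazón" here, which is why the byte length is the number to look at.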