On Fri, 19 Nov 2004 14:38:20 +0200, Hannu Krosing wrote:
>> Part of my current code concerns packing DNA characters: As the alphabet
>> of DNA strings is very small (four characters), it seems like a
>> straightforward optimization to store each character in two bits.
>
> My advice would be to get it to work first, optimize later.

Valid point. However, I needed something rather basic to work on, to get
to know C and to get to know PostgreSQL in a user-defined type context.
But if packing proves to be a problem when implementing the interesting
stuff, then thanks, and yes: packing should be an afterthought.

>> My first and most immediate goal is to support efficient answering of a
>> question like "which rows contain the sequence TTGACCACTTG in column foo?".
>
> If you store your sequences as strings, you may try to use trigrams (or
> modify them to 4,5,6 or 7-grams ;) to get some feel how that works.
>
> trigram module is in contrib/pg_trgm.

(/me Printing readme.) Thanks.

-- 
Greetings from Troels Arvin, Copenhagen, Denmark
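
For illustration, the two-bits-per-character packing discussed above could be sketched in plain C roughly as follows. This is only a standalone sketch under stated assumptions, not the actual user-defined type code: the names base_to_code, dna_pack and dna_unpack are hypothetical, the alphabet is assumed to be exactly the uppercase letters A, C, G and T, and everything PostgreSQL-specific (varlena headers, input/output functions) is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map a base to its 2-bit code; returns -1 for anything outside A, C, G, T.
 * (Hypothetical helper; the code assignment is an arbitrary choice.) */
int base_to_code(char base)
{
    switch (base) {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return -1;
    }
}

/* Pack n bases from seq into out, four bases per byte, lowest bits first.
 * out must have room for (n + 3) / 4 bytes.
 * Returns 0 on success, -1 if seq contains an unsupported character. */
int dna_pack(const char *seq, size_t n, uint8_t *out)
{
    memset(out, 0, (n + 3) / 4);
    for (size_t i = 0; i < n; i++) {
        int code = base_to_code(seq[i]);
        if (code < 0)
            return -1;
        out[i / 4] |= (uint8_t)(code << ((i % 4) * 2));
    }
    return 0;
}

/* Unpack n bases from packed into seq as a NUL-terminated string.
 * seq must have room for n + 1 bytes. */
void dna_unpack(const uint8_t *packed, size_t n, char *seq)
{
    static const char bases[] = "ACGT";
    for (size_t i = 0; i < n; i++)
        seq[i] = bases[(packed[i / 4] >> ((i % 4) * 2)) & 3];
    seq[n] = '\0';
}
```

Note that the sequence length has to be stored separately, since the last byte may be only partly used. Substring searches on the packed form are also more awkward than on plain strings, which is one reason to treat packing as an afterthought.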