[PERFORM] Hi!
Hi, I am trying to do a simple installation of postgresql-7.4.3. I ran the ./configure command in the PostgreSQL source directory. During the configure process I received the following messages:

checking for tar... /bin/tar
checking for strip... strip
checking whether it is possible to strip libraries... yes
checking for bison... bison -y
*** The installed version of Bison is too old. PostgreSQL needs
*** Bison version 1.875 or later.
checking for perl... /usr/bin/perl
checking for main in -lbsd... no
checking for setproctitle in -lutil... no
checking for main in -lm... yes

After this message the configure process continues, but the installation never finishes. I think the configure process is stuck in a loop. Could this have anything to do with my hardware configuration? The computer where I ran this is an AMD K6-II 200 MHz with 64 MB of memory. I would like to know why the configure process never finishes. Regards, Janio
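The Bison warning above is the real problem to chase down first. A minimal sketch of checking an installed version against PostgreSQL's stated minimum (1.875), using `sort -V` (a GNU coreutils extension); the `installed` value here is a hypothetical old version, e.g. as reported by `bison --version`:

```shell
required="1.875"
installed="1.28"   # hypothetical: substitute the output of `bison --version`

# sort -V orders version strings numerically; if the required version
# sorts first, the installed one is new enough.
if [ "$(printf '%s\n' "$required" "$installed" | sort -V | head -n1)" = "$required" ]; then
  echo "Bison $installed is new enough"
else
  echo "Bison $installed is too old; need >= $required"
fi
```

With a sufficiently new Bison on the PATH, configure should get past this check rather than merely warning.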
[PERFORM] Indexing question
Hello all, I have a rather large table (~20 GB) of network logs taken over the period of a month and stored under postgres 7.3. I'm trying to create an indexing scheme so that I can do a query with "where time > foo and time < bar" and get the results without too much waiting. With this in mind, I created a new database ordering each log entry by the time values (seconds, and microseconds), and then created an index on the time value for seconds. I didn't use the clusterdb command, but I did do a "select * into foo_tbl order by time", which I understand to be the functional equivalent. The result looks something like this:

         Table "public.allflow_tv_mydoom"
   Column   |  Type   | Modifiers
------------+---------+-----------
 tv_s       | bigint  |
 tv_us      | bigint  |
 src        | inet    |
 dst        | inet    |
 sport      | integer |
 dport      | integer |
 bytes_in   | bigint  |
 bytes_out  | bigint  |
 isn_in     | bigint  |
 isn_out    | bigint  |
 num_starts | integer |
 result     | integer |
 flags      | integer |
 age_s      | bigint  |
 age_us     | bigint  |
 ttl_left   | bigint  |
 end_s      | bigint  |
 end_us     | bigint  |
Indexes: allflow_tv_mydoom_x btree (tv_s)

with tv_s, of course, being my value in seconds. I followed this up with an ANALYZE so postgres could absorb this structure into its logic. However, whenever I want to do my query, it *insists* on doing a sequential scan, even when I explicitly turn sequential scans off in postgresql.conf. Here's an example:

standb=# explain select * from allflow_tv_mydoom where tv_s > 1074200099 and tv_s < 107506499;
                                QUERY PLAN
---------------------------------------------------------------------------
 Seq Scan on allflow_tv_mydoom  (cost=1.00..102303307.94 rows=1 width=132)
   Filter: ((tv_s > 1074200099) AND (tv_s < 107506499))
(2 rows)

In this query I'm basically asking for a day's worth of data. It should be straightforward: it's all laid out in order and indexed, but it still insists on (what I believe to be) tearing through the entire DB and filtering out the other time values as it sees them.
Even if I want just the data from a particular second, it does the same thing:

standb=# explain select * from allflow_tv_mydoom where tv_s = 1074200099;
                                QUERY PLAN
---------------------------------------------------------------------------
 Seq Scan on allflow_tv_mydoom  (cost=1.00..102140682.45 rows=145 width=132)
   Filter: (tv_s = 1074200099)
(2 rows)

Naturally, this is incredibly frustrating because it takes forever, regardless of the query. The funny thing, though, is that I have the same data in another table where it is ordered and indexed by IP address, and those queries use the index and work the way I want them to. Example:

standb=# explain select * from flow_ip_mydoom where src = '10.0.5.5';
                                QUERY PLAN
---------------------------------------------------------------------------
 Index Scan using flow_ip_mydoom_x on flow_ip_mydoom  (cost=0.00..333.41 rows=376 width=132)
   Index Cond: (src = '10.0.5.5'::inet)
(2 rows)

Can someone please explain to me what's going on and how to fix it? If there's an easier way to 'jump' to different time regions in the table without indexing all the different seconds values, I'd like to know that as well. Thanks a bunch, -S

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
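As an aside on the poster's "turn sequential scan off" attempt: the planner setting involved is `enable_seqscan`, and it can be toggled per session rather than in the config file. A minimal sketch using the table name from the post:

```sql
-- Disable seqscans for this session only, as a diagnostic: if EXPLAIN
-- still shows a Seq Scan afterwards, the planner *cannot* use the index
-- for the query as written, rather than merely preferring not to.
SET enable_seqscan = off;
EXPLAIN SELECT * FROM allflow_tv_mydoom WHERE tv_s = 1074200099;
```

When `enable_seqscan = off` only inflates the seqscan cost estimate (the huge `cost=1.00..102303307.94` figures above are the giveaway) but the plan stays a Seq Scan, that points at an indexability problem with the query itself.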
Re: [PERFORM] 50 000 000 Table entries and I have to do a keyword search HELP NEEDED
The words for the keyword search can be made up of a sentence, i.e. 10 or more keywords to one entry. Also, in case I didn't answer before, we are using TSearch2 and all tables have been fully analyzed and indexed. Any other suggestions? How long do searches take when 10 000 rows are returned? We cannot use a limit of 100 because we need to analyze the entire data set returned. Thanks,

----- Original Message -----
From: Jeremy Dunn
To: 'borajetta'
Cc: Postgresql Performance
Sent: Tuesday, June 15, 2004 5:43 AM
Subject: RE: [PERFORM] 50 000 000 Table entries and I have to do a keyword search HELP NEEDED

One option that does not take advantage of any fancy indexing methods is to create a trigger on the table, on insert/update/delete, which extracts each individual word from the field you care about and creates an entry in another 'keyword' table: id = 'word', value = pk of your original table. Then index the keyword table on the 'keyword' field, and do your searches from there. This should improve performance substantially, even on very large return sets, because the keyword table rows are very small and thus a lot of them fit in a disk block.
- Jeremy

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of borajetta
Sent: Monday, June 07, 2004 5:47 PM
To: [EMAIL PROTECTED]
Subject: [PERFORM] 50 000 000 Table entries and I have to do a keyword search HELP NEEDED

So I have a table with about 50 million entries in it, and I have to do a keyword search. The keyword search is done on the title of the entry. For example, an entry could be "This is a title string which could be searched". I have tried a few ways to search but I get horrible search times. Some keywords will come up with matches as big as 0.25 million, but most are around 1000-5000. I use an index which narrows the table down to about 1.5-2 million entries. I used 2 tables which had a 1:1 correspondence. One held a GiST index which was on an int field which searched for the keyword.
Then I would join that table to another to retrieve the rest of the information about the items it matched. This was slow even for returning 100 entries: about 10 seconds, sometimes 5. But when I start getting 1xxx entries it's about 30-50 seconds, and beyond that it's just horrible. How should I set up my indexes and/or tables? We were thinking of putting the index inside one table so the join would not have to be done, but this still returns rather slow results. I have not fully tested this method, but it looks like when run for just the keyword search on the title, with no join, it can return in about 10 seconds or less. This is a great improvement, but I am currently going to make it all one table and see how long it will take. I believe it will not be much more, as there will be no join needed, only the returning of some attribute fields. This is still not the kind of time I would like to see; I wanted something around 2 seconds or less. I know there is a lot of information, especially if 0.25 million rows are to be returned, but if only 1xxx-9xxx rows are to be returned, I believe 2 seconds seems about right. How do search engines do it? Any suggestions are welcome, Thanks
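A minimal SQL sketch of the keyword-table approach Jeremy describes. The table and column names (`items`, `id`, `keyword`, `word`, `item_id`) are illustrative assumptions, not from the original posts, and the trigger that populates the side table is only indicated in a comment:

```sql
-- Side table: one small row per (word, source-row) pair.  Because the
-- rows are narrow, many fit per disk block, which is the point of the
-- technique.
CREATE TABLE keyword (
    word    text,
    item_id integer   -- primary key of the row in the base 'items' table
);
CREATE INDEX keyword_word_x ON keyword (word);

-- Populate 'keyword' from a trigger on the base table that fires on
-- insert/update/delete and splits the title field into words, as
-- Jeremy suggests; the trigger body is omitted from this sketch.

-- A keyword search then becomes an index scan on the small table,
-- joined back to the base table only for the matching rows:
SELECT i.*
FROM items i
JOIN keyword k ON k.item_id = i.id
WHERE k.word = 'mydoom';
```

The join touches only the rows the index scan selects, so the cost scales with the result-set size rather than with the 50-million-row base table.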
Re: [PERFORM] Indexing question
Stan Bielski [EMAIL PROTECTED] writes:
> Table "public.allflow_tv_mydoom"
>  Column | Type   | Modifiers
> --------+--------+-----------
>  tv_s   | bigint |
>           ^^^^^^
> Indexes: allflow_tv_mydoom_x btree (tv_s)
>
> standb=# explain select * from allflow_tv_mydoom where tv_s > 1074200099 and tv_s < 107506499;
> [ gives seqscan ]

This is a FAQ :-(. Unadorned integer constants are taken to be int4 not int8 (unless they are too large for int4), and cross-data-type comparisons are not indexable in existing releases. So you have to explicitly cast the comparison values to int8:

explain select * from allflow_tv_mydoom where tv_s > 1074200099::bigint and tv_s < 107506499::bigint;

(or use the standard CAST syntax if you prefer). 7.5 will have a fix for this ancient annoyance. BTW, is there a reason to be using tv_s+tv_us and not just a single timestamptz column?

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend
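A minimal sketch of the single-column alternative Tom mentions at the end, assuming a hypothetical replacement table (`allflow_ts`) and keeping only a couple of the original columns for brevity:

```sql
-- One timestamptz column replaces the (tv_s, tv_us) bigint pair;
-- timestamptz carries microsecond precision on its own.
CREATE TABLE allflow_ts (
    tv  timestamptz,
    src inet,
    dst inet
    -- ... remaining columns as in allflow_tv_mydoom
);
CREATE INDEX allflow_ts_x ON allflow_ts (tv);

-- A day's worth of data, with no int4-vs-int8 cast gotcha, since the
-- literal is unambiguously a timestamptz:
SELECT * FROM allflow_ts
WHERE tv >= '2004-01-15 00:00:00+00'
  AND tv <  '2004-01-16 00:00:00+00';
```

This also sidesteps the need to 'jump' between seconds values: a half-open range on one indexed column covers any time window directly.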