Re: [GENERAL] Full Text Search 101?
Hi Jonathan; > > I posted to this forum once before and was able to receive help. > Thanks again! > > I'm trying to implement full text search capabilities. Basically, I > have a very simple "data catalog" type of website (http:// > gis.drcog.org/datacatalog), where the user can type in a word or words > to search for records that match the criteria. I have a table with a > few fields, but I want to be able to search/index two fields -- the > "name" of the dataset and the "description." (or more, if I can index > fields from other tables too, that would be great). I'd like to be > able to use a full text search to rank the results in terms of > relevance. You can set up an index per table, but I've found that having a single vector table with bridge tables to the data I want to search works well. This would be in the style of: create table vectors ( id serial primary key, vector tsvector not null ); create table interesting_data ( id serial primary key, textual text not null ); CREATE TABLE vector_to_interesting ( v_id int not null references vectors(id), i_id int not null references interesting_data(id) ); And then SELECT id.* FROM interesting_data itd, vectors v, vector_to_interesting itv WHERE v.id = itv.v_id AND itv.i_id = itd.id AND v.vector @@ to_tsquery('searchpattern'); Allowing for some procedures around that that return multiple row types, to the client software. > > What is the best way to handle this? I've read through the PostgreSQL > documentation and don't quite understand it although I'm trying to > understand and am 'playing around' with this on a development server. > Is the default text search configuration enough? I've tried setting my > own configuration but get errors about not finding .dict dictionary > files? I have a default install of PostgreSQL 8.4. > > Also, I've created a tsvector column and created indexes, but it > didn't seem to delete stop words from the indexes. Should the stop > words be indexed? The default "english" configuration in PostgreSQL should have done this. Building your own configuration tends towards being a bit more advanced, and "english" should suit most needs. > > Also, I don't quite understand how to create indexes but rank certain > words as being more important than others, for instance, maybe having > words that come from the "name" column carrying more importance than > words coming from the "description" column. For part of this, there's ts_rank(tsvector, tsquery), as well as the relative weighting system in the textsearch modules. Unfortunately, I don't have much experience with the relative weighting, but, ts_rank() should get you partly there. I also found an interesting page on relative weighting @ http://www.postgresonline.com/special_feature.php?sf_name=postgresql83tsearch_cheatsheet&outputformat=html that might be useful for you. > Finally, I'd like > "watersheds" to come up when someone searches for "water" so I don't > think I have this configured properly, because this record is not > returned. Well, there's two ways to achieve this. The first is to manually the stopwords, which is getting into the custom configuration and a bit on the more advanced side. The other is that in PG8.4, you can do to_tsquery('water:*') which will tell the search parser to do a partial match, which would return "watershed", in this instance. My testing has shown it to be a bit slower (30ms, vs 15ms for a non-partial search), but not egregiously slow. > > Is there a good tutorial or maybe documentation that is a bit easier > to understand? I know my database is far from complicated but I can't > seem to find a configuration that works well. When I try ranking my > results, most of the results end up with the same rank, so something > must be wrong? > > Can someone point me in the right direction? > > Thanks for the help. I appreciate it. > Hope this helps. :) Regards, -- Aurynn Shaw The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 ext 103 PostgreSQL Replication, Consulting, Custom Development, 24x7 support as...@commandprompt.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Dump all objects under a specific schema
Hi; Hello List, Is there a way that I can only dump all objects under a specific schema? I'd like to dump all tables, views' definition and data under a specific schema. pg_dump --help says that you're looking for the -n flag, so, pg_dump -n -f output.sql Hope that helps, Aurynn Shaw The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 ext 103 PostgreSQL Replication, Consulting, Custom Development, 24x7 support [EMAIL PROTECTED] ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Mac OS X
I tried to install postgres onto my macbook via 'fink' and don't like it all that much. I decided to install from source, it's a fallback to my slackware days. But fink already created a user postgres and I can't seem to find anything to change it's configuration settings for shell, home director... If anyone has suggestions I would appreciate it. Also, does anyone know of a more current installation write-up for Mac other than what Apple provides? It's written around 7.4 and I'm not sure it's going to be optimal. Personally, I use the PG package available on http://www.entropy.ch/ software/macosx/postgresql/. It worked flawlessly, set up initdb, and configured PG to start at boot time. Thanks, Aurynn Shaw The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 ext 103 PostgreSQL Replication, Consulting, Custom Development, 24x7 support [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] collision in serial numbers after INSERT?
Hi; Thanks Aurynn, but then I have another question -- Even if I do a 'SELECT nextval('your_sequence');', how do I prevent an insert from happening between me selecting the next serial value and then actually inserting it? It seems like I should lock the table if i want to be certain. SELECT nextval('your_sequence') updates the sequence as well, so the next transaction that calls SELECT nextval('your_sequence') will get your_return_value + 1. Once you SELECT nextval('your_sequence'), no other call to nextval will get the sequence number you were just given, barring an act such as using setval(). The logic would be akin to: SELECT nextval('your_sequence'); -- any amount of stuff can happen here, including other transactions that alter the sequence INSERT INTO your_table (serial_field, data) VALUES (sequence_value_you_selected, 'some data'); You can read more about how sequences work http://www.postgresql.org/docs/current/static/functions-sequence.html Hope that helps, Aurynn. On 6/1/07, Aurynn Shaw <[EMAIL PROTECTED] > wrote: > Hello all - > > I'm working on a site with PHP and Postgres, coming from a MySQL > background. > > I was looking for an equivalent to the mysql_insert_id() function, and > a site recommended this: > > function postg_insert_id($tablename, $fieldname) > { > global connection_id; > $result=pg_exec($connection_id, "SELECT last_value FROM $ {tablename}_ > ${fieldname}_seq"); > $seq_array=pg_fetch_row($result, 0); > return $seq_array[0]; > } > > It relies on pg's sequencing ability. > > However, I wondered, if I were in an environment where there were many > concurrent inserts, would it be possible that I didn't get the serial > number of the insert that *I* just did? That if I do an insert, and > several inserts happen after mine, wouldn't I get the id of the latest > row, which is several inserts after mine? Everything that deals with sequences happens outside of transactions, so this could theoretically happen. The usual way to avoid this is to do: SELECT nextval('your_sequence'); Then do your insert with that in the serial field. Hope that helps, Aurynn Shaw The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 ext 103 PostgreSQL Replication, Consulting, Custom Development, 24x7 support [EMAIL PROTECTED] -- "Computers are useless. They can only give you answers" -- Pablo Picasso Aurynn Shaw The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 ext 103 PostgreSQL Replication, Consulting, Custom Development, 24x7 support [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] On-line / off-line trace of SQL statements presented to the Postgres SQL engine
We are using a three-tier application with J2EE, JBoss, Hibernate and a Postgres database. It would be a nice thing to monitor or trace the actual SQL statements processed by the DB. I do not really need the result set as I can get this - if required - using SQL against the DB through the Pgsql interface. You can configure Postgres to give you this information in the logfiles, or before you execute any queries on your connection. The configuration option in postgresql.conf is, I believe, log_statement. Alternatively, you can run SET log_statement to 'all'; on your JDBC connection before your application begins to query the database. http://www.postgresql.org/docs/8.2/static/runtime-config-logging.html Hope that helps. Aurynn Shaw The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 PostgreSQL Replication, Consulting, Custom Development, 24x7 support [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly