Re: [SQL] Comparing tsearch2 vectors.
O kyrios Rajesh Kumar Mallah egrapse stis Jul 13, 2004 :
> Achilleus Mantzios wrote:
>
> >O kyrios Rajesh Kumar Mallah egrapse stis Jul 12, 2004 :
> >
> >
> >
> >>Achilleus Mantzios wrote:
> >>
> >>
> >>
> >>>O kyrios Rajesh Kumar Mallah egrapse stis Jul 12, 2004 :
> >>>
> >>>
> >>>
> >>>
> >>>
> Dear Mantzios,
>
> I have to get set of banners from database in
> response to a search term. I want that the search term
> be compared to the keyword corresponding to the
> banners stored in database. current i am doing an
> equality match but i woild like to do it after stemming
> both the sides (serch term and keywords).
>
>
>
>
> >>>You could transform your search terms so that there is the "&"
> >>>separator between them. (& stands for "AND").
> >>>E.g. "handicrafts exporter" becomes "handicrafts&exporter"
> >>>And then
> >>>select * from where idxfti @@ to_tsquery();
> >>>
> >>>
> >>>
> >>>
> >>But i do not want 'handicraft exporters of delhi' to pop out if i search
> >>for 'handicrafts exporters' whereas
> >>
> >>SELECT to_tsvector('handycrafts exporters of delhi') @@
> >>to_tsquery('handycraft&exporting');
> >>
> >>will be true.
> >>
> >>
> >
> >Define what you want, and then read tsearch2 userguide.
> >I'm sure you'll find your way :)
> >
> >
> The requirement is different than full text search.
> I am not searching a word in a collection of words (text)
> rather comparing two strings after all the words in those
> strings are stemmed. Hope my requirement is clear now.
Ok, so we drop back to the initial assumption.
Tokenize both strings into an array of strings.
Let them be String[] string1,String[] string2
If arrays are not of same length then they are not equal.
Otherwise for each i in string1 compare
lexize(,string1[i]) against
lexize(,string2[i])
The tokenization is your job, while the lexize function comes with
tsearch2.
I dont know if its possible to be done in sql, since it requires some sort
of iteration.
>
>
> Regds
> mallah.
>
>
>
>
> >
> >
> >>Regds
> >>Mallah.
> >>
> >>
> >>
> >>
> >>
> >>>where idxfti is your tsvector column.
> >>>
> >>>E.g.
> >>># SELECT to_tsvector('handycrafts exporters') @@
> >>>to_tsquery('handycraft&exporting');
> >>>?column?
> >>>--
> >>>t
> >>>(1 row)
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> So that the banners for the adword say 'incense exporter' is
> shown even if 'incenses exporter' or 'incense exporters' is
> searched.
>
> I hope i am able to clarify.
>
> Regds
> Mallah.
>
> Achilleus Mantzios wrote:
>
>
>
>
>
> >O kyrios Rajesh Kumar Mallah egrapse stis Jul 12, 2004 :
> >
> >
> >
> >
> >
> >
> >
> >>Hi,
> >>
> >>We want to compare strings after stemming. Can anyone
> >>tell me what is the best method. I was thinking to compare
> >>the tsvector ,but there is no operator for that.
> >>
> >>
> >>
> >>
> >>
> >>
> >I'd tokenize each string and then apply lexize() to get the
> >equivalent stemified
> >word, but what exactly are you trying to accomplish?
> >
> >
> >
> >
> >
> >
> >
> >>Regds
> >>Mallah.
> >>
> >>
> >>
> >>tradein_clients=# SELECT to_tsvector('handicraft exporters');
> >>+---+
> >>|to_tsvector|
> >>+---+
> >>| 'export':2 'handicraft':1 |
> >>+---+
> >>(1 row)
> >>
> >>Time: 710.315 ms
> >>tradein_clients=#
> >>tradein_clients=# SELECT to_tsvector('handicrafts exporter');
> >>+---+
> >>|to_tsvector|
> >>+---+
> >>| 'export':2 'handicraft':1 |
> >>+---+
> >>(1 row)
> >>
> >>Time: 400.679 ms
> >>tradein_clients=# SELECT to_tsvector('Hi there') = to_tsvector('Hi there');
> >>ERROR: operator does not exist: tsvector = tsvector
> >>HINT: No operator matches the given name and argument type(s). You may
> >>need to add explicit type casts.
> >>tradein_clients=#
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >
> >
> >
> >
>
>
>
>
> >>>
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >
> >
> >
>
>
>
--
-Achilleus
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [SQL] [PHP] Secure DB Systems - How to
> Keeping the system administrator from seeing the data while making it
> searchable is difficult. To do this you need to encrypt the data on
> the client side using a key the client has (and this key has to be
> protected from loss) and the only searches you can do are equality
> searches using a hash or encrypted value.
You can also perform regex searches.
Here is an example to get you started:
CREATE TABLE crypto (
id SERIAL PRIMARY KEY,
title VARCHAR(50),
crypted_contentBYTEA
);
INSERT INTO crypto VALUES
(1,'test1',encrypt_iv('daniel','fooz',decode('9MlPeZtpuxKo5m4O4+pd4g==','base64'),'aes'));
INSERT INTO crypto VALUES
(2,'test2',encrypt_iv('test','fooz',decode('9MlPeZtpuxKo5m4O4+pd4g==','base64'),'aes'));
INSERT INTO crypto VALUES
(3,'test3',encrypt_iv('struck','fooz',decode('9MlPeZtpuxKo5m4O4+pd4g==','base64'),'aes'));
SELECT *,decrypt_iv(crypted_content,
'fooz',decode('9MlPeZtpuxKo5m4O4+pd4g==','base64'), 'aes') FROM crypto;
-- equality search
SELECT
*,decrypt_iv(crypted_content,'fooz',decode('9MlPeZtpuxKo5m4O4+pd4g==','base64'),'aes')
FROM crypto WHERE decrypt_iv(crypted_content,
'fooz',decode('9MlPeZtpuxKo5m4O4+pd4g==','base64'), 'aes')='struck';
-- regex search
SELECT
*,decrypt_iv(crypted_content,'fooz',decode('9MlPeZtpuxKo5m4O4+pd4g==','base64'),'aes')
FROM crypto WHERE encode(decrypt_iv(crypted_content,
'fooz',decode('9MlPeZtpuxKo5m4O4+pd4g==','base64'), 'aes'),'escape')
~* 'daniel';
"fooz" is the password and "9MlPeZtpuxKo5m4O4+pd4g==" is the IV (initialization
vector) stored in base64 format. I choose base64 because it is more convenient to
create queries with it.
In the real database I do use a different IV for every row, so I do also store the IV
with the row.
In my case I do generate the IV by PHP with /dev/urandom as a random source.
Greetings,
Daniel Struck
--
Retrovirology Laboratory Luxembourg
Centre Hospitalier de Luxembourg
4, rue E. Barblé
L-1210 Luxembourg
phone: +352-44116105
fax: +352-44116113
web: http://www.retrovirology.lu
e-mail: [EMAIL PROTECTED]
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match
Re: [SQL] [PHP] Secure DB Systems - How to
On Tue, Jul 13, 2004 at 11:35:57 +0200, Daniel Struck <[EMAIL PROTECTED]> wrote: > > Keeping the system administrator from seeing the data while making it > > searchable is difficult. To do this you need to encrypt the data on > > the client side using a key the client has (and this key has to be > > protected from loss) and the only searches you can do are equality > > searches using a hash or encrypted value. > > You can also perform regex searches. If you decrypt the data on the database, the sysadmin can see it. If you are willing to take that chance (e.g. if you primary concern is some third party getting a snapshot of the DB), then you can do lots of things. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [SQL] [PHP] Secure DB Systems - How to
> If you decrypt the data on the database, the sysadmin can see it. Hm, you are right. If one does decrypt the data on the database you have to sent the password to postgresql and so a administrator of the database could easily grasb the password. So the only way to go, would be to perform en/decryption on the client side? > If you are willing to take that chance (e.g. if you primary concern is > some third party getting a snapshot of the DB), then you can do lots of > things. I wonder now; if somebody could achieve to get a snapshot of the database, they could also be able to get the log-file of postgresql. So one would also have to make attention that the information like sql statements don't leak that way. Are there other places where this kind of information could leak? Greetings, Daniel Struck -- Retrovirology Laboratory Luxembourg Centre Hospitalier de Luxembourg 4, rue E. Barblé L-1210 Luxembourg phone: +352-44116105 fax: +352-44116113 web: http://www.retrovirology.lu e-mail: [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [SQL] Constraint->function dependency and dump in 7.3
Thanks Rod. Checked, and So do you say, this problem persists in dbs dumped from 7.4 to 7.4 too? i.e. upgrading to 7.4 (which we tested for quite some time now) won't help? (...) trying dump confirmed this :( Even tried adding a line to pg_depend but didn't seem to change anything. G. %--- cut here ---% \end - Original Message - From: "Rod Taylor" <[EMAIL PROTECTED]> Sent: Monday, July 12, 2004 2:51 PM Subject: Re: [SQL] Constraint->function dependency and dump in 7.3 > On Mon, 2004-07-12 at 08:29, SZŰCS Gábor wrote: > > Dear Fellow Countymen, > > > > I fear none of your answers are acceptable for me. Also, from Csaba's > > answer, this sounds to me more like a bug in pg_dump v7.3. > > It is a bug (or a missing feature) that should be solved with v7.5 for > restoring to 7.5 or later database. > > Until then, you'll need to re-order the dump by hand (splitting data > from schema can be useful). > > ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [SQL] Constraint->function dependency and dump in 7.3
> Checked, and So do you say, this problem persists in dbs dumped from 7.4 to > 7.4 too? i.e. upgrading to 7.4 (which we tested for quite some time now) > won't help? There may have been some minor fiddling to make it easier, but I wouldn't call it fixed by any means. > trying dump confirmed this :( Even tried adding a line to pg_depend but > didn't seem to change anything. The 7.5 version of pg_dump will be the first one to use pg_depend to attempt to get the object order correct. I would hold off on an upgrade until October (or so) and goto 7.5 instead. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [SQL] Constraint->function dependency and dump in 7.3
Dear Rod, Thanks. It'll be a pain to have two versions between the prod and devel servers, but I'll forward this info to the chief. Thanks again, G. %--- cut here ---% \end - Original Message - From: "Rod Taylor" <[EMAIL PROTECTED]> Sent: Tuesday, July 13, 2004 7:20 PM > The 7.5 version of pg_dump will be the first one to use pg_depend to > attempt to get the object order correct. > > I would hold off on an upgrade until October (or so) and goto 7.5 > instead. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
[SQL] Certain COPY's ignored from dump?
Dear Gurus, Dumping from 7.3 to 7.4 seems to omit some of the COPY commands, without any warning or error. The table is created, there are notices about indexes in the log, but COPY doesn't seem to happen-- the table remains empty at the end of the dump. As far as we discovered, these are a couple of tables with <10 rows. Not sure, but it's very likely that until recently this problem didn't show up, and that makes me suspect a database or configuration change that may be behind it, tho I'm quite sure we didn't mess with 7.4's config for quite a long time. Any ideas? Checked in "Migration to version 7.4", but none of that seems to apply. TIA, G. %--- cut here ---% \end ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [SQL] Constraint->function dependency and dump in 7.3
On Tue, 2004-07-13 at 13:42, SZŰCS Gábor wrote: > Dear Rod, > > Thanks. It'll be a pain to have two versions between the prod and devel > servers, but I'll forward this info to the chief. You can make this part easier on yourself. Dump the structure from production and migrate it to devel (fix the dump file). Keep this file. >From now on when applying changes to production, keep the structural changes applied as a separate SQL file (numbers work well). When building a new box to duplicate production: 1. Apply all patches in order for i in `ls *.sql` ; do cat $i | psql test_db ; done 2. Do a data dump of production and restore that to the testing area pg_dump --data-only prod_db | psql test_db Now you don't need to worry about whether the production schema will or will not dump properly, since you can reproduce that using external files. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [SQL] Constraint->function dependency and dump in 7.3
Dear Rod, That sounds as good as simple ;) As for as patching, we do something like that. Developed a PHP script that compares schema files (not dumps, but source codes instead) to the actual. Say, it creates a temp table and compares it to the existing one, examining pg_attributes, pg_indexes, pg_constraints etc. then runs appropriate ALTER TABLE commands on the old table. Creating a clean double of the prod and keeping its schema sounds good to handle exceptional things. Yours, G. %--- cut here ---% \end - Original Message - From: "Rod Taylor" <[EMAIL PROTECTED]> Sent: Tuesday, July 13, 2004 8:03 PM > When building a new box to duplicate production: > 1. Apply all patches in order > for i in `ls *.sql` ; do cat $i | psql test_db ; done > 2. Do a data dump of production and restore that to the testing area > pg_dump --data-only prod_db | psql test_db > > Now you don't need to worry about whether the production schema will or > will not dump properly, since you can reproduce that using external > files. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[SQL] How do I convice postgres to use an index?
I apologize for the following stupid question. I have been doing some
searching and haven't found anything really helpful.
The problem is that postgres (7.4.2) keeps choosing to do a sequential
scan on a table when an index scan would be significantly faster.
The queries that I'm using look at daily statistics from events logged
by our Checkpoint firewall and generate graphs. Since they are bit
complicated, I simplified it to "select count(*) from log where
timestamp>='7/12/2004'" for testing.
The table looks like this:
Column|Type | Modifiers
--+-+---
loc | integer |
src | inet|
dst | inet|
interface| character varying(10) |
direction| character varying(8)|
proto| character varying(4)|
service | integer |
icmp_code| integer |
sport| integer |
timestamp| timestamp without time zone |
rule | character varying(8)|
message_info | text|
action | character varying(16) |
icmp_type| integer |
orig | inet|
Indexes:
"log_dst_key" btree (dst)
"log_src_key" btree (src)
"log_timestamp_key" btree ("timestamp")
To test, I started with vacuum analyze. (My table has approximately
5.8M rows.)
fw1=# select count(*) from log where timestamp>='7/12/2004';
count
246763
(1 row)
Time: 161199.955 ms
fw1=# set enable_seqscan='off';
SET
Time: 47.662 ms
fw1=# select count(*) from log where timestamp>='7/12/2004';
count
247149
(1 row)
Time: 12428.210 ms
Notice the execution time differences.
The query plan before turning enable_seqscan off looks like this:
Aggregate (cost=208963.26..208963.26 rows=1 width=0)
-> Seq Scan on log (cost=0.00..208380.89 rows=232948 width=0)
Filter: ("timestamp" >= '2004-07-12 00:00:00'::timestamp
without time zone)
Any suggestions?
Thanks,
-Vic
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[SQL] schemas
hi, can anyone tell me which is the earliest version of postgres to support schemas -- regards kg http://www.onlineindianhotels.net - hotel bookings reservations in over 4600 hotels in India http://www.ootygolfclub.org ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [SQL] schemas
On Wed, Jul 14, 2004 at 07:40:49 +0530, Kenneth Gonsalves <[EMAIL PROTECTED]> wrote: > hi, > can anyone tell me which is the earliest version of postgres to support > schemas 7.3 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
