[GENERAL] Multithreaded queue in PgSQL

2008-06-10 Thread Nikola Milutinovic
Hi all.

This may be trivial, but I cannot find good references for it. The problem is 
this:

Suppose we have one table in PgSQL which is a job queue, each row represents 
one job with several status flags, IDs,... Several processes will attempt to 
access the queue and take their batch of jobs, the batch will have some 
parameterizable size. So, the simple idea is select N lowest IDs that do not 
have a flag in process set and set the flag, then proceed with whatever it 
is that should be done.

Trouble is, with MVCC I don't see a way to prevent overlapping and race 
conditions. Oh, sure, if I issue select for update, it will lock rows, but, if 
I understand correctly, the change may not be instantaneous and atomic, so I 
might get transaction to roll back and then there is error handling that will 
lead to the uglies serialization I can think of. Let me clarify this, so 
somebody can tell me if I got it wrong.

Imagine Queue table with 20 rows, ID: 1,...,20, status=new. Imagine 2 
processes/threads (P1, P2) attempting to get 10 jobs each.How to do that?

P1: UPDATE job_queue SET process_id=$1, status=in process WHERE id IN (
SELECT id FROM job_queue WHERE status=new and id IN (   
SELECT id FROM job_queue WHERE status=new ORDER BY id LIMIT 10 FOR 
UPDATE)
)
)
P2: the same
P1: SELECT * FROM job_queue WHERE process_id=$1 
P2: SELECT * FROM job_queue WHERE process_id=$1 

The reason for the 2 selects is that if 2 or more processes content for the 
same set of jobs, the first one will set the status. The second will, after P1 
has released the rows get those rows, that are already taken. Of course, this 
will most likely return 0 rows for P2, since all 10 will be taken. If I leave 
out the LIMIT 10 in the inner select, I am effectively locking the entire 
table. Is that the way to go?

LOCK TABLE job_queue EXCLUSIVE;
UPDATE ...
UNLOCK TABLE job_queue;

Nix.



  

Re: [GENERAL] Multithreaded queue in PgSQL

2008-06-10 Thread Nikola Milutinovic
 You may find that the PGQ component of skytools is what you want:
  http://pgfoundry.org/projects/skytools
  http://skytools.projects.postgresql.org/doc/
  http://skytools.projects.postgresql.org/doc/pgq-sql.html


Thanks, we will look into it. Still, I am surprised to learn that SQL as such 
cannot handle it. I do realize that the question is not trivial. Would setting 
transaction isolation level to SERIALIZABLE help in any way? Would locking of 
the entire table help in any way?

Nix.



  

Re: [GENERAL] New MS patent: sounds like PG db rules

2008-05-29 Thread Nikola Milutinovic
Still, this sounds dangerous. It should be, even legally, WRONG to patent 
something that already exist and was not invented by the patentee. I know we 
can laugh off MS in court, but what about new DBs or project even built on PG 
that have this functionality? Software patents are a menace, I'm afraid. And 
this is still just one portion. IBM is also into this line of work.

Nix.


- Original Message 
From: Dave Page [EMAIL PROTECTED]
To: Justin Clift [EMAIL PROTECTED]
Cc: Jonathan Bond-Caron [EMAIL PROTECTED]; A. Kretschmer [EMAIL PROTECTED]; 
pgsql-general@postgresql.org
Sent: Tuesday, May 27, 2008 3:18:31 PM
Subject: Re: [GENERAL] New MS patent: sounds like PG db rules

HI Justin

On Tue, May 27, 2008 at 2:06 PM, Justin Clift [EMAIL PROTECTED] wrote:

 I'm trying to point out that - PG is a database system - and MS may have
 just been granted a patent for a fundamental part of it.

 Thinking it might need looking in to, and trying to bring it to the
 attention of some that can (or even cares?). ;

I don't think it's a major issue. Even if MS do think we infringe on
the patent it would be laughable for them to try to do anything about
it given that our rules implementation has provably existed in a
leading FOSS project for a decade or more.

-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



  

Re: [GENERAL] slow speeds after 2 million rows inserted

2006-12-31 Thread Nikola Milutinovic
 1. There is no difference (speed-wise) between committing every 1K or every 
 250K rows.

It was really some time ago, since I have experimented with this. My las 
experiment was on PG 7.2 or 7.3. I was inserting cca 800,000 rows. Inserting 
without transactions took 25 hrs. Inserting with 10,000 rows per transaction 
took about 2.5 hrs. So, the speedup was 10x. I have not experimented with the 
transaction batch size, but I suspect that 1,000 would not show much speedup.

 2. Vacuuming also makes no difference for a heavy insert-only table, only 
 slows it down.

Makes sense. Since my application was dumping all records each month and 
inserting new ones, vacuum was really needed, but no speedup.

 3. Table size plays no real factor.




The reason I saw speedup, must have to do with the fact that without 
transactions, each insert was it's own transaction. That was eating resources.

Nix.



__
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 

Re: [GENERAL] slow speeds after 2 million rows inserted

2006-12-29 Thread Nikola Milutinovic
 The fastest way will be copy.
 The second fastest will be multi value inserts in batches.. eg.;
 
 INSERT INTO data_archive values () () () (I don't knwo what the max is)

 but commit every 1000 inserts or so.

Is this some empirical value? Can someone give heuristics as to how to 
calculate the optimal number of transactions after which to commit? Or at least 
guidelines.

Nix.




__
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] How to secure PostgreSQL Data for distribute?

2005-08-18 Thread Nikola Milutinovic

Premsun Choltanwanich wrote:


Dear All,
 
 I need to distribute my application that use PostgreSQL as 
database to my customer. But I still have some questions in my mind on 
database security. I understand that everybody  who get my application 
database will be have a full control permission on my database in case 
that PostgreSQL already installed on their computer and they are an 
administrator on PostgreSQL. So that mean data, structure and any 
ideas contain in database will does not secure on this point. Is my 
understanding correct?
 
 What is the good way to make it all secure? Please advise.



If you want to keep your data/schema secure, you have to keep it at 
your own place and that is just the first step in making it secure. 
Espionage, and that is what basically bothers you, is a complex field of 
enterprize and there is no one size fits them all.


Your first step is keeping the DB at your site and letting your 
customers connect to it. This may present a huge problem, but there is 
no other way. No encryption would work, trust me.


Nix.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[GENERAL] PostgreSQL slow after VACUUM

2004-11-25 Thread Nikola Milutinovic
Hi all.
I have the following:
- Mandrake Linux 9.1
- PostgreSQL 7.3.2 MDK5
There is one DB and one DB user. The DB is cleared and loaded with the 
data of same volume each month (monthly report). The volume is not small 
and it usually takes 3 hours to load. Loading is done with SQL files 
which use transactions, 10,000 SQL statements per transaction.

A couple of days ago, disk became full, since we were not doing VACUUM 
on the DB at all. So, I deleted all records from the 3 tables the DB has 
and performed VACUUM FULL ANALYZE. This reclaimed the space.

My problem is that the load is now taking (to my estimate) 20 times more!
Anything I could do to find out what's going on? There is nothing in the 
logs that I can see.

Nix.
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


[GENERAL] PgSQL 8.0.0 - contributed: problem compiling

2004-08-15 Thread Nikola Milutinovic
Hi all.
I've managed to get 8.0.0 up and running. Now I'm compiling some 
contributed software and I hit a show stopper. Contrib - DBSize

Papa-Legba:/home/r/root/5.1/postgresql-8.0.0beta1/contrib/dbsize# gmake 
; gmake install
sed 's,MODULE_PATHNAME,$libdir/dbsize,g' dbsize.sql.in dbsize.sql
cc -std -ieee -pthread -O3 -c99  -I. -I../../src/include -pthread   -c 
-o dbsize.o dbsize.c
cc: Error: dbsize.c, line 58: In this statement, GetDatabasePath 
expects 2 arguments, but 1 are supplied. (toofewargs)
   dbpath = GetDatabasePath(dbid);
-^
gmake: *** [dbsize.o] Error 1
cc -std -ieee -pthread -O3 -c99  -I. -I../../src/include -pthread   -c 
-o dbsize.o dbsize.c
cc: Error: dbsize.c, line 58: In this statement, GetDatabasePath 
expects 2 arguments, but 1 are supplied. (toofewargs)
   dbpath = GetDatabasePath(dbid);
-^
gmake: *** [dbsize.o] Error 1

Inspecting the ./src/include/catalog/catalog.h I can see that it takes 
two arguments:

extern char *GetDatabasePath(Oid dbNode, Oid spcNode);
The problematic code segment is:
Datum database_size(PG_FUNCTION_ARGS)
{
   Namedbname = PG_GETARG_NAME(0);
   Oid dbid;
   char   *dbpath;
   DIR*dirdesc;
   struct dirent *direntry;
   int64   totalsize;
   dbid = get_database_oid(NameStr(*dbname));
   if (!OidIsValid(dbid))
   ereport(ERROR, (errcode(ERRCODE_UNDEFINED_DATABASE),
   errmsg(database \%s\ does not exist, 
NameStr(*dbname;

   dbpath = GetDatabasePath(dbid);
What should be the change in this code segment? libPQ docs are not 
informative...

Nix.
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


[GENERAL] PgSQL 8.0.0 beta1 compile problem + patch

2004-08-12 Thread Nikola Milutinovic
Hi all.
Ran into a minor showstopper. Sometimes, my CC is just too much of a 
nitpick, but on this matter, I agree with it. ./src/timezone/zic.c has 
a mismatch in declaration and definition of memcheck(...) function. My 
CC beltched on it (I was running it in strict ANSI mode). The diference 
is this:

static char *memcheck(char *tocheck);
...
/*
 * Memory allocation.
 */
static char *
memcheck(ptr)
char   *const ptr;
{
if (ptr == NULL)
{
const char *e = strerror(errno);
(void) fprintf(stderr, _(%s: Memory exhausted: %s\n),
   progname, e);
(void) exit(EXIT_FAILURE);
}
return ptr;
}
Notice char *const ptr. Trusting that the lower definition is correct, 
being more strict and closer to the actual code, I have changed the uper 
declaration to match. I have attached a patch.

Nix.
*** src/timezone/zic.c.orig Thu Aug 12 12:44:44 2004
--- src/timezone/zic.c  Thu Aug 12 12:45:05 2004
***
*** 134,140 
  static intitsabbr(const char *abbr, const char *word);
  static intitsdir(const char *name);
  static intlowerit(int c);
! static char *memcheck(char *tocheck);
  static intmkdirs(char *filename);
  static void newabbr(const char *abbr);
  static long oadd(long t1, long t2);
--- 134,140 
  static intitsabbr(const char *abbr, const char *word);
  static intitsdir(const char *name);
  static intlowerit(int c);
! static char *memcheck(char *const tocheck);
  static intmkdirs(char *filename);
  static void newabbr(const char *abbr);
  static long oadd(long t1, long t2);

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [GENERAL] PgSQL 7.4.2 - NaN on Tru64 UNIX - CORRECTION!!!

2004-05-19 Thread Nikola Milutinovic
Tom Lane wrote:
Nikola Milutinovic [EMAIL PROTECTED] writes:
[ about NaN on Tru64 ]
This compiles on Tru64 4.0D (the compiler swallows it), but fails on 
Tru64 UNIX 5.1B. Both basic CC and DTK Compaq CC break on that file 
complaining on that constant evaluation. The best way to solve it is to 
use system definition of Infinity Constants:
...
+ #define NAN   DBL_INFINITY

Current CVS tip will probably fail with this, because we expect the
platform to distinguish between NaN and Infinity.  Could you retry your
experiments with a recent snapshot and let us know what seems best now?
Appologies to all, I have been blind!
Disregard my proposed patches. One of users on Tru64 mailing list 
pointed me to a correct solution - add -ieee to CC flags. I took a 
look at templates for OSF1 in PgSQL and they read:

CFLAGS=-O -ieee
Since I was putting my own definition for CFLAGS, I left out -ieee. I 
have reverted ./src/include/port/osf1.h to the the original and changed 
CFLAGS in ./src/Makefile.global to include -ieee. Then I rebuild 
./src/backend/utils/adt/float.o (I'll rebuild the whole PostgreSQL in a 
moment) and ran regression tests.

All 93 tests passed.
So, again, my apologies to all, no change neccesary.
Nix.
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] PgSQL 7.4.2 - NaN on Tru64 UNIX

2004-05-17 Thread Nikola Milutinovic
Nikola Milutinovic wrote:
+ #define NAN   DBL_INFINITY
The compilation has went smoothly, I'll try to run regression tests. 
Will let you know if something fails.
Hate to reply to myself, but here goes.
With NAN defined as DBL_INFINITY I get 3 failed regression test, most 
notably float8 test.

With NAN defined as DBL_QNAN (double float Quiet NaN) I get just one, 
float8.

With NAN defined as DBL_SNAN (double float Signalling NaN) I get one 
error, float8.

The problem is the same in all three cases, an expression that is 
supposed to cause overflow, does so, but the output error text is 
slightly different. Here is the DIFF file:

*** ./expected/float8.out   Thu Sep 25 08:58:06 2003
--- ./results/float8.outMon May 17 08:37:51 2004
***
*** 247,253 
 SET f1 = FLOAT8_TBL.f1 * '-1'
 WHERE FLOAT8_TBL.f1  '0.0';
  SELECT '' AS bad, f.f1 * '1e200' from FLOAT8_TBL f;
! ERROR:  type double precision value out of range: overflow
  SELECT '' AS bad, f.f1 ^ '1e200' from FLOAT8_TBL f;
  ERROR:  result is out of range
  SELECT '' AS bad, ln(f.f1) from FLOAT8_TBL f where f.f1 = '0.0' ;
--- 247,254 
 SET f1 = FLOAT8_TBL.f1 * '-1'
 WHERE FLOAT8_TBL.f1  '0.0';
  SELECT '' AS bad, f.f1 * '1e200' from FLOAT8_TBL f;
! ERROR:  floating-point exception
! DETAIL:  An invalid floating-point operation was signaled. This 
probably means an out-of-range result or an invalid operation, suc
h as division by zero.
  SELECT '' AS bad, f.f1 ^ '1e200' from FLOAT8_TBL f;
  ERROR:  result is out of range
  SELECT '' AS bad, ln(f.f1) from FLOAT8_TBL f where f.f1 = '0.0' ;
***
*** 270,276 

  -- test for over- and underflow
  INSERT INTO FLOAT8_TBL(f1) VALUES ('10e400');
! ERROR:  10e400 is out of range for type double precision
  INSERT INTO FLOAT8_TBL(f1) VALUES ('-10e400');
  ERROR:  -10e400 is out of range for type double precision
  INSERT INTO FLOAT8_TBL(f1) VALUES ('10e-400');
--- 271,277 
  -- test for over- and underflow
  INSERT INTO FLOAT8_TBL(f1) VALUES ('10e400');
! ERROR:  invalid input syntax for type double precision: 10e400
  INSERT INTO FLOAT8_TBL(f1) VALUES ('-10e400');
  ERROR:  -10e400 is out of range for type double precision
  INSERT INTO FLOAT8_TBL(f1) VALUES ('10e-400');
==
Any comments?
Nix.
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[GENERAL] BUG: compiling PL/Python - Semi-SOLVED

2004-01-02 Thread Nikola Milutinovic



Hi all.

OS: Tru64 UNIX 4.0d
PG: PostgreSQL v7.4.1
PY: Python v2.3.3

I just ran into a minor bug while compiling 
PL/Python module. It bombed out on lines153 and 160 of 
./src/pl/plpython/plpython.c complaining on incomplete type 
"PyObject_HEAD".

The source of the problem were these two 
typedefs:

typedef struct 
PLyPlanObject{ 
PyObject_HEAD; 
void 
*plan; 
/* return of an SPI_saveplan */ 
int 
nargs; 
Oid 
*types; 
Datum 
*values; PLyTypeInfo 
*args;} PLyPlanObject;

typedef struct 
PLyResultObject{ 
PyObject_HEAD; /* HeapTuple 
*tuples; */ PyObject 
*nrows; 
/* number of rows returned by query 
*/ PyObject 
*rows; 
/* data rows, or None if no data returned 
*/ PyObject 
*status; 
/* query status, SPI_OK_*, or SPI_ERR_* 
*/} PLyResultObject;
The problem is in ";" following "PyObject_HEAD". 
"PyObject_HEAD" is a macro defined as:

#define 
PyObject_HEAD 
\ 
_PyObject_HEAD_EXTRA 
\ int 
ob_refcnt; 
\ struct _typeobject 
*ob_type;
So, that in turn expanded to this:

typedef struct PLyResultObject{
 int 
ob_refcnt ; struct _typeobject * ob_type ; ;
 
PyObject *nrows; 
PyObject *rows; 
PyObject *status;} 
PLyResultObject;

The problem is in that extra semicolon - it is an 
excess and when removed everything compiles OK. Using this in a regular local 
variable declaration will not cause problems, since an extra ";" will just be 
swallowed as an NoOp.

Nix.


[GENERAL] OT: HEADS-UP: viral storm out there

2003-09-19 Thread Nikola Milutinovic



Hi all.

This is off topic and is a cross-post, so I'll be 
brief. There is a very nasty virus out there and I urge everybody to get their 
AV in order. The virus is known as: "W32.Gibe-F" 
or "W32.Swen-A".

Yesterday, I gotcca. 200 viral 
messages.

Today, it's about 800 viral messages!

I suspect that a lot of viral traffic directed to 
me is coming from users on one of the lists I'm crossposting to. Check 
yourselves.

Nix.


Re: [GENERAL] Dreamweaver

2003-08-16 Thread Nikola Milutinovic
 This may have been better posted in the advocacy list but I am not a
 subscriber.

 We recently purchased Dreamweaver MX and I was a little surprised to
 find that one of its pre-defined scripting systems is PHP + MySQL. I
 haven't done much exploring of what is actually offered but wondered
 whether any one had tried to convince Macromedia to provide a PHP +
 PostgreSQL or at least a PHP + anydb.

PHP has a shortcoming of NOT having a unified interface to any DB. MySQL
uses one set of function calls, while PgSQL another. Yes, of course they
could add support for another (very popular) DB. But I'd rather feel out
their support for JSP/JDBC, which should be in :-)

 If the support for MySQL is in the form of an Extension, is there a
 PostgreSQL equivalent as I have been contacted by one of our clients who
 has said, Do we povide MySQL ?, we dont. Oh! I am teaching myself PHP
 + MySQL using Dreamweaver and I wanted you to host the site :-(

The only unified interface offered by PHP, that I know of, is ODBC. I know
there are build options to fuse some Java support in, but that is, IMHO, a
waste of time and effort - not to mention resources.

Nix.


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[GENERAL] PostgreSQL problem with functions

2001-06-12 Thread Nikola Milutinovic

Hi all.

Maybe this is not such a novice question, but I'm having problem subscribin to some 
more professional PG lists. Here goes...

I'm trying to make a function in PostgreSQL v7.0. Right now, I'm bugging
with PL/PgSQL and SQL functions.

What I want to achieve is: insert new row in a table with a possibility of
concurent use. Concurent use means that several processes (Apache PHP4)
can call this function simultaneously.

The logical steps, as I see it, are:

1. TRANSACTION start
2. LOCK table
3. GET max(id)+1
4. INSERT new row with primary key from step 2
5. TRANSACTION commit

For this I would like the functionality of PL/PgSQL. I would like it to
return the new_id of the inserted row.

This is what I had in mind.



CREATE FUNCTION start_session_pl( int4, VARCHAR(40), VARCHAR(50) ) RETURNS
int4 AS '
DECLARE
  a_id  ALIAS FOR $1;
  a_ss  ALIAS FOR $2;
  a_ip  ALIAS FOR $3;
  curr_time datetime;
  new_idint4;
BEGIN
  curr_time := ''now'';
  IF (SELECT id FROM a_user WHERE id=a_id) ISNULL THEN
RAISE EXCEPTION ''No such ID in admins'';
  END IF;
  BEGIN TRANSACTION;
LOCK TABLE admin_session IN EXCLUSIVE MODE;
new_id := (SELECT max(id)+1 FROM admin_session);
IF new_id ISNULL THEN
  new_id := 1;
END IF;
INSERT INTO admin_session VALUES (new_id, a_ss, curr_time, a_id, a_ip);
  COMMIT TRANSACTION;
  RETURN new_id;
END;
' LANGUAGE 'plpgsql';


PROBLEM 1
--

According to docs, PL/PgSQL has no support for transactions! And, yes it
beltches on any BEGIN TRANSACTION or any such.

However, it doesn't complain on LOCK TABLE. Am I locking it or not? And
what is the lifetime of that lock?

OK, so I though lets write a wrapper function in ordinary SQL, lock table
and call the real function.


CREATE FUNCTION start_session( int4, VARCHAR(40), VARCHAR(50) ) RETURNS int4
AS '
  BEGIN TRANSACTION;
  LOCK TABLE admin_session IN EXCLUSIVE MODE;
  SELECT start_session_pl( $1, $2, $3 );
  COMMIT TRANSACTION;
' LANGUAGE 'sql';


PROBLEM 2
--

I'm having problems creating this SQL function. PSQL complains that the
return type is mismatch. More precisely:

ERROR:  return type mismatch in function decl: final query is a catalog
utility

When I put SELECT 1; at the end, the function can be created. So, a more
general SELECT is treated as a catalog utility, while a SELECT with a
determined type is treated as that type. I have tried explicit conversion to
int4, but no go.

What can I do?

Nix.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster