Re: [HACKERS] PostgreSQL mission statement?

2002-05-03 Thread Ron Chmara

Jim Mercer wrote: 
 On Thu, May 02, 2002 at 09:45:45PM -0400, mlw wrote:
  Jim Mercer wrote:
   On Thu, May 02, 2002 at 09:14:03PM -0400, mlw wrote:
    Jim Mercer wrote:
     On Thu, May 02, 2002 at 08:41:30PM -0400, mlw wrote:
      A mission statement is like a tie.
     who on the list wears ties?
    How many people who make IT decisions wear ties?
   too many.
  I'm sorry I started this thread.
 when i hear mission statement and quality circle and internal customer,
 i cringe.
 if the corporate management doesn't want to buy into the Open Source concept,
 fuck 'em.

[trench warfare snippage]

Let's see... open source philosophy applied *into* corporate-speak should
be doable:
1. If you have an itch, scratch it.
2. If you want to know what's going on, use the source, Luke!
3. More eyeballs = fewer bugs.
4. Software should be free (insert debates on speech, beer, use, licence
XYZ vs. ABC, etc., I'm not going to bother).

Hm...firing up my geekspeak-corporate BS translator. :-)

How about:
PostgreSQL creates a dynamic environment to ensure that all customers
can effectively create highly customized solutions specific to their needs. We
share and collaborate on both problems and solutions by making all
information about our products available. By using this open and exciting
environment, we increase the number of successful software releases
using advanced concepts of peer review and peer enhancement. We ensure our
ongoing enhancement and improvement through our community, because
our customers are also our creators.

Now, if you'll excuse me, I have to go wash my mouth out with soap.

-Ronabop




Re: [HACKERS] beta testing version

2000-12-02 Thread Ron Chmara

Thomas Lockhart wrote:
 
  PostgreSQL, Inc perhaps has that as a game plan.
  I'm not so much concerned about exactly what PG, Inc is planning to offer
  as a proprietary piece - I'm purist enough that I worry about what this
  signals for their future direction.
 Hmm. What has kept replication from happening in the past? It is a big
 job and difficult to do correctly.

Well, this has nothing whatsoever to do with open or closed source. Linux
and FreeBSD are much larger, much harder to do correctly, as they are supersets
of thousands of open source projects. Complexity is not relative to licensing.

  If PG, Inc starts doing proprietary chunks, and Great Bridge remains 100%
  dedicated to Open Source, I know who I'll want to succeed and prosper.
 Let me be clear: PostgreSQL Inc. is owned and controlled by people who
 have lived the Open Source philosophy, which is not typical of most
 companies in business today.

That's one of the reasons why it's worked... open source meant open
contribution, open collaboration, open bug fixing. The price of admission
was doing your own installs, service, support, and giving something back.

PG, I assume, is pretty much the same as most open source projects, massive
amounts of contribution shepherded by one or two individuals.

 We are eager to show how this can be done
 on a full time basis, not only as an avocation. And we are eager to do
 this as part of the community we have helped to build.
 As soon as you find a business model which does not require income, let
 me know. The .com'ers are trying it at the moment, and there seems to be
 a few flaws... ;)

Well, whether or not a product is open, or closed, has very little
to do with commercial success. Heck, the entire IBM PC spec was open, and
that certainly didn't hurt Dell, Compaq, etc. The genie coming out
of the bottle _only_ hurt IBM. In this case, however, the genie's been
out for quite a while.

BUT:
People don't buy a product because it's open; they buy it because it offers
significant value above and beyond what they can do *without* paying for
a product. Linus didn't start a new kernel out of some idealistic mantra
of freeing the world; he was broke and wanted a *nix-y OS. Years later,
the product has grown massively. Those who are profiting off of it are
unrelated to the code and to most of the developers... why is this?

As it is, any company trying to make a closed version of an open source
product has some _massive_ work to do. Manuals. Documentation. Sales.
Branding. Phone support lines. Legal departments/Lawsuit prevention. Figuring
out how to prevent open source from stealing the thunder by duplicating
features. And building a _product_.

Most Open Source projects are not products; they are merely code, and some
horrid documentation, and maybe some support. The companies making money
are not making better code, they are making better _products_.

And I really haven't seen much in the way of full-featured products, complete
with printed docs, 24-hour support, tutorials, wizards, templates, a company
to sue if the code causes damage, GUI install, setup, removal, etc. etc. etc.

Want to make money from open source? Well, you have to find, or build,
a _product_. Right now, there are no OS db products that can compare to, oh,
an Oracle product or an MSSQL product. There may be superior code, but that
doesn't make a difference in business. Business has very little to do
with building the perfect mousetrap, if nobody can easily use it.

-Bop
--
Brought to you from boop!, the dual boot Linux/Win95 Compaq Presario 1625
laptop, currently running RedHat 6.1. Your bopping may vary.



Re: [HACKERS] Re: [NOVICE] Re: re : PHP and persistent connections

2000-11-26 Thread Ron Chmara

Don Baccus wrote:
 At 12:07 AM 11/26/00 -0500, Alain Toussaint wrote:
 how about having a middle man between apache (or aolserver or any other
 clients...) and PostgreSQL ??
 that middleman could be configured to have 16 persistent connections, every
 client would deal with the middleman instead of going direct to the
 database, this would be an advantage where multiple PostgreSQL servers are
 used...
 Well, this is sort of what AOLserver does for you without any need for
 middlemen.

What if you have a server farm of 8 AOLservers, and 12 Perl clients, and
3 MS Access connections, leaving things open? Is AOLserver parsing the
Perl DBD/DBI connects, too? So you're using AOLserver as (cough) a
middleman? <g>

 Again, reading stuff like this makes me think "ugh!"
 This stuff is really pretty easy, it's amazing to me that the Apache/db
 world talks about such kludges when they're clearly not necessary.

How does AOLserver time out Access clients, ODBC connections, Perl
clients? I thought it was mainly web-server stuff.

Apache/PHP isn't the only problem. The problem isn't solved by
telling others to fix their software, either... is this something
that can be done _within_ postmaster?

-Bop

--
Brought to you from iBop the iMac, a MacOS, Win95, Win98, LinuxPPC machine,
which is currently in MacOS land.  Your bopping may vary.



[HACKERS] Re: [NOVICE] Re: re : PHP and persistent connections

2000-11-25 Thread Ron Chmara

Note: CC'd to Hackers, as this has wandered into deeper feature issues.

Tom Lane wrote:
 GH [EMAIL PROTECTED] writes:
  Do the "persistent-connected" Postgres backends ever timeout or die?
 No.  A backend will sit patiently for the client to send it another
 query or close the connection.

This does have an unfortunate denial-of-service implication, where
an attack can effectively suck up all available backends, and there's
no throttle, no timeout, and no way of automatically dropping these connections.

However, the more likely possibility is similar to the problem that
we see in PHP's persistent connections: a normally benign connection
is inactive, and yet it isn't dropped. If you have two of these created
every day, and you only have 16 backends, after 8 days you have a lockout.

On a busy web site or another busy application, you can, of course,
exhaust 64 backends in a matter of minutes.

  Is it possible to set something like a timeout for persistent connections?
  (Er, would that be something that someone would want
  to do? A Bad Thing?)
 This has been suggested before, but I don't think any of the core
 developers consider it a good idea.  Having the backend arbitrarily
 disconnect on an active client would be a Bad Thing for sure.

Right, but I don't think anybody has suggested disconnecting an *active*
client, just inactive ones.

 Hence,
 any workable timeout would have to be quite large (order of an
 hour, maybe? not milliseconds anyway). 

The MySQL disconnect starts at around 24 hours. It prevents a slow
accumulation of unused backends, but does nothing for a rapid
accumulation. It can be cranked down to a few minutes AFAIK.

 And that means that it's not
 an effective solution for the problem.  Under load, a webserver that
 wastes backend connections will run out of available backends long
 before a safe timeout would start to clean up after it.

Depends on how it's set up... you see, this isn't uncharted territory,
other web/db solutions have already fought with this issue. Much
like the number of backends set up for pgsql must be static, a timeout
may wind up being the same way. The critical thing to realize is
that you are timing out _inactive_ connections, not connections
in general. So provided that a connection provided information
about when it was last used, or usage set a counter somewhere, it
could easily be checked.

 To my mind, a client app that wants to use persistent connections
 has got to implement some form of connection pooling, so that it
 recycles idle connections back to a "pool" for allocation to task
 threads that want to make a new query.  And the threads have to release
 connections back to the pool as soon as they're done with a transaction.
 Actively releasing an idle connection is essential, rather than
 depending on a timeout.
 
 I haven't studied PHP at all, but from this conversation I gather that
 it's only halfway there...

Well... this is exactly how Apache and PHP serve pages. The
problem is that Apache children aren't threads; they are separate copies
of the application itself. So a single Apache child will re-use the
same connection, over and over again, and hand that connection over to
other requests hitting that Apache child... so in your above model, it's
not really one client application in the first place.

It's a dynamic number of client applications, between one and hundreds
or so.

So to turn the feature request the other way 'round:
"I have all sorts of client apps, connecting in different ways, to
my server. Some of the clients are leaving their connections open,
but unused. How can I prevent running out of backends, and boot
the inactive users off?"
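
For what it's worth, if the backend ever grew a per-connection activity view
(something along the lines of a pg_stat_activity) and a function to kick a
backend (a pg_terminate_backend(), say) -- neither of which exists today, so
both are assumptions -- the boot itself could be a one-liner:

-- Hypothetical: kick any backend that has been idle for more than an hour.
-- Both the view and the function here are assumed, not current features.
SELECT pg_terminate_backend(pid)
  FROM pg_stat_activity
 WHERE state = 'idle'
   AND state_change < now() - interval '1 hour';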

-Ronabop

--
Brought to you from iBop the iMac, a MacOS, Win95, Win98, LinuxPPC machine,
which is currently in MacOS land.  Your bopping may vary.



Re: [HACKERS] How to get around LIKE inefficiencies?

2000-11-05 Thread Ron Chmara

The Hermit Hacker wrote:
 I'm tryin to figure out how to speed up udmsearch when run under
 postgresql, and am being hit by atrocious performance when using a LIKE
 query ... the query looks like:
 SELECT ndict.url_id,ndict.intag
   FROM ndict,url
  WHERE ndict.word_id=1971739852
AND url.rec_id=ndict.url_id
AND (url.url LIKE 'http://www.postgresql.org/%');
 Take off the AND ( LIKE ) part of the query, finishes almost as soon as
 you hit return.  Put it back in, and you can go for coffee before it
 finishes ...

The entire *approach* is wrong. I'm currently in the process of optimizing
a db which is used for logfile mining, and it was originally built with the same
kludge... it seems to make sense when there are only a few thousand records,
but at 20 million records, yikes!

The problem is that there's a "like" operation for something that is
fundamentally static (http://www.postgresql.org/) with some varying
data *after it*, that you're not using, in any form, for this operation.
This can be solved one of two ways:

1. Preprocess your files to strip out the paths and arguments, putting
the server part in a new field. You are only setting up that data once,
so you shouldn't be using a "like" operator for every query. It's not
like on Monday the server is "http://www.postgresql.org/1221" and on
Tuesday the server is "http://www.postgresql.org/12111". It's always
the *same server*, so split that data out into its own column, with its own
index (sketched below).

This turns your query into:
SELECT ndict.url_id,ndict.intag
   FROM ndict,url
  WHERE ndict.word_id=1971739852
AND url.rec_id=ndict.url_id
AND url.server_url='http://www.postgresql.org/';
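
Schema-wise, that split might look something like this (the server_url
column, the index name, and the regexp are all made up for illustration;
adjust to the real udmsearch tables):

ALTER TABLE url ADD COLUMN server_url text;
UPDATE url SET server_url = substring(url from '^[a-z]+://[^/]*/');
CREATE INDEX url_server_url ON url (server_url);

Once that column is populated, the planner can use a plain index scan on
server_url instead of grinding through a LIKE on every row.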

2. Use a trigger to do the above, if you're doing on-the-fly inserts into
your db (so you can't pre-process).
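
A minimal sketch of such a trigger, in present-day PL/pgSQL, again assuming
the made-up server_url column from above:

CREATE OR REPLACE FUNCTION set_server_url() RETURNS trigger AS $$
BEGIN
    -- keep scheme://host/ and throw away the rest of the URL
    NEW.server_url := substring(NEW.url from '^[a-z]+://[^/]*/');
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER url_server_url_trig
    BEFORE INSERT OR UPDATE ON url
    FOR EACH ROW EXECUTE PROCEDURE set_server_url();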

-Ronabop

--
Brought to you from iBop the iMac, a MacOS, Win95, Win98, LinuxPPC machine,
which is currently in MacOS land.  Your bopping may vary.