Re: [HACKERS] VLDB Features

2007-12-20 Thread Josh Berkus
Tom,

> Sure ... but you'll find that it's not large enough to be useful.
> Once you remove all the interesting consistency checks such as
> unique indexes and foreign keys, the COPY will tend to go through
> just fine, and then you're still stuck trying to weed out bad data
> without very good tools for it.  The only errors we could really
> separate out without subtransaction fencing are extremely trivial
> ones like too many or too few fields on a line ... which can be
> caught with a sed script.

Speaking as someone who did a LOT of DW load design only a couple of years ago, 
I'll say that the "special case" of no triggers, no constraint checks except 
length, and type-safety checks actually constitutes about 50% of DW bulk 
loading.  The only exception to that is unique indexes, which would normally 
be included and would be the difficult thing.

Also, "special case bulk loading" would in fact give users of other types of 
applications a lot more flexibility -- they could always load into a holding 
table just to clean up the type safety issues and then merge into the real 
table.  

So I don't agree that the "load into new partition without dependencies" is 
too much of a special case to be worth pursuing.  It might be a bad idea for 
other reasons, but not because it's too obscure.

--Josh




Re: [HACKERS] VLDB Features

2007-12-18 Thread Decibel!

On Dec 12, 2007, at 1:26 PM, Markus Schiltknecht wrote:

> Josh Berkus wrote:
>> Sure.  Imagine you have a 5TB database on a machine with 8 cores
>> and only one concurrent user.  You'd like to have 1 core doing I/O,
>> and say 4-5 cores dividing the scan and join processing into
>> 4-5 chunks.
>
> Ah, right, thanks for the enlightenment. Heck, I'm definitely too
> focused on replication and distributed databases :-)
>
> However, there's certainly a great deal of intersection between
> parallel processing on different machines and parallel processing
> on multiple CPUs - especially considering NUMA architectures.
> *comes-to-think-again*...

Except that doing something in-machine is often far simpler than
trying to go cross-machine, especially when that something is a
background reader.


Let's walk before we run. :)
--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828






Re: [HACKERS] VLDB Features

2007-12-18 Thread Michał Zaborowski
2007/12/16, Tom Lane <[EMAIL PROTECTED]>:
> Hannu Krosing <[EMAIL PROTECTED]> writes:
> > But can't we _define_ such a subset, where we can do a transactionless
> > load ?
>
> Sure ... but you'll find that it's not large enough to be useful.
> Once you remove all the interesting consistency checks such as
> unique indexes and foreign keys, the COPY will tend to go through
> just fine, and then you're still stuck trying to weed out bad data
> without very good tools for it.  The only errors we could really
> separate out without subtransaction fencing are extremely trivial
> ones like too many or too few fields on a line ... which can be
> caught with a sed script.
>
I have a dump file that I would like to load as fast as possible.
Constraints will be applied at the end, so any problems can be detected there.
I would like the load to be as direct as possible and as bulk as possible - just
allocate pages and fill them with the data. Maybe it should be a different
mode - single user or so. Right now I can save some I/O - like turning off
fsync - but that is all :(

I came across something like this:
http://www.tbray.org/ongoing/When/200x/2007/10/30/WF-Results
I have no idea how to load a single file in many threads, but... the point is
that it can be much faster than a single-threaded load - surprisingly - at
least for me.

-- 
Regards,
  Michał Zaborowski (TeXXaS)



Re: [HACKERS] VLDB Features

2007-12-16 Thread NikhilS
Hi,

On Dec 15, 2007 1:14 PM, Tom Lane <[EMAIL PROTECTED]> wrote:

> NikhilS <[EMAIL PROTECTED]> writes:
> > Any errors which occur before doing the heap_insert should not require
> > any recovery according to me.
>
> A sufficient (though far from all-encompassing) rejoinder to that is
> "triggers and CHECK constraints can do anything".
>
> > The overhead of having a subtransaction per row is a very valid concern.
> > But instead of using a per-insert or a batch-insert subtransaction, I am
> > thinking that we can start off a subtransaction and continue it till we
> > encounter a failure. The moment an error is encountered, since we have
> > the offending (already in heap) tuple around, we can call a
> > simple_heap_delete on the same and commit (instead of aborting) this
> > subtransaction
>
> What of failures that occur only at (sub)transaction commit, such as
> foreign key checks?
>

What if we identify and define a subset where we could do subtransaction-based
COPY? The following could be supported:

* A subset of triggers and CHECK constraints which do not move the tuple
around. (Identifying this subset might be an issue though?)
* Primary/unique key indexes

As Hannu mentioned elsewhere in this thread, there should not be very many
instances of complex triggers/CHECKs around? And maybe in those instances
(and also in the foreign key checks case), the behaviour could default to a
subtransaction-per-row or even the existing single-transaction model?

Regards,
Nikhils
-- 
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] VLDB Features

2007-12-16 Thread Tom Lane
Hannu Krosing <[EMAIL PROTECTED]> writes:
> But can't we _define_ such a subset, where we can do a transactionless
> load ?

Sure ... but you'll find that it's not large enough to be useful.
Once you remove all the interesting consistency checks such as
unique indexes and foreign keys, the COPY will tend to go through
just fine, and then you're still stuck trying to weed out bad data
without very good tools for it.  The only errors we could really
separate out without subtransaction fencing are extremely trivial
ones like too many or too few fields on a line ... which can be
caught with a sed script.

regards, tom lane



Re: [HACKERS] VLDB Features

2007-12-16 Thread Hannu Krosing

On Sat, 2007-12-15 at 01:12, Tom Lane wrote:
> Josh Berkus <[EMAIL PROTECTED]> writes:
> > There's no way we can do a transactionless load, then?  I'm thinking of the 
> > load-into-new-partition which is a single pass/fail operation.  Would 
> > ignoring individual row errors for this case still cause these kinds of 
> > problems?
> 
> Given that COPY fires triggers and runs CHECK constraints, there is no
> part of the system that cannot be exercised during COPY.  So I think
> supposing that we can just deal with some simplified subset of reality
> is mere folly.

But can't we _define_ such a subset, where we can do a transactionless
load ?

I don't think that most DW/VLDB schemas fire complex triggers or custom
data-modifying functions inside CHECK's.

Then we could just run the remaining simple CHECK constraints ourselves
and not abort when a check fails, but just log those rows?

The COPY ... WITH ERRORS TO ... would essentially become a big
conditional RULE through which the incoming data is processed.

--
Hannu




Re: [HACKERS] VLDB Features

2007-12-16 Thread Trent Shipley
On Saturday 2007-12-15 02:14, Simon Riggs wrote:
> On Fri, 2007-12-14 at 18:22 -0500, Tom Lane wrote:
> > Neil Conway <[EMAIL PROTECTED]> writes:
> > > By modifying COPY: COPY IGNORE ERRORS or some such would instruct COPY
> > > to drop (and log) rows that contain malformed data. That is, rows with
> > > too many or too few columns, rows that result in constraint violations,
> > > and rows containing columns where the data type's input function raises
> > > an error. The last case is the only thing that would be a bit tricky to
> > > implement, I think: you could use PG_TRY() around the
> > > InputFunctionCall, but I guess you'd need a subtransaction to ensure
> > > that you reset your state correctly after catching an error.
> >
> > Yeah.  It's the subtransaction per row that's daunting --- not only the
> > cycles spent for that, but the ensuing limitation to 4G rows imported
> > per COPY.
>
> I'd suggest doing everything at block level
> - wrap each new block of data in a subtransaction
> - apply data to the table block by block (can still work with FSM).
> - apply indexes in bulk for each block, unique ones first.
>
> That then gives you a limit of more than 500 trillion rows, which should
> be enough for anyone.

Wouldn't it only give you more than 500T rows in the best case?  If it hits a 
bad row it has to back off and roll forward one row and one subtransaction at 
a time for the failed block.  So in the worst case, where there is at least 
one exception row per block, I think you would still wind up with only a 
capacity of 4G rows.



Re: [HACKERS] VLDB Features

2007-12-15 Thread Pavel Stehule
On 16/12/2007, Neil Conway <[EMAIL PROTECTED]> wrote:
> On Tue, 2007-12-11 at 19:11 -0500, Greg Smith wrote:
> > I'm curious what you feel is missing that pgloader doesn't fill that
> > requirement:  http://pgfoundry.org/projects/pgloader/
>
> For complicated ETL, I agree that using an external tool makes the most
> sense. But I think there is still merit in adding support to COPY for
> the simple case of trying to load a data file that has some corrupted,
> invalid or duplicate records.
>
> -Neil
>
>

Any simple enhancement of COPY is welcome. I have lost a lot of time on
repeated imports.

Regards
Pavel




Re: [HACKERS] VLDB Features

2007-12-15 Thread Neil Conway
On Tue, 2007-12-11 at 19:11 -0500, Greg Smith wrote:
> I'm curious what you feel is missing that pgloader doesn't fill that 
> requirement:  http://pgfoundry.org/projects/pgloader/

For complicated ETL, I agree that using an external tool makes the most
sense. But I think there is still merit in adding support to COPY for
the simple case of trying to load a data file that has some corrupted,
invalid or duplicate records.

-Neil





Re: [HACKERS] VLDB Features

2007-12-15 Thread Simon Riggs
On Fri, 2007-12-14 at 18:22 -0500, Tom Lane wrote:
> Neil Conway <[EMAIL PROTECTED]> writes:
> > By modifying COPY: COPY IGNORE ERRORS or some such would instruct COPY
> > to drop (and log) rows that contain malformed data. That is, rows with
> > too many or too few columns, rows that result in constraint violations,
> > and rows containing columns where the data type's input function raises
> > an error. The last case is the only thing that would be a bit tricky to
> > implement, I think: you could use PG_TRY() around the InputFunctionCall,
> > but I guess you'd need a subtransaction to ensure that you reset your
> > state correctly after catching an error.
> 
> Yeah.  It's the subtransaction per row that's daunting --- not only the
> cycles spent for that, but the ensuing limitation to 4G rows imported
> per COPY.

I'd suggest doing everything at block level
- wrap each new block of data in a subtransaction
- apply data to the table block by block (can still work with FSM). 
- apply indexes in bulk for each block, unique ones first. 

That then gives you a limit of more than 500 trillion rows, which should
be enough for anyone.
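
Purely as an illustration of that control flow, a toy Python sketch (invented
names throughout; nothing below is actual backend code):

    # Group incoming rows into blocks, wrap each block in one "subtransaction",
    # write the heap rows, then apply the indexes for the whole block in bulk,
    # unique indexes first so violations surface before any other work.

    class UniqueViolation(Exception):
        pass

    def load_block_wise(rows, block_size, unique_index, heap):
        failed_blocks = []
        for start in range(0, len(rows), block_size):
            block = rows[start:start + block_size]
            keys = [r[0] for r in block]
            try:
                # bulk unique-index check for the whole block: duplicates
                # within the block, or against keys already committed
                if len(set(keys)) != len(keys) or unique_index & set(keys):
                    raise UniqueViolation
                unique_index |= set(keys)
                heap.extend(block)            # "commit" this block's subtransaction
            except UniqueViolation:
                failed_blocks.append(block)   # rolled back; retry row by row later
        return failed_blocks

    heap, uniq = [], set()
    rows = [(i, "x") for i in range(10)] + [(3, "dup")] + [(i, "y") for i in range(10, 15)]
    failed = load_block_wise(rows, 4, uniq, heap)
    print(len(heap), "rows committed block-wise;", len(failed), "block(s) left to retry")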

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com




Re: [HACKERS] VLDB Features

2007-12-14 Thread Tom Lane
NikhilS <[EMAIL PROTECTED]> writes:
> Any errors which occur before doing the heap_insert should not require
> any recovery according to me.

A sufficient (though far from all-encompassing) rejoinder to that is
"triggers and CHECK constraints can do anything".

> The overhead of having a subtransaction per row is a very valid concern. But
> instead of using a per-insert or a batch-insert subtransaction, I am
> thinking that we can start off a subtransaction and continue it till we
> encounter a failure.

What of failures that occur only at (sub)transaction commit, such as
foreign key checks?

regards, tom lane



Re: [HACKERS] VLDB Features

2007-12-14 Thread NikhilS
Hi,

>
> Another approach would be to distinguish between errors that require a
> subtransaction to recover to a consistent state, and less serious errors
> that don't have this requirement (e.g. invalid input to a data type
> input function). If all the errors that we want to tolerate during a
> bulk load fall into the latter category, we can do without
> subtransactions.
>

I think errors which occur after we have done a fast_heap_insert of the
tuple generated from the current input row are the ones which would require
the subtransaction to recover. Examples could be unique/primary key
violation errors or FKey/trigger-related errors. Any errors which occur
before doing the heap_insert should not, in my view, require any recovery.

The overhead of having a subtransaction per row is a very valid concern. But
instead of using a per-insert or a batch-insert subtransaction, I am
thinking that we can start off a subtransaction and continue it till we
encounter a failure. The moment an error is encountered, since we have the
offending (already in heap) tuple around, we can call a simple_heap_delete
on the same and commit (instead of aborting) this subtransaction after doing
some minor cleanup. The current input data row can also be logged into a
bad file. Recall that we only need to handle those errors in which the
simple_heap_insert is successful, but the index insertion or the after-row
insert trigger causes an error. The rest of the load can then go ahead with
the start of a new subtransaction.
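
To make the intended control flow concrete, here is a toy sketch of that loop
(illustrative Python with invented stand-ins for the backend machinery; this
is not PostgreSQL code):

    # Keep one subtransaction open; on the first error, delete just the
    # offending (already inserted) tuple, commit the work done so far, log
    # the bad row, and carry on in a fresh subtransaction.

    class UniqueViolation(Exception):
        pass

    heap = []                 # stands in for the table heap
    unique_index = set()      # stands in for a unique index on column 0
    bad_rows = []             # the "bad file"

    def heap_insert(row):
        heap.append(row)
        return len(heap) - 1  # "tuple id"

    def index_insert(tid, row):
        if row[0] in unique_index:
            raise UniqueViolation(row[0])
        unique_index.add(row[0])

    def simple_heap_delete(tid):
        heap[tid] = None      # mark the offending tuple dead

    def copy_rows(rows):
        committed = 0
        pending = []          # tids inserted in the currently open "subtransaction"
        for row in rows:
            tid = heap_insert(row)
            try:
                index_insert(tid, row)
                pending.append(tid)
            except UniqueViolation:
                simple_heap_delete(tid)      # undo just the offending tuple
                bad_rows.append(row)         # ... and log the input row
                committed += len(pending)    # "commit" instead of aborting
                pending = []                 # start a new subtransaction
        committed += len(pending)
        return committed

    rows = [(1, 'a'), (2, 'b'), (2, 'dup'), (3, 'c')]
    print(copy_rows(rows), "rows loaded,", len(bad_rows), "rejected")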

Regards,
Nikhils
-- 
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] VLDB Features

2007-12-14 Thread Tom Lane
Josh Berkus <[EMAIL PROTECTED]> writes:
> There's no way we can do a transactionless load, then?  I'm thinking of the 
> load-into-new-partition which is a single pass/fail operation.  Would 
> > ignoring individual row errors for this case still cause these kinds of 
> problems?

Given that COPY fires triggers and runs CHECK constraints, there is no
part of the system that cannot be exercised during COPY.  So I think
supposing that we can just deal with some simplified subset of reality
is mere folly.

regards, tom lane



Re: [HACKERS] VLDB Features

2007-12-14 Thread Trent Shipley
On Friday 2007-12-14 16:22, Tom Lane wrote:
> Neil Conway <[EMAIL PROTECTED]> writes:
> > By modifying COPY: COPY IGNORE ERRORS or some such would instruct COPY
> > to drop (and log) rows that contain malformed data. That is, rows with
> > too many or too few columns, rows that result in constraint violations,
> > and rows containing columns where the data type's input function raises
> > an error. The last case is the only thing that would be a bit tricky to
> > implement, I think: you could use PG_TRY() around the InputFunctionCall,
> > but I guess you'd need a subtransaction to ensure that you reset your
> > state correctly after catching an error.
>
> Yeah.  It's the subtransaction per row that's daunting --- not only the
> cycles spent for that, but the ensuing limitation to 4G rows imported
> per COPY.

You could extend the COPY FROM syntax with a COMMIT EVERY n clause.  This 
would help with the 4G subtransaction limit.  The cost to the ETL process is 
that a simple rollback would not be guaranteed to send the process back to its 
initial state.  There are easy ways to deal with the rollback issue, though.  

A {NO} RETRY {USING algorithm} clause might be useful.   If the NO RETRY 
option is selected then the COPY FROM can run without subtransactions and in 
excess of the 4G per transaction limit.  NO RETRY should be the default since 
it preserves the legacy behavior of COPY FROM.

You could have an EXCEPTIONS TO {filename|STDERR} clause. I would not give the 
option of sending exceptions to a table, since they are presumably malformed; 
otherwise they would not be exceptions.  (Users should re-process exception 
files if they want an "if good then table A, else exceptions to table B" flow ...)

EXCEPTIONS TO and NO RETRY would be mutually exclusive.


> If we could somehow only do a subtransaction per failure, things would
> be much better, but I don't see how.





Re: [HACKERS] VLDB Features

2007-12-14 Thread Josh Berkus
Tom,

> I think such an approach is doomed to hopeless unreliability.  There is
> no concept of an error that doesn't require a transaction abort in the
> system now, and that doesn't seem to me like something that can be
> successfully bolted on after the fact.  Also, there's a lot of
> bookkeeping (eg buffer pins) that has to be cleaned up regardless of the
> exact nature of the error, and all those mechanisms are hung off
> transactions.

There's no way we can do a transactionless load, then?  I'm thinking of the 
load-into-new-partition which is a single pass/fail operation.  Would 
ignoring individual row errors for this case still cause these kinds of 
problems?

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [HACKERS] VLDB Features

2007-12-14 Thread Tom Lane
Neil Conway <[EMAIL PROTECTED]> writes:
> One approach would be to essentially implement the pg_bulkloader
> approach inside the backend. That is, begin by doing a subtransaction
> for every k rows (with k = 1000, say). If you get any errors, then
> either repeat the process with k/2 until you locate the individual
> row(s) causing the trouble, or perhaps just immediately switch to k = 1.
> Fairly ugly though, and would be quite slow for data sets with a high
> proportion of erroneous data.

You could make it self-tuning, perhaps: initially, or after an error,
set k = 1, and increase k after a successful set of rows.
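
Something like this toy controller, purely as an illustration (my own sketch,
not anything that exists in the backend):

    # Self-tuning batch size: reset to 1 after any failure, grow geometrically
    # after each clean batch, up to some cap.

    def tune_batch_size(outcomes, cap=1024):
        """outcomes is an iterable of booleans: did the last batch commit cleanly?"""
        k = 1
        sizes = []
        for ok in outcomes:
            sizes.append(k)
            k = min(k * 2, cap) if ok else 1
        return sizes

    # After two clean batches and one failure, the batch size drops back to 1:
    print(tune_batch_size([True, True, False, True, True]))  # [1, 2, 4, 1, 2]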

> Another approach would be to distinguish between errors that require a
> subtransaction to recover to a consistent state, and less serious errors
> that don't have this requirement (e.g. invalid input to a data type
> input function). If all the errors that we want to tolerate during a
> bulk load fall into the latter category, we can do without
> subtransactions.

I think such an approach is doomed to hopeless unreliability.  There is
no concept of an error that doesn't require a transaction abort in the
system now, and that doesn't seem to me like something that can be
successfully bolted on after the fact.  Also, there's a lot of
bookkeeping (eg buffer pins) that has to be cleaned up regardless of the
exact nature of the error, and all those mechanisms are hung off
transactions.

regards, tom lane



Re: [HACKERS] VLDB Features

2007-12-14 Thread Neil Conway
On Fri, 2007-12-14 at 18:22 -0500, Tom Lane wrote:
> If we could somehow only do a subtransaction per failure, things would
> be much better, but I don't see how.

One approach would be to essentially implement the pg_bulkloader
approach inside the backend. That is, begin by doing a subtransaction
for every k rows (with k = 1000, say). If you get any errors, then
either repeat the process with k/2 until you locate the individual
row(s) causing the trouble, or perhaps just immediately switch to k = 1.
Fairly ugly though, and would be quite slow for data sets with a high
proportion of erroneous data.
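
A self-contained toy of that batch-and-bisect shape, purely for illustration
(this is neither pg_bulkloader nor proposed backend code):

    # Try to load k rows in one "subtransaction"; if it fails, split the batch
    # and recurse until the bad row(s) are isolated, then skip just those.

    def try_load(batch, table):
        """Simulates a subtransaction: all-or-nothing insert of a batch."""
        if any(row is None for row in batch):   # None stands in for a bad row
            raise ValueError("bad row in batch")
        table.extend(batch)

    def load_with_bisect(rows, table, rejects, k=1000):
        for start in range(0, len(rows), k):
            bisect_load(rows[start:start + k], table, rejects)

    def bisect_load(batch, table, rejects):
        try:
            try_load(batch, table)
        except ValueError:
            if len(batch) == 1:
                rejects.append(batch[0])         # isolated the offending row
            else:
                mid = len(batch) // 2            # retry each half separately
                bisect_load(batch[:mid], table, rejects)
                bisect_load(batch[mid:], table, rejects)

    table, rejects = [], []
    rows = list(range(10)) + [None] + list(range(10, 20))
    load_with_bisect(rows, table, rejects, k=8)
    print(len(table), "loaded,", len(rejects), "rejected")   # 20 loaded, 1 rejected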

Another approach would be to distinguish between errors that require a
subtransaction to recover to a consistent state, and less serious errors
that don't have this requirement (e.g. invalid input to a data type
input function). If all the errors that we want to tolerate during a
bulk load fall into the latter category, we can do without
subtransactions.

-Neil





Re: [HACKERS] VLDB Features

2007-12-14 Thread Tom Lane
Neil Conway <[EMAIL PROTECTED]> writes:
> By modifying COPY: COPY IGNORE ERRORS or some such would instruct COPY
> to drop (and log) rows that contain malformed data. That is, rows with
> too many or too few columns, rows that result in constraint violations,
> and rows containing columns where the data type's input function raises
> an error. The last case is the only thing that would be a bit tricky to
> implement, I think: you could use PG_TRY() around the InputFunctionCall,
> but I guess you'd need a subtransaction to ensure that you reset your
> state correctly after catching an error.

Yeah.  It's the subtransaction per row that's daunting --- not only the
cycles spent for that, but the ensuing limitation to 4G rows imported
per COPY.

If we could somehow only do a subtransaction per failure, things would
be much better, but I don't see how.

regards, tom lane



Re: [HACKERS] VLDB Features

2007-12-14 Thread Andrew Dunstan



Neil Conway wrote:
> On Fri, 2007-12-14 at 14:48 +0200, Hannu Krosing wrote:
>> How did you do it?
>>
>> Did you enhance the COPY command or was it something completely new?
>
> By modifying COPY: COPY IGNORE ERRORS or some such would instruct COPY
> to drop (and log) rows that contain malformed data. That is, rows with
> too many or too few columns, rows that result in constraint violations,
> and rows containing columns where the data type's input function raises
> an error. The last case is the only thing that would be a bit tricky to
> implement, I think: you could use PG_TRY() around the InputFunctionCall,
> but I guess you'd need a subtransaction to ensure that you reset your
> state correctly after catching an error.

Ideally I think you would put the failing input line in another table, 
or maybe another file. If a table, it would probably have to be as bytea.


cheers

andrew



Re: [HACKERS] VLDB Features

2007-12-14 Thread Neil Conway
On Fri, 2007-12-14 at 14:48 +0200, Hannu Krosing wrote:
> How did you do it ?
> 
> Did you enhance the COPY command or was it something completely new?

By modifying COPY: COPY IGNORE ERRORS or some such would instruct COPY
to drop (and log) rows that contain malformed data. That is, rows with
too many or too few columns, rows that result in constraint violations,
and rows containing columns where the data type's input function raises
an error. The last case is the only thing that would be a bit tricky to
implement, I think: you could use PG_TRY() around the InputFunctionCall,
but I guess you'd need a subtransaction to ensure that you reset your
state correctly after catching an error.
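
For illustration, here's a client-side toy of that row-level drop-and-log
behaviour (plain Python; the real thing would live in the backend and need
the PG_TRY()/subtransaction dance just described):

    # Parse each line; instead of aborting the whole load on the first bad
    # row, drop it and log it.  This only shows the row-level try/except shape.

    import csv, io

    def parse_row(fields, ncols):
        if len(fields) != ncols:
            raise ValueError("wrong number of columns")
        # datatype "input functions": column 0 int, column 1 text
        return (int(fields[0]), fields[1])

    def copy_ignore_errors(data, ncols=2):
        good, bad = [], []
        for lineno, fields in enumerate(csv.reader(io.StringIO(data), delimiter='\t'), 1):
            try:
                good.append(parse_row(fields, ncols))
            except ValueError as e:
                bad.append((lineno, fields, str(e)))   # log instead of failing
        return good, bad

    data = "1\talice\n2\tbob\nnot-an-int\tcarol\n3\tdave\textra\n4\teve\n"
    good, bad = copy_ignore_errors(data)
    print(len(good), "rows accepted;", len(bad), "rows logged as bad")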

-Neil





Re: [HACKERS] VLDB Features

2007-12-14 Thread Hannu Krosing

On Tue, 2007-12-11 at 15:41, Neil Conway wrote:
> On Tue, 2007-12-11 at 10:53 -0800, Josh Berkus wrote:
> > Just so you don't lose sight of it, one of the biggest VLDB features we're 
> > missing is fault-tolerant bulk load.
> 
> I actually had to cook up a version of this for Truviso recently. I'll
> take a look at submitting a cleaned-up implementation for 8.4.

How did you do it?

Did you enhance the COPY command or was it something completely new?

---
Hannu





Re: [HACKERS] VLDB Features

2007-12-13 Thread Markus Schiltknecht

Hello Gregory,

Gregory Stark wrote:
> Oracle is using Direct I/O so they need the reader and writer threads to avoid
> blocking on i/o all the time. We count on the OS doing readahead and buffering
> our writes so we don't have to. Direct I/O and needing some way to do
> asynchronous writes and reads are directly tied.

Yeah, except in cases where we can tell ahead of time that reads will be
non-sequential. Which admittedly doesn't come up too frequently and can
probably be handled with posix_fadvise - as you are currently testing.

> Where Parallel query is useful is when you have queries that involve a
> substantial amount of cpu resources, especially if you have a very fast I/O
> system which can saturate the bandwidth to a single cpu.


Full ACK, the very same applies to parallel querying on shared-nothing 
clusters. Those can help if the bandwidth to all processing cores 
together becomes the bottleneck (and the resulting data is relatively 
small compared to the input data).


For example, Sun's UltraSparc T2 features only 8 PCIe lanes for those 8 
cores, so you end up with 250 MiB/sec per core or about 32 MiB/sec per 
thread on average. To be fair: their 10 Gig Ethernet ports don't go via 
PCIe, so you get an additional 2x 1 GiB/sec for the complete chip. And 
memory bandwidth looks a lot better: Sun claims 60+ GiB/sec, leaving 
almost 8 GiB/sec per core or 1 GiB/sec per thread.


If my calculations for Intel are correct, a Quad Xeon with a 1.33 GHz 
FSB has around 21 GiB/sec throughput to main memory, giving 5 GiB/sec 
per core. (Why are these numbers so hard to find? It looks like Intel 
deliberately obfuscates them with FSB MHz or Giga-transactions per sec 
and the like.)
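
For what it's worth, here is the back-of-the-envelope arithmetic behind those
per-core and per-thread figures, with my own assumptions spelled out (PCIe 1.x
at ~250 MiB/s per lane, the T2 at 8 cores x 8 threads, and the Intel number
presumably coming from two independent 1333 MT/s, 8-byte-wide front-side buses):

    lanes, cores, threads_per_core = 8, 8, 8

    io_total = lanes * 250                            # ~2000 MiB/s for the whole chip
    print(io_total / cores)                           # ~250 MiB/s per core
    print(io_total / (cores * threads_per_core))      # ~31 MiB/s per thread

    mem = 60.0                                        # Sun's claimed GiB/s to memory
    print(mem / cores)                                # ~7.5 GiB/s per core
    print(mem / (cores * threads_per_core))           # ~0.94 GiB/s per thread

    fsb = 1.333 * 8                                   # ~10.7 GB/s per front-side bus
    print(2 * fsb)                                    # ~21 GB/s aggregate
    print(2 * fsb / 4)                                # ~5.3 GB/s per core, quad-core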


Regards

Markus




Re: [HACKERS] VLDB Features

2007-12-12 Thread Gregory Stark
"Josh Berkus" <[EMAIL PROTECTED]> writes:

> Markus,
>
>> > Parallel Query
>>
>> Uh.. this only makes sense in a distributed database, no? I've thought
>> about parallel querying on top of Postgres-R. Does it make sense
>> implementing some form of parallel querying apart from the distribution
>> or replication engine?

Yes, but not for the reasons Josh describes.

> I'd say implementing a separate I/O worker would be the first step towards 
> this; if we could avoid doing I/O in the same process/thread where we're 
> doing row parsing it would speed up large scans by 100%.  I know Oracle does 
> this, and their large-table-I/O is 30-40% faster than ours despite having 
> less efficient storage.

Oracle is using Direct I/O so they need the reader and writer threads to avoid
blocking on i/o all the time. We count on the OS doing readahead and buffering
our writes so we don't have to. Direct I/O and needing some way to do
asynchronous writes and reads are directly tied.

Where Parallel query is useful is when you have queries that involve a
substantial amount of cpu resources, especially if you have a very fast I/O
system which can saturate the bandwidth to a single cpu.

So for example if you have a merge join which requires sorting both sides of
the query you could easily have subprocesses handle those sorts allowing you
to bring two processors to bear on the problem instead of being limited to a
single processor.

On Oracle, Parallel Query goes great with partitioned tables. Their query
planner will almost always turn the partition scans into parallel scans and
use separate processors to scan different partitions. 

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [HACKERS] VLDB Features

2007-12-12 Thread Joshua D. Drake


> Greenplum as well as other Real Life stuff.

For those of us here who have no idea what you are talking about, can
you define what "Real Life" is like?

Joshua D. Drake



-- 
The PostgreSQL Company: Since 1997, http://www.commandprompt.com/ 
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD'





Re: [HACKERS] VLDB Features

2007-12-12 Thread Gavin Sherry
On Wed, Dec 12, 2007 at 08:26:16PM +0100, Markus Schiltknecht wrote:
> >>Isn't Gavin Sherry working on this? Haven't read anything from him
> >>lately...
> >
> >Me neither.  Swallowed by Greenplum and France.
> 
> Hm.. good for him, I guess!

Yes, I'm around -- just extremely busy with a big release at Greenplum as 
well as other Real Life stuff.

Thanks,

Gavin



Re: [HACKERS] VLDB Features

2007-12-12 Thread Markus Schiltknecht

Hi Josh,

Josh Berkus wrote:
> Sure.  Imagine you have a 5TB database on a machine with 8 cores and only one
> concurrent user.  You'd like to have 1 core doing I/O, and say 4-5 cores
> dividing the scan and join processing into 4-5 chunks.

Ah, right, thanks for the enlightenment. Heck, I'm definitely too focused on 
replication and distributed databases :-)

However, there's certainly a great deal of intersection between 
parallel processing on different machines and parallel processing on 
multiple CPUs - especially considering NUMA architectures. 
*comes-to-think-again*...

>> Isn't Gavin Sherry working on this? Haven't read anything from him
>> lately...
>
> Me neither.  Swallowed by Greenplum and France.

Hm.. good for him, I guess!

Regards

Markus



Re: [HACKERS] VLDB Features

2007-12-12 Thread Josh Berkus
Markus,

> > Parallel Query
>
> Uh.. this only makes sense in a distributed database, no? I've thought
> about parallel querying on top of Postgres-R. Does it make sense
> implementing some form of parallel querying apart from the distribution
> or replication engine?

Sure.  Imagine you have a 5TB database on a machine with 8 cores and only one 
concurrent user.  You'd like to have 1 core doing I/O, and say 4-5 cores 
dividing the scan and join processing into 4-5 chunks.

I'd say implementing a separate I/O worker would be the first step towards 
this; if we could avoid doing I/O in the same process/thread where we're 
doing row parsing it would speed up large scans by 100%.  I know Oracle does 
this, and their large-table-I/O is 30-40% faster than ours despite having 
less efficient storage.

Maybe Greenplum or EnterpriseDB will contribute something.  ;-)

> > Windowing Functions
>
> Isn't Gavin Sherry working on this? Haven't read anything from him
> lately...

Me neither.  Swallowed by Greenplum and France.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [HACKERS] VLDB Features

2007-12-12 Thread Markus Schiltknecht

Hi,

Josh Berkus wrote:
> Here's the other VLDB features we're missing:
>
> Parallel Query

Uh.. this only makes sense in a distributed database, no? I've thought 
about parallel querying on top of Postgres-R. Does it make sense 
implementing some form of parallel querying apart from the distribution 
or replication engine?

> Windowing Functions

Isn't Gavin Sherry working on this? Haven't read anything from him lately...

> Parallel Index Build (not sure how this works exactly, but it speeds Oracle
> up considerably)

Sounds interesting *turns-away-to-google*

> On-disk Bitmap Index (anyone game to finish GP patch?)

Anybody having an idea of what's missing there (besides good use cases, 
which some people doubt)? Again: Gavin?

> Simon, we should start a VLDB-Postgres developer wiki page.

Thanks, Simon, the wiki page looks good!

Regards

Markus




Re: [HACKERS] VLDB Features

2007-12-12 Thread Simon Riggs
On Tue, 2007-12-11 at 15:31 -0800, Josh Berkus wrote:

> Simon, we should start a VLDB-Postgres developer wiki page.

http://developer.postgresql.org/index.php/DataWarehousing

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com




Re: [HACKERS] VLDB Features

2007-12-12 Thread Dimitri Fontaine
Hi,

On Wednesday, December 12, 2007, Josh Berkus wrote:
> > I'm curious what you feel is missing that pgloader doesn't fill that
> > requirement:  http://pgfoundry.org/projects/pgloader/
>
> Because pgloader is implemented in middleware, it carries a very high
> overhead if you have bad rows.  As little as 1% bad rows will slow down
> loading by 20% due to retries.

Not that much, in fact, I'd say.
pgloader allows its user to configure how large a COPY buffer to use (global 
parameter as of now, could easily be a per-section configuration knob, just 
didn't see any need for this yet).
It's the 'copy_every' parameter as seen on the man page here:
  http://pgloader.projects.postgresql.org/#toc4

pgloader will obviously prepare an in-memory buffer of copy_every tuples to 
give to COPY, and in case of error will split it and retry. Classic dichotomy 
approach, from the initial implementation by Jan Wieck.

So you can easily balance the error recovery costs against the COPY bulk size.

Note also that the overall loading time with pgloader does not scale the same 
way as the COPY buffer size; the optimal choice depends on the dataset --- and 
on the data massaging pgloader has to do on it --- and I've experienced the best 
results with 1 and 15000 tuple buffers so far.

FYI, now that the pgloader topic is on the table, the next items I think I'm 
going to develop for it are configurable behaviour for error tuples (e.g. load 
them into another table on a PK error), and some limited DDL-partitioning support.

I'm playing with the idea of pgloader being able to read some partitioning 
schemes (parsing the CHECK constraints on inherited tables) and load directly 
into the right partitions.
That would of course be done only when configured this way, and if constraints 
are misread it would only result in a lot more rejected rows than expected, 
and you could still retry using your insert trigger instead of pgloader's buggy 
smartness.
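
Roughly, the routing step could look something like this (a minimal sketch
with an invented partition map; this is not pgloader code):

    # Assume the CHECK constraints of the child tables have already been
    # parsed into simple half-open ranges.

    partitions = {
        # child table -> date range taken from its CHECK constraint
        "sales_2007_11": ("2007-11-01", "2007-12-01"),
        "sales_2007_12": ("2007-12-01", "2008-01-01"),
    }

    def route(row):
        """Return the child table whose CHECK range accepts the row's date."""
        date = row[0]
        for table, (lo, hi) in partitions.items():
            if lo <= date < hi:
                return table
        return None          # no partition matched -> treat as a rejected row

    buffers = {t: [] for t in partitions}
    rejects = []
    for row in [("2007-11-15", 10), ("2007-12-03", 20), ("2008-02-01", 30)]:
        target = route(row)
        (buffers[target] if target else rejects).append(row)

    for table, rows in buffers.items():
        print(table, "<-", len(rows), "row(s)")   # each buffer would be COPYed separately
    print(len(rejects), "rejected row(s)")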

Comments welcome, regards,
-- 
dim




Re: [HACKERS] VLDB Features

2007-12-11 Thread Josh Berkus
Greg,

> I'm curious what you feel is missing that pgloader doesn't fill that
> requirement:  http://pgfoundry.org/projects/pgloader/

Because pgloader is implemented in middleware, it carries a very high overhead 
if you have bad rows.  As little as 1% bad rows will slow down loading by 20% 
due to retries.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [HACKERS] VLDB Features

2007-12-11 Thread Greg Smith

On Tue, 11 Dec 2007, Josh Berkus wrote:
> Just so you don't lose sight of it, one of the biggest VLDB features we're
> missing is fault-tolerant bulk load.  Unfortunately, I don't know anyone
> who's working on it.

I'm curious what you feel is missing that pgloader doesn't fill that 
requirement:  http://pgfoundry.org/projects/pgloader/


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] VLDB Features

2007-12-11 Thread Simon Riggs
On Tue, 2007-12-11 at 15:31 -0800, Josh Berkus wrote:

> Here's the other VLDB features we're missing:
> 
> Parallel Query
> Windowing Functions
> Parallel Index Build (not sure how this works exactly, but it speeds Oracle 
> up considerably)
> On-disk Bitmap Index (anyone game to finish GP patch?)

I would call those VLDB Data Warehousing features to differentiate
between that and the use of VLDBs for other purposes.

I'd add Materialized View support in the planner, as well as saying it's
more important than parallel query, IMHO. MVs are to DW what indexes are
to OLTP. It's the same as indexes vs. seqscan; you can speed up the seq
scan or you can avoid it. Brute force is cool, but being smarter is even
better. 

The reason they don't normally show up high on anybody's feature list is
that the TPC benchmarks specifically disallow them, which, as I once
observed, is very good support for them being a useful feature in
practice. (Oracle originally brought out MV support as a way of
improving their TPC scores at a time when Teradata was wiping the floor
with its parallel query implementation.)

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com




Re: [HACKERS] VLDB Features

2007-12-11 Thread Neil Conway
On Tue, 2007-12-11 at 10:53 -0800, Josh Berkus wrote:
> Just so you don't lose sight of it, one of the biggest VLDB features we're 
> missing is fault-tolerant bulk load.

I actually had to cook up a version of this for Truviso recently. I'll
take a look at submitting a cleaned-up implementation for 8.4.

-Neil





Re: [HACKERS] VLDB Features

2007-12-11 Thread Josh Berkus
Hannu,

> COPY ... WITH ERRORS TO ...

Yeah, that's a start.

> or something more advanced, like bulkload which can be continued after
> crash ?

Well, we could also use a loader which automatically parallelizes, but that 
functionality can be done at the middleware level.  WITH ERRORS is the 
most critical part.

Here's the other VLDB features we're missing:

Parallel Query
Windowing Functions
Parallel Index Build (not sure how this works exactly, but it speeds Oracle 
up considerably)
On-disk Bitmap Index (anyone game to finish GP patch?)

Simon, we should start a VLDB-Postgres developer wiki page.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [HACKERS] VLDB Features

2007-12-11 Thread Simon Riggs
On Tue, 2007-12-11 at 10:53 -0800, Josh Berkus wrote:
> Simon.
> 
> > VLDB Features I'm expecting to work on are
> > - Read Only Tables/WORM tables
> > - Advanced Partitioning
> > - Compression
> > plus related performance features
> 
> Just so you don't lose sight of it, one of the biggest VLDB features we're 
> missing is fault-tolerant bulk load.  Unfortunately, I don't know anyone 
> who's working on it.

Not lost sight of it; I have a design, but I have to prioritise also.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com




Re: [HACKERS] VLDB Features

2007-12-11 Thread Hannu Krosing

On Tue, 2007-12-11 at 10:53, Josh Berkus wrote:
> Simon.
> 
> > VLDB Features I'm expecting to work on are
> > - Read Only Tables/WORM tables
> > - Advanced Partitioning
> > - Compression
> > plus related performance features
> 
> Just so you don't lose sight of it, one of the biggest VLDB features we're 
> missing is fault-tolerant bulk load. 

What do you mean by fault-tolerant here?

Just

COPY ... WITH ERRORS TO ...

or something more advanced, like a bulk load which can be continued after
a crash?

--
Hannu





Re: [HACKERS] VLDB Features

2007-12-11 Thread Josh Berkus
Simon.

> VLDB Features I'm expecting to work on are
> - Read Only Tables/WORM tables
> - Advanced Partitioning
> - Compression
> plus related performance features

Just so you don't lose sight of it, one of the biggest VLDB features we're 
missing is fault-tolerant bulk load.  Unfortunately, I don't know anyone 
who's working on it.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco



[HACKERS] VLDB Features

2007-12-11 Thread Simon Riggs
I'm starting work on next projects for 8.4.

Many applications have the need to store very large data volumes for
both archival and analysis. The analytic databases are commonly known as
Data Warehouses, though there isn't a common term for large archival
data stores. The use cases for those can often be blurred and many
people see those as only one use case. My initial interest is in the
large archival data stores.

One of the main issues to be faced is simply data maintenance and
management. Loading, deleting, vacuuming data all takes time. Those
issues relate mainly to the size of the data store rather than any
particular workload, so I'm calling that set of required features "Very
Large Database" (or VLDB) features.

VLDB Features I'm expecting to work on are
- Read Only Tables/WORM tables
- Advanced Partitioning
- Compression
plus related performance features

Details of those will be covered in separate mails over next few weeks
and months. So just to let everybody know that's where I'm headed, so
you see the big picture with me.

I'll be working on other projects as well, many of which I've listed
here: http://developer.postgresql.org/index.php/Simon_Riggs%27_Development_Projects
I expect the list is too long to complete for
8.4, but I'm allowing for various issues arising during development.

So specific discussion on other mails as they arrive, please.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

