Bruce,
>
> I have no idea what you are talking about. Again, give me facts about
> what we currently don't do and what you want to do.
Currently:
- No statement of multi-byte control character format
- No tests to define or prove "works flawlessly" or identify when something
breaks the current
Luke Lonergan wrote:
> Bruce,
>
> > We have two and three-byte encodings, so 16-bit seems like it wouldn't
> > work. I am not aware of any specs except the C code itself.
>
> Ok - no problem.
>
> How about test data and cases? I see the SQL encoding examples used in
> src/test/regress/sql for
Bruce,
> We have two and three-byte encodings, so 16-bit seems like it wouldn't
> work. I am not aware of any specs except the C code itself.
Ok - no problem.
How about test data and cases? I see the SQL encoding examples used in
src/test/regress/sql for testing encoding in SQL, but are there
Luke Lonergan wrote:
> Bruce,
>
> Is there a good source of multi-byte copy data test cases? What is
> currently done to test the trans-coding support? (where client and server
> encodings are different)
>
> I notice that the regression data in the CVS version of postgres does not
> seem to incl
Bruce,
Is there a good source of multi-byte copy data test cases? What is
currently done to test the trans-coding support? (where client and server
encodings are different)
I notice that the regression data in the CVS version of postgres does not
seem to include cases other than the ASCII data,
Alon Goldshuv wrote:
> Bruce,
>
> Point taken.
>
> > Now, you can argue that
> > a different escape should be possible, or that some other escape syntax
> > could be used, but the existing mechanism is clearly 100% reliable when
> > used properly and not broken.
>
> I think that having an optio
Bruce,
Point taken.
> Now, you can argue that
> a different escape should be possible, or that some other escape syntax
> could be used, but the existing mechanism is clearly 100% reliable when
> used properly and not broken.
I think that having an option for another escape syntax (such as usin
Alon Goldshuv wrote:
> >I've been following this thread, and I'm a little confused. Could you
> possibly clarify what you mean, by providing a couple of lines of input
> as it would be formatted with escape processing turned off - containing
> a text field with an embedded newline and tab and a nul
"Luke Lonergan" <[EMAIL PROTECTED]> writes:
> In the data warehousing industry, data conversion and manipulation is
> normally kept distinct from data loading.
It's a bit strange to call this conversion or manipulation. One way or another
you have to escape whatever your delimiters are. How wo
>> 2) A modified command syntax for introducing a direct single row error
>> handling. By direct I mean - a row that if rejected from within the COPY
>> command context does not throw an error and rollsback the whole transaction.
>> Instead the error is caught and recorded elsewhere, maybe in some
Steve,
> I can only think of one where it's common. Windows filenames.
Nearly all weblog data then.
> But if
> you're going to support arbitrary data in a load then whatever escape
> character you choose will appear sometimes.
If we allow an 8-bit character set in the "text" file, then yes, any
On Thu, Jun 02, 2005 at 07:33:13AM -0700, Luke Lonergan wrote:
> Oliver,
>
> > Haven't you just replaced one preprocessing step with another, then?
>
> Generally not. The most common problem with the current choice of escape
> character is that there are *lots* of data load scenarios with backsl
Oliver,
> Haven't you just replaced one preprocessing step with another, then?
Generally not. The most common problem with the current choice of escape
character is that there are *lots* of data load scenarios with backslash in
the text strings. The extra preprocessing to escape them is unneces
Andreas,
>> Escape processing would proceed as before, but the semantics would change to
>> allow the use of different characters as the escape character, in addition
>> to the special characters for delimiter and newline.
>
> If you mean syntax to specify escape and delimiter (and newline ?),
>
> Escape processing would proceed as before, but the semantics would change to
> allow the use of different characters as the escape character, in addition
> to the special characters for delimiter and newline.
If you mean syntax to specify escape and delimiter (and newline ?),
that is a great a
On K, 2005-06-01 at 11:31 -0400, Tom Lane wrote:
> Hannu Krosing <[EMAIL PROTECTED]> writes:
> > I think this should be a decision done when creating a table, just like
> > TEMP tables. So you always know if a certain table is or is not
> > safe/replicated/recoverable.
> > This has also the advanta
On K, 2005-06-01 at 18:05 -0400, Bruce Momjian wrote:
> What we could do is to do no-WAL automatically for empty tables (like
> when a database is first loaded),
You forget that some databases use WAL for PITR / replication and doing
it automatically there would surely mess up their replica.
How
Alon Goldshuv wrote:
> I think that the basic issue is that there are some database users that would
> like to take their data and put it into the database without pre-processing
> it [...]
> The only responsibility of these users is to explicitly escape any delimiter
> or 0x0A (LF) characters
On Thu, Jun 02, 2005 at 12:30:01AM -0400, Alon Goldshuv wrote:
> before that, let me just clarify that the performance improvements in
> the patch have nothing to do with the escaping mechanizm. Escapes
> could change. The performance gain in due to a buffered processing
> with minimal line/attrib
Steve,
Oops. Example below should have read differently:
> Sample with 2 identical rows (with binary representations depicted between
> <>):
> Blahblah<0xaa>blahblah<0x09>blahblah<0x00>blahblah<0xaa>Blahblah<0xaa>blahbl
> ah<0x09>blahblah<0x00>blahblah<0xaa>
Blahblah<0x0a>blahblah<0x09>blahbla
eve Atkins
Sent: Wed 6/1/2005 10:47 PM
To: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] NOLOGGING option, or ?
On Wed, Jun 01, 2005 at 07:35:33PM -0700, Luke Lonergan wrote:
> >> I propose an extended syntax to COPY with a change in semantics to remove
> >> the defa
ehalf of Steve Atkins
Sent: Wed 6/1/2005 10:47 PM
To: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] NOLOGGING option, or ?
On Wed, Jun 01, 2005 at 07:35:33PM -0700, Luke Lonergan wrote:
> >> I propose an extended syntax to COPY with a change in semantics to remove
> >> the de
Steve,
> I've been following this thread, and I'm a little confused. Could you
> possibly clarify what you mean, by providing a couple of lines of
> input as it would be formatted with escape processing turned off -
> containing a text field with an embedded newline and tab and a null field.
Usin
On Wed, Jun 01, 2005 at 10:35:30AM -0700, Alon Goldshuv wrote:
> 2) A modified command syntax for introducing a direct single row error
> handling. By direct I mean - a row that if rejected from within the COPY
> command context does not throw an error and rollsback the whole transaction.
> Instea
On Wed, Jun 01, 2005 at 07:35:33PM -0700, Luke Lonergan wrote:
> >> I propose an extended syntax to COPY with a change in semantics to remove
> >> the default of "WITH ESCAPE '\'".
> >
> > Er, doesn't this break existing database dumps?
>
> Yes, one of the previously stated reasons to create anot
> How do you distinguis an escape-delimiter used to escape a delimiter in
> the data from a literal escape-delimiter in the data being loaded --- it
> seems impossible to do.
The usual approach is fine - a pair of escapes is equivalent to a literal
escape.
> Or are you saying there would be no
Oliver,
On 6/1/05 7:13 PM, "Oliver Jowett" <[EMAIL PROTECTED]> wrote:
> Luke Lonergan wrote:
>
>> I propose an extended syntax to COPY with a change in semantics to remove
>> the default of "WITH ESCAPE '\'".
>
> Er, doesn't this break existing database dumps?
Yes, one of the previously stated
Oliver Jowett wrote:
> Luke Lonergan wrote:
>
> > I propose an extended syntax to COPY with a change in semantics to remove
> > the default of "WITH ESCAPE '\'".
>
> Er, doesn't this break existing database dumps?
Right, we will not change the default.
--
Bruce Momjian
Luke Lonergan wrote:
>
> >
> > Yep, we would _love_ those improvements.
> >
>
> Coming soon, probably from the guy you've never heard of :-)
LOL
> >
> > I am confused why you are confused. :-)
> >
> > Uh, how do you do the escapes if you don't double the escape character
> > on input so yo
Luke Lonergan wrote:
I propose an extended syntax to COPY with a change in semantics to remove
the default of "WITH ESCAPE '\'".
Er, doesn't this break existing database dumps?
-O
---(end of broadcast)---
TIP 3: if posting/reading through Usene
>
> Yep, we would _love_ those improvements.
>
Coming soon, probably from the guy you've never heard of :-)
>
> I am confused why you are confused. :-)
>
> Uh, how do you do the escapes if you don't double the escape character
> on input so you can distinguish a literal escape from one use
Bruce,
> Yep, you have to double backslahses coming in as data so we can use
> backslash for marking null, delimiters, etc. I see no way around that,
> and no one since Berkeley has come up with one either. If you have an
> idea, we would _love_ to hear it.
Apropos to the previous thread on esc
Luke Lonergan wrote:
> Tom,
> >
> > ... and instead, define some new behavior that will soon be considered
> > broken legacy code itself?
>
> I'll not argue further about whether to have a separate LOAD command.
> That's not as important as fixing the performance issues in the data load
> path in
Alon Goldshuv wrote:
> Bruce,
>
> The patch is not there to show that something is "broken" is it there to
> show how things could be done in another way, which may or may not be more
> desireable.
Sure, we are always looking for ways to improve things.
> > COPY works as designed. The idea that
Tom,
>
> ... and instead, define some new behavior that will soon be considered
> broken legacy code itself?
I'll not argue further about whether to have a separate LOAD command.
That's not as important as fixing the performance issues in the data load
path in PostgreSQL to me.
However, I find i
Bruce,
The patch is not there to show that something is "broken" is it there to
show how things could be done in another way, which may or may not be more
desireable.
> COPY works as designed. The idea that some guy we have never heard of
> is going to appear and rewrite COPY's processing and te
Simon Riggs wrote:
On Wed, 2005-06-01 at 10:35 -0700, Alon Goldshuv wrote:
2) A modified command syntax for introducing a direct single row error
handling. By direct I mean - a row that if rejected from within the COPY
5) allow an ERRORLIMIT to allow control of aborting a load after a certain
Simon Riggs wrote:
> I'm not clear from all of those options whether we still need a LOAD
> command, based upon other issues/comments raised on this thread.
>
> However, there are some other arguments for why it might be a good idea
> to have a LOAD DATA command separate from COPY. Certainly long
Tom Lane wrote:
> Bruce Momjian writes:
> > One idea would be to look at the table file size first. If it has zero
> > blocks, lock the table and if it still has zero blocks, do the no-WAL
> > copy.
>
> I think that's a bad idea. It would make the behavior unpredictable
> --- sometimes a COPY w
Luke Lonergan wrote:
> Bruce,
>
> > The problem with a new command is that it becomes unclear when you
> > should use COPY and when LOAD DATA, and it confuses users, and has
> > maintenance overhead. If Bizgres wants a new command name, go for it,
> > but it is unlikely that the community release
On Wed, 2005-06-01 at 18:32 -0400, Tom Lane wrote:
> Bruce Momjian writes:
> > One idea would be to look at the table file size first. If it has zero
> > blocks, lock the table and if it still has zero blocks, do the no-WAL
> > copy.
>
> I think that's a bad idea. It would make the behavior unp
On Wed, 2005-06-01 at 10:35 -0700, Alon Goldshuv wrote:
> I have been working on improving the COPY command performance
> Around 40% for 15 column (mixed types) table.
> Around 90% for 1 column table.
Thats very cool.
> 2) A modified command syntax for introducing a direct single row error
> h
"Luke Lonergan" <[EMAIL PROTECTED]> writes:
> One of the reasons to consider a LOAD DATA command is that we can isolate
> the need for performance improvements and special syntax from the concerns
> of preserving the legacy behavior of COPY for use as the primary mechanism
> for DUMP and RESTORE.
Bruce Momjian writes:
> One idea would be to look at the table file size first. If it has zero
> blocks, lock the table and if it still has zero blocks, do the no-WAL
> copy.
I think that's a bad idea. It would make the behavior unpredictable
--- sometimes a COPY will take an exclusive lock, an
Bruce,
> The problem with a new command is that it becomes unclear when you
> should use COPY and when LOAD DATA, and it confuses users, and has
> maintenance overhead. If Bizgres wants a new command name, go for it,
> but it is unlikely that the community release is going to go in that
> directi
Tom Lane wrote:
> Bruce Momjian writes:
> > Tom Lane wrote:
> >> Not unless you are proposing to change COPY to acquire a lock strong
> >> enough to lock out other writers to the table for the duration ...
>
> > Well, if the table is initally empty, what harm is there in locking the
> > table?
>
Alon Goldshuv wrote:
> 9) allow for Simon's WAL bypass.
>
> I have surely missed some problems that hide behind the idea, but these
> points make me believe that LOAD DATA is a good idea.
The community is unlikely to add a new LOAD DATA command that does
_almost_ everything COPY does. We are muc
Jochem van Dieten wrote:
> On 6/1/05, Bruce Momjian wrote:
> > Jochem van Dieten wrote:
> >>
> >> Why only on an empty table? What is the problem with bypassing WAL on
> >> any table as long as all files of that table are fsync'ed before
> >> commit?
> >
> > Because adding rows to a table might mo
On Wed, 2005-06-01 at 14:14 -0400, Tom Lane wrote:
> So what I'm thinking is we need no special WAL entries for this. What
> we need is just an operating mode of COPY in which it doesn't WAL-log
> its inserts, but instead fsyncs before completion, much like index build
> does. For safety it must
On Wed, 2005-06-01 at 14:24 -0400, Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > On Wed, Jun 01, 2005 at 06:55:46PM +0100, Simon Riggs wrote:
> >> We're holding the table lock and will continue to do so until end of
> >> transaction. No transaction with an earlier id will ever se
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> On Wed, Jun 01, 2005 at 06:55:46PM +0100, Simon Riggs wrote:
>> We're holding the table lock and will continue to do so until end of
>> transaction. No transaction with an earlier id will ever see the data we
>> load because of the lock.
> Suppose you l
Simon Riggs <[EMAIL PROTECTED]> writes:
> If the server crashes, we replay WAL. If we see a load start message, we
> truncate the relation and note that a load has started. If there is WAL
> data for the tuples, we replay it. If WAL replay ends without the load
> transaction having successfully com
On Wed, Jun 01, 2005 at 06:55:46PM +0100, Simon Riggs wrote:
> On Wed, 2005-06-01 at 11:55 -0400, Tom Lane wrote:
> > Simon Riggs <[EMAIL PROTECTED]> writes:
> > > 4. Optionally, we set a flag on the table showing the whole table is
> > > frozen. Anybody writing to this table subsequently will spoi
On Wed, 2005-06-01 at 11:55 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > 4. Optionally, we set a flag on the table showing the whole table is
> > frozen. Anybody writing to this table subsequently will spoil this flag.
> > If the flag is set, all forms of VACUUM will return
I have been working on improving the COPY command performance and as a
result also came up with other thoughts that may possibly be better off
implemented in a new command (i.e LOAD DATA) rather than adding them to the
existing COPY.
The improvements I made were in the COPY parsing logic - changin
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> - it is an option to COPY and CREATE TABLE AS, not GUC, not ALTER TABLE
AFAICS it could just happen automatically for CREATE TABLE AS; there's
no need for an option there, any more than there is for CREATE INDEX.
The only reason it needs to be an expli
On Wed, Jun 01, 2005 at 06:00:28PM +0100, Simon Riggs wrote:
> On Wed, 2005-06-01 at 11:31 -0400, Tom Lane wrote:
> > Perhaps it could work to use an ALTER TABLE command to flip the state.
> > But I'm not really seeing the point compared to treating it as a COPY
> > option. I do not believe that
On Wed, 2005-06-01 at 11:31 -0400, Tom Lane wrote:
> Hannu Krosing <[EMAIL PROTECTED]> writes:
> > I think this should be a decision done when creating a table, just like
> > TEMP tables. So you always know if a certain table is or is not
> > safe/replicated/recoverable.
> > This has also the advan
On 6/1/05, Bruce Momjian wrote:
> Jochem van Dieten wrote:
>>
>> Why only on an empty table? What is the problem with bypassing WAL on
>> any table as long as all files of that table are fsync'ed before
>> commit?
>
> Because adding rows to a table might modify existing pages, and if the
> COPY fa
Simon Riggs <[EMAIL PROTECTED]> writes:
> 4. Optionally, we set a flag on the table showing the whole table is
> frozen. Anybody writing to this table subsequently will spoil this flag.
> If the flag is set, all forms of VACUUM will return success immediately
> without performing a scan (since it i
Bruce Momjian writes:
> Jochem van Dieten wrote:
>> Why only on an empty table? What is the problem with bypassing WAL on
>> any table as long as all files of that table are fsync'ed before
>> commit?
> Because adding rows to a table might modify existing pages, and if the
> COPY fails, you have
Hannu Krosing <[EMAIL PROTECTED]> writes:
> I think this should be a decision done when creating a table, just like
> TEMP tables. So you always know if a certain table is or is not
> safe/replicated/recoverable.
> This has also the advantage of requiring no changes to actual COPY and
> INSERT comm
On Wed, 2005-06-01 at 10:18 -0400, Tom Lane wrote:
> Bruce Momjian writes:
> > Tom Lane wrote:
> >> Not unless you are proposing to change COPY to acquire a lock strong
> >> enough to lock out other writers to the table for the duration ...
>
> > Well, if the table is initally empty, what harm is
On 6/1/05, Hans-Jürgen Schönig <[EMAIL PROTECTED]> wrote:
> Personally I don't think that it is a good idea to do that.
> People will tend to corrupt their systems because they want speed
> (sometimes without thinking about the consequences).
>
> I can only think of one scenario where nologging wo
Bruce Momjian writes:
> Tom Lane wrote:
>> Not unless you are proposing to change COPY to acquire a lock strong
>> enough to lock out other writers to the table for the duration ...
> Well, if the table is initally empty, what harm is there in locking the
> table?
You cannot *know* whether it is
Jochem van Dieten wrote:
> On 01 Jun 2005 04:44:24 -0400, Greg Stark wrote:
> > Greg Stark writes:
> >>
> >> For CREATE TABLE AS in the non-PITR case you don't really need to WAL log
> >> the
> >> records at all. If it fails in the middle you just drop the table. When it
> >> completes you do a c
Tom Lane wrote:
> Greg Stark <[EMAIL PROTECTED]> writes:
> > Sorry to followup to my own message, but it occurs to me that COPY could be
> > made to automatically do this for the case of an empty destination table
> > too.
>
> Not unless you are proposing to change COPY to acquire a lock strong
>
On 01 Jun 2005 04:44:24 -0400, Greg Stark wrote:
> Greg Stark writes:
>>
>> For CREATE TABLE AS in the non-PITR case you don't really need to WAL log the
>> records at all. If it fails in the middle you just drop the table. When it
>> completes you do a checkpoint before acknowledging the COMMIT.
Simon Riggs <[EMAIL PROTECTED]> writes:
> On Wed, 2005-06-01 at 16:34 +0800, Christopher Kings-Lynne wrote:
> There are some other arguments in favour of a LOAD command Alon?
>>
>> We already have LOAD, so you'll have to choose something else :)
> Its annoying, I grant you. :-)
> LOAD 'libra
Greg Stark <[EMAIL PROTECTED]> writes:
> Sorry to followup to my own message, but it occurs to me that COPY could be
> made to automatically do this for the case of an empty destination table too.
Not unless you are proposing to change COPY to acquire a lock strong
enough to lock out other writers
Simon Riggs wrote:
> On Wed, 2005-06-01 at 04:44 -0400, Greg Stark wrote:
> > Greg Stark <[EMAIL PROTECTED]> writes:
> >
> > > For CREATE TABLE AS in the non-PITR case you don't really need to WAL log
> > > the
> > > records at all. If it fails in the middle you just drop the table. When it
> > >
On K, 2005-06-01 at 00:01 +0100, Simon Riggs wrote:
> Recent test results have shown a substantial performance improvement
> (+25%) if WAL logging is disabled for large COPY statements. This is to
> be expected, though has a price attached: losing the ability to crash
> recover data loaded in this
On Wed, 2005-06-01 at 04:44 -0400, Greg Stark wrote:
> Greg Stark <[EMAIL PROTECTED]> writes:
>
> > For CREATE TABLE AS in the non-PITR case you don't really need to WAL log
> > the
> > records at all. If it fails in the middle you just drop the table. When it
> > completes you do a checkpoint be
On Wed, 2005-06-01 at 16:34 +0800, Christopher Kings-Lynne wrote:
> > There are some other arguments in favour of a LOAD command Alon?
>
> We already have LOAD, so you'll have to choose something else :)
Its annoying, I grant you. :-)
LOAD 'library' would still need to be the default.
LOAD
On K, 2005-06-01 at 09:16 +0100, Simon Riggs wrote:
> On Tue, 2005-05-31 at 22:47 -0400, Tom Lane wrote:
> > Simon Riggs <[EMAIL PROTECTED]> writes:
> > > Recent test results have shown a substantial performance improvement
> > > (+25%) if WAL logging is disabled for large COPY statements.
> >
> >
Greg Stark <[EMAIL PROTECTED]> writes:
> For CREATE TABLE AS in the non-PITR case you don't really need to WAL log the
> records at all. If it fails in the middle you just drop the table. When it
> completes you do a checkpoint before acknowledging the COMMIT.
>
> I think this is already done fo
There are some other arguments in favour of a LOAD command Alon?
We already have LOAD, so you'll have to choose something else :)
Chris
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscr
Neil Conway <[EMAIL PROTECTED]> writes:
> On Wed, 2005-06-01 at 00:40 -0400, Alvaro Herrera wrote:
> > This doesn't work for COPY, but maybe for CREATE TABLE AS we could log
> > the fact that the command was executed, so the replayer could execute
> > the same command again.
> >
> > Of course, t
On Tue, 2005-05-31 at 22:47 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > Recent test results have shown a substantial performance improvement
> > (+25%) if WAL logging is disabled for large COPY statements.
>
> How much of that is left after we fix the 64-bit-CRC issue?
We
Personally I don't think that it is a good idea to do that.
People will tend to corrupt their systems because they want speed
(sometimes without thinking about the consequences).
I can only think of one scenario where nologging would actually make
sense: Many people use session tables to keep
On Wed, 2005-06-01 at 00:40 -0400, Alvaro Herrera wrote:
> This doesn't work for COPY, but maybe for CREATE TABLE AS we could log
> the fact that the command was executed, so the replayer could execute
> the same command again.
>
> Of course, this handwaving doesn't explain how the system in recov
On Tue, May 31, 2005 at 10:47:30PM -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > Recent test results have shown a substantial performance improvement
> > (+25%) if WAL logging is disabled for large COPY statements.
> BTW, I'm sure you are the last one who needs to be reminde
Simon Riggs <[EMAIL PROTECTED]> writes:
> Recent test results have shown a substantial performance improvement
> (+25%) if WAL logging is disabled for large COPY statements.
How much of that is left after we fix the 64-bit-CRC issue?
> Now, I would like to discuss adding an enable_logging USERSET
Recent test results have shown a substantial performance improvement
(+25%) if WAL logging is disabled for large COPY statements. This is to
be expected, though has a price attached: losing the ability to crash
recover data loaded in this manner.
There are two parts to this proposal. First, when a
84 matches
Mail list logo