Re: Detecting File Damage & Inconsistencies

2021-07-31 Thread Amit Kapila
On Thu, Jul 22, 2021 at 7:10 PM Simon Riggs
 wrote:
>
> On Wed, 14 Jul 2021 at 05:01, Amit Kapila  wrote:
>
> But I do understand the overall request, so I propose adding
> * pg_waldump output for wal_sessioninfo data, if it exists
> * pg_waldump --user=USERNAME as a filter on username
> to demonstrate the use of this
>

This makes sense but I am still thinking of some more concrete way.
Can we think of providing a way to filter WAL from user/process (like
filter_by_origin) for logical decoding? If so, then we can have an
example to show via test_decoding.

-- 
With Regards,
Amit Kapila.




Re: Detecting File Damage & Inconsistencies

2021-07-22 Thread Simon Riggs
On Wed, 14 Jul 2021 at 05:01, Amit Kapila  wrote:

> The patch is simple but its use doesn't seem to be very clear. You
> have mentioned its use for future PITR patches and Craig mentioned
> some use cases in logical decoding and it appears to me that to
> support the use cases mentioned by Craig, it is important to LOG this
> earlier than at commit time. As there are no details about how it will
> be used for PITR patches and whether such patch ideas are accepted, it
> makes it harder to judge the value of this patch.

> I think if we would have patches (even at WIP/POC stage) for the ideas
> you and Craig have in mind, it would have been much easier to see the
> value of this patch.

Fair question. This is one of a series of 4 independent patches I have
planned to provide enhanced information and/or recovery tools. (The
second one is already in this CF). This is an area I know lots about
and nobody else is working on, so I thought I would contribute. I've
not discussed this off-list with anyone else. So this is PITR as
recovery in a broad sense, not just that specific feature.

For this patch, the idea is to be able to go in either direction:
Data damage <--> User info

So if you know a user was an intruder, you can detect the damage they caused.
Or, if you detect damage, you can work out who caused it, work out if
they were an intruder and if so, detect what else they did.

The most important thing is to have the info available in WAL, nothing
is possible until that is available.
We already added an option to add this same info to log_line_prefix,
yet nobody said it wasn't useful there, or needed other uses to allow
the feature.
The two sources of info are designed to be able to be used in combination.

My experience of recovery scenarios is that you often have to build
custom search tools to make it work. It's hard to say whether you'll
want to track the user, the specific session, or even specific
transactions.

But I do understand the overall request, so I propose adding
* pg_waldump output for wal_sessioninfo data, if it exists
* pg_waldump --user=USERNAME as a filter on username
to demonstrate the use of this

-- 
Simon Riggshttp://www.EnterpriseDB.com/




Re: Detecting File Damage & Inconsistencies

2021-07-13 Thread Amit Kapila
On Tue, Jul 13, 2021 at 8:29 PM Simon Riggs
 wrote:
>
> On Tue, Jul 6, 2021 at 4:21 AM Amit Kapila  wrote:
> >
> > >
> > > > If you don't think the sorts of use cases I presented are worth the 
> > > > trouble that's fair enough. I'm not against adding it on the commit 
> > > > record. It's just that with logical decoding moving toward a streaming 
> > > > model I suspect only having it at commit time may cause us some pain 
> > > > later.
> > >
> > > I think you have some good ideas about how to handle larger
> > > transactions with streaming. As a separate patch it might be worth
> > > keeping track of transaction size and logging something when a
> > > transaction gets too large.
> > >
> >
> > If we want this additional information for streaming mode in logical
> > replication then can't we piggyback it on the very first record
> > written for a transaction when this info is required? Currently, we do
> > something similar for logging top_level_xid for subtransaction in
> > XLogRecordAssemble().
>
> It's possible, but I'm struggling to believe anybody would accept that
> as an approach because it breaks simplicity, modularity and makes it
> harder to search for this info in the WAL.
>
> I was imagining that we'd keep track of amount of WAL written by a
> transaction and when it reaches a certain size generate a "streaming
> info" record as an early warning that we have a big transaction coming
> down the pipe.
>

I am not sure if that satisfies Craig's requirement of making it
available during the streaming of in-progress xacts during logical
replication. It is quite possible that by the time we decide to start
streaming a transaction this information won't be logged yet.

> I'm feeling that a simple patch is expanding well beyond its original
> scope and timeline. How can we do this simply?
>

The patch is simple but its use doesn't seem to be very clear. You
have mentioned its use for future PITR patches and Craig mentioned
some use cases in logical decoding and it appears to me that to
support the use cases mentioned by Craig, it is important to LOG this
earlier than at commit time. As there are no details about how it will
be used for PITR patches and whether such patch ideas are accepted, it
makes it harder to judge the value of this patch.

I think if we would have patches (even at WIP/POC stage) for the ideas
you and Craig have in mind, it would have been much easier to see the
value of this patch.

-- 
With Regards,
Amit Kapila.




Re: Detecting File Damage & Inconsistencies

2021-07-13 Thread Simon Riggs
On Tue, Jul 6, 2021 at 4:21 AM Amit Kapila  wrote:
>
> On Fri, Jul 2, 2021 at 8:29 PM Simon Riggs  
> wrote:
> >
> > On Fri, Jul 2, 2021 at 5:34 AM Craig Ringer
> >  wrote:
> > >
> >
> > > If you don't think the sorts of use cases I presented are worth the 
> > > trouble that's fair enough. I'm not against adding it on the commit 
> > > record. It's just that with logical decoding moving toward a streaming 
> > > model I suspect only having it at commit time may cause us some pain 
> > > later.
> >
> > I think you have some good ideas about how to handle larger
> > transactions with streaming. As a separate patch it might be worth
> > keeping track of transaction size and logging something when a
> > transaction gets too large.
> >
>
> If we want this additional information for streaming mode in logical
> replication then can't we piggyback it on the very first record
> written for a transaction when this info is required? Currently, we do
> something similar for logging top_level_xid for subtransaction in
> XLogRecordAssemble().

It's possible, but I'm struggling to believe anybody would accept that
as an approach because it breaks simplicity, modularity and makes it
harder to search for this info in the WAL.

I was imagining that we'd keep track of amount of WAL written by a
transaction and when it reaches a certain size generate a "streaming
info" record as an early warning that we have a big transaction coming
down the pipe.

I'm feeling that a simple patch is expanding well beyond its original
scope and timeline. How can we do this simply?

-- 
Simon Riggshttp://www.EnterpriseDB.com/




Re: Detecting File Damage & Inconsistencies

2021-07-05 Thread Amit Kapila
On Fri, Jul 2, 2021 at 8:29 PM Simon Riggs  wrote:
>
> On Fri, Jul 2, 2021 at 5:34 AM Craig Ringer
>  wrote:
> >
>
> > If you don't think the sorts of use cases I presented are worth the trouble 
> > that's fair enough. I'm not against adding it on the commit record. It's 
> > just that with logical decoding moving toward a streaming model I suspect 
> > only having it at commit time may cause us some pain later.
>
> I think you have some good ideas about how to handle larger
> transactions with streaming. As a separate patch it might be worth
> keeping track of transaction size and logging something when a
> transaction gets too large.
>

If we want this additional information for streaming mode in logical
replication then can't we piggyback it on the very first record
written for a transaction when this info is required? Currently, we do
something similar for logging top_level_xid for subtransaction in
XLogRecordAssemble().


-- 
With Regards,
Amit Kapila.




Re: Detecting File Damage & Inconsistencies

2021-07-02 Thread Simon Riggs
On Fri, Jul 2, 2021 at 5:34 AM Craig Ringer
 wrote:
>
> On Fri, 2 Jul 2021 at 00:19, Simon Riggs  wrote:
>
>>
>> > So yeah. I think it'd be better to log the info you want at start-of-txn 
>> > unless there's a compelling reason not so, and I don't see one yet.
>>
>> AFAIK, XLOG_XACT_ASSIGNMENT does not occur for normal top-level
>> transactions, only for subxids.
>>
>> I don't really want to add an extra record just for this because it
>> will slow down applications and it won't get turned on as often.
>
>
> OK, that makes sense - I was indeed operating on an incorrect assumption.
>
> I wouldn't want to add a new record either. I thought we could piggyback on 
> XLOG_XACT_ASSIGNMENT with a new chunk that's only added when the feature is 
> required, much like we do for replication origin info on commit records.
>
> Is it worth considering forcing XLOG_XACT_ASSIGNMENT to be logged if this 
> feature is enabled?

My feeling is that the drop in performance would lead to it being
turned off most of the time, reducing the value of the feature.

Does anyone else disagree?

> If you don't think the sorts of use cases I presented are worth the trouble 
> that's fair enough. I'm not against adding it on the commit record. It's just 
> that with logical decoding moving toward a streaming model I suspect only 
> having it at commit time may cause us some pain later.

I think you have some good ideas about how to handle larger
transactions with streaming. As a separate patch it might be worth
keeping track of transaction size and logging something when a
transaction gets too large.

-- 
Simon Riggshttp://www.EnterpriseDB.com/




Re: Detecting File Damage & Inconsistencies

2021-07-01 Thread Craig Ringer
On Fri, 2 Jul 2021 at 00:19, Simon Riggs 
wrote:


> > So yeah. I think it'd be better to log the info you want at start-of-txn
> unless there's a compelling reason not so, and I don't see one yet.
>
> AFAIK, XLOG_XACT_ASSIGNMENT does not occur for normal top-level
> transactions, only for subxids.
>
> I don't really want to add an extra record just for this because it
> will slow down applications and it won't get turned on as often.
>

OK, that makes sense - I was indeed operating on an incorrect assumption.

I wouldn't want to add a new record either. I thought we could piggyback on
XLOG_XACT_ASSIGNMENT with a new chunk that's only added when the feature is
required, much like we do for replication origin info on commit records.

Is it worth considering forcing XLOG_XACT_ASSIGNMENT to be logged if this
feature is enabled?

If you don't think the sorts of use cases I presented are worth the trouble
that's fair enough. I'm not against adding it on the commit record. It's
just that with logical decoding moving toward a streaming model I suspect
only having it at commit time may cause us some pain later.


Re: Detecting File Damage & Inconsistencies

2021-07-01 Thread Simon Riggs
On Tue, Jun 22, 2021 at 6:32 AM Craig Ringer
 wrote:

> IIRC the restart_lsn horizon already ensures that we can't miss the 
> xl_xact_assignment at the start of a txn. We would ensure that the desired 
> info is available throughout decoding of the txn, including at commit record 
> processing time, by adding it to the toplevel ReorderBufferTxn.
>
> The only advantage I can see to annotating the commit record instead is that 
> we don't have to spend a few bytes per reorder-buffered topxid to track this 
> info between start of decoding for the tx and processing of the commit 
> record. I don't think that's worth caring about.The advantages that having it 
> earlier would give us are much more significant.
>
> A few examples:
>
> * Skip reorder buffering of non-target transactions early, so we can decode 
> the WAL stream to find the target transactions much faster using less memory 
> and I/O;
>
> * Read the database change stream and use the session info to stream info 
> into an intrusion detection system and/or audit engine in real time, using 
> txn streaming to avoid the need to create huge reorder buffers;
>
> * Re-decode the WAL stream to identify a target txn you know was aborted, and 
> commit it instead, so you can recover data from aborted txns from the WAL 
> stream using logical decoding. (Only possible if the catalog_xmin hasn't 
> advanced past that point already though)
>
> So yeah. I think it'd be better to log the info you want at start-of-txn 
> unless there's a compelling reason not so, and I don't see one yet.

AFAIK, XLOG_XACT_ASSIGNMENT does not occur for normal top-level
transactions, only for subxids.

I don't really want to add an extra record just for this because it
will slow down applications and it won't get turned on as often.

Thoughts?

-- 
Simon Riggshttp://www.EnterpriseDB.com/




Re: Detecting File Damage & Inconsistencies

2021-06-21 Thread Craig Ringer
On Tue, 22 Jun 2021 at 00:24, Simon Riggs 
wrote:

> On Thu, Mar 18, 2021 at 6:20 AM Craig Ringer
>  wrote:
> >
> > On Mon, 15 Mar 2021 at 21:01, David Steele  wrote:
> >>
> >> On 11/18/20 5:23 AM, Simon Riggs wrote:
> >> > On Wed, 18 Nov 2020 at 06:42, Craig Ringer
> >> >  wrote:
> >> >>
> >> >> On Fri, Nov 13, 2020 at 7:24 PM Simon Riggs 
> wrote:
> >> >>>
> >> >>>
> >> >>> What I'm proposing is an option to add 16 bytes onto each COMMIT
> >> >>> record
> >> >>
> >> >>
> >> >> Would it make sense to write this at the time we write a topxid
> assignment to WAL instead?
> >> >>
> >> >> Otherwise it won't be accessible to streaming-mode logical decoding.
> >> >
> >> > Do you mean extend the xl_xact_assignment record? My understanding is
> >> > that is not sent in all cases, so not sure what you mean by "instead".
> >>
> >> Craig, can you clarify?
> >
> >
> > Right. Or write a separate WAL record when the feature is enabled. But
> it's probably sufficient to write it as an optional chunk on
> xl_xact_assignment records. We often defer writing them so we can optimise
> away xacts that never actually wrote anything, but IIRC we still write one
> before we write any WAL that references the xid. That'd be fine, since we
> don't need the info any sooner than that during decoding. I'd have to
> double check that we write it in all cases and won't get to that too soon,
> but I'm pretty sure we do...
>
> The commit record is optimized away if no xid is assigned, though is
> still present if we didn't write any WAL records.
>
> But if a commit record exists in the WAL stream, we want to know where
> it came from.
>
> A later patch will add PITR capability based on this information so
> attaching it directly to the commit record is fairly important, IMHO.
>

Why?

All the proposed info:

* 8-byte session start time (from MyStartTime)
* 2-byte pid (from MyProcPid)
* 4-byte user oid

are available at topxid assignment time. If we defer writing them until
commit, we lose the ability to use this information during streaming
logical decoding. That's something I believe you've wanted for other
functionality in the past, such as logical decoding based audit
functionality.

IIRC the restart_lsn horizon already ensures that we can't miss the
xl_xact_assignment at the start of a txn. We would ensure that the desired
info is available throughout decoding of the txn, including at commit
record processing time, by adding it to the toplevel ReorderBufferTxn.

The only advantage I can see to annotating the commit record instead is
that we don't have to spend a few bytes per reorder-buffered topxid to
track this info between start of decoding for the tx and processing of the
commit record. I don't think that's worth caring about.The advantages that
having it earlier would give us are much more significant.

A few examples:

* Skip reorder buffering of non-target transactions early, so we can decode
the WAL stream to find the target transactions much faster using less
memory and I/O;

* Read the database change stream and use the session info to stream info
into an intrusion detection system and/or audit engine in real time, using
txn streaming to avoid the need to create huge reorder buffers;

* Re-decode the WAL stream to identify a target txn you know was aborted,
and commit it instead, so you can recover data from aborted txns from the
WAL stream using logical decoding. (Only possible if the catalog_xmin
hasn't advanced past that point already though)

So yeah. I think it'd be better to log the info you want at start-of-txn
unless there's a compelling reason not so, and I don't see one yet.


Re: Detecting File Damage & Inconsistencies

2021-06-21 Thread Simon Riggs
On Thu, Mar 18, 2021 at 6:20 AM Craig Ringer
 wrote:
>
> On Mon, 15 Mar 2021 at 21:01, David Steele  wrote:
>>
>> On 11/18/20 5:23 AM, Simon Riggs wrote:
>> > On Wed, 18 Nov 2020 at 06:42, Craig Ringer
>> >  wrote:
>> >>
>> >> On Fri, Nov 13, 2020 at 7:24 PM Simon Riggs  wrote:
>> >>>
>> >>>
>> >>> What I'm proposing is an option to add 16 bytes onto each COMMIT
>> >>> record
>> >>
>> >>
>> >> Would it make sense to write this at the time we write a topxid 
>> >> assignment to WAL instead?
>> >>
>> >> Otherwise it won't be accessible to streaming-mode logical decoding.
>> >
>> > Do you mean extend the xl_xact_assignment record? My understanding is
>> > that is not sent in all cases, so not sure what you mean by "instead".
>>
>> Craig, can you clarify?
>
>
> Right. Or write a separate WAL record when the feature is enabled. But it's 
> probably sufficient to write it as an optional chunk on xl_xact_assignment 
> records. We often defer writing them so we can optimise away xacts that never 
> actually wrote anything, but IIRC we still write one before we write any WAL 
> that references the xid. That'd be fine, since we don't need the info any 
> sooner than that during decoding. I'd have to double check that we write it 
> in all cases and won't get to that too soon, but I'm pretty sure we do...

The commit record is optimized away if no xid is assigned, though is
still present if we didn't write any WAL records.

But if a commit record exists in the WAL stream, we want to know where
it came from.

A later patch will add PITR capability based on this information so
attaching it directly to the commit record is fairly important, IMHO.

-- 
Simon Riggshttp://www.EnterpriseDB.com/




Re: Detecting File Damage & Inconsistencies

2021-03-18 Thread Craig Ringer
On Mon, 15 Mar 2021 at 21:01, David Steele  wrote:

> On 11/18/20 5:23 AM, Simon Riggs wrote:
> > On Wed, 18 Nov 2020 at 06:42, Craig Ringer
> >  wrote:
> >>
> >> On Fri, Nov 13, 2020 at 7:24 PM Simon Riggs 
> wrote:
> >>>
> >>>
> >>> What I'm proposing is an option to add 16 bytes onto each COMMIT
> >>> record
> >>
> >>
> >> Would it make sense to write this at the time we write a topxid
> assignment to WAL instead?
> >>
> >> Otherwise it won't be accessible to streaming-mode logical decoding.
> >
> > Do you mean extend the xl_xact_assignment record? My understanding is
> > that is not sent in all cases, so not sure what you mean by "instead".
>
> Craig, can you clarify?
>

Right. Or write a separate WAL record when the feature is enabled. But it's
probably sufficient to write it as an optional chunk on xl_xact_assignment
records. We often defer writing them so we can optimise away xacts that
never actually wrote anything, but IIRC we still write one before we write
any WAL that references the xid. That'd be fine, since we don't need the
info any sooner than that during decoding. I'd have to double check that we
write it in all cases and won't get to that too soon, but I'm pretty sure
we do...


Re: Detecting File Damage & Inconsistencies

2021-03-15 Thread David Steele

On 11/18/20 5:23 AM, Simon Riggs wrote:

On Wed, 18 Nov 2020 at 06:42, Craig Ringer
 wrote:


On Fri, Nov 13, 2020 at 7:24 PM Simon Riggs  wrote:



What I'm proposing is an option to add 16 bytes onto each COMMIT
record



Would it make sense to write this at the time we write a topxid assignment to 
WAL instead?

Otherwise it won't be accessible to streaming-mode logical decoding.


Do you mean extend the xl_xact_assignment record? My understanding is
that is not sent in all cases, so not sure what you mean by "instead".


Craig, can you clarify?

Cleysson, you are signed up as a reviewer. Do you know when you'll have 
a change to have a look?


Regards,
--
-David
da...@pgmasters.net




Re: Detecting File Damage & Inconsistencies

2020-12-28 Thread Masahiko Sawada
Hi Simon,

On Wed, Nov 18, 2020 at 2:14 AM Simon Riggs  wrote:
>
> On Fri, 13 Nov 2020 at 11:24, Simon Riggs  wrote:
> >
> > On Fri, 13 Nov 2020 at 00:50, tsunakawa.ta...@fujitsu.com
> >  wrote:
> > >
> > > From: Simon Riggs 
> > > > If a rogue user/process is suspected, this would allow you to identify
> > > > more easily the changes made by specific sessions/users.
> > >
> > > Isn't that kind of auditing a job of pgAudit or log_statement = mod?  Or, 
> > > does "more easily" mean that you find pgAudit complex to use and/or 
> > > log_statement's overhead is big?
> >
> > Well, I designed pgaudit, so yes, I think pgaudit is useful.
> >
> > However, pgaudit works at the statement level, not the data level. So
> > using pgaudit to locate data rows that have changed is fairly hard.
> >
> > What I'm proposing is an option to add 16 bytes onto each COMMIT
> > record, which is considerably less than turning on full auditing in
> > pgaudit. This option would allow identifying data at the row level, so
> > you could for example find all rows changed by specific sessions.
> > Also, because it is stored in WAL it will show updates that might no
> > longer exist in the database because the changed row versions might
> > have been vacuumed away. So pgaudit will tell you that happened, but
> > having extra info in WAL is important also.
> >
> > So thank you for the question because it has allowed me to explain why
> > it is useful and important.
>
> Patch attached to implement "wal_sessioninfo" option.

You sent in your patch, wal_sessioninfo.v2.patch to pgsql-hackers on
Nov 18, but you did not post it to the next CommitFest[1].  If this
was intentional, then you need to take no action.  However, if you
want your patch to be reviewed as part of the upcoming CommitFest,
then you need to add it yourself before 2021-01-01 AoE[2]. Thanks for
your contributions.

Regards,

[1] https://commitfest.postgresql.org/31/
[2] https://en.wikipedia.org/wiki/Anywhere_on_Earth

-- 
Masahiko Sawada
EnterpriseDB:  https://www.enterprisedb.com/




Re: Detecting File Damage & Inconsistencies

2020-11-18 Thread Simon Riggs
On Wed, 18 Nov 2020 at 06:42, Craig Ringer
 wrote:
>
> On Fri, Nov 13, 2020 at 7:24 PM Simon Riggs  wrote:
>>
>>
>> What I'm proposing is an option to add 16 bytes onto each COMMIT
>> record
>
>
> Would it make sense to write this at the time we write a topxid assignment to 
> WAL instead?
>
> Otherwise it won't be accessible to streaming-mode logical decoding.

Do you mean extend the xl_xact_assignment record? My understanding is
that is not sent in all cases, so not sure what you mean by "instead".

--
Simon Riggshttp://www.EnterpriseDB.com/




Re: Detecting File Damage & Inconsistencies

2020-11-17 Thread Craig Ringer
On Fri, Nov 13, 2020 at 7:24 PM Simon Riggs  wrote:

>
> What I'm proposing is an option to add 16 bytes onto each COMMIT
> record
>

Would it make sense to write this at the time we write a topxid assignment
to WAL instead?

Otherwise it won't be accessible to streaming-mode logical decoding.


Re: Detecting File Damage & Inconsistencies

2020-11-17 Thread Simon Riggs
On Fri, 13 Nov 2020 at 11:24, Simon Riggs  wrote:
>
> On Fri, 13 Nov 2020 at 00:50, tsunakawa.ta...@fujitsu.com
>  wrote:
> >
> > From: Simon Riggs 
> > > If a rogue user/process is suspected, this would allow you to identify
> > > more easily the changes made by specific sessions/users.
> >
> > Isn't that kind of auditing a job of pgAudit or log_statement = mod?  Or, 
> > does "more easily" mean that you find pgAudit complex to use and/or 
> > log_statement's overhead is big?
>
> Well, I designed pgaudit, so yes, I think pgaudit is useful.
>
> However, pgaudit works at the statement level, not the data level. So
> using pgaudit to locate data rows that have changed is fairly hard.
>
> What I'm proposing is an option to add 16 bytes onto each COMMIT
> record, which is considerably less than turning on full auditing in
> pgaudit. This option would allow identifying data at the row level, so
> you could for example find all rows changed by specific sessions.
> Also, because it is stored in WAL it will show updates that might no
> longer exist in the database because the changed row versions might
> have been vacuumed away. So pgaudit will tell you that happened, but
> having extra info in WAL is important also.
>
> So thank you for the question because it has allowed me to explain why
> it is useful and important.

Patch attached to implement "wal_sessioninfo" option.

-- 
Simon Riggshttp://www.EnterpriseDB.com/


wal_sessioninfo.v2.patch
Description: Binary data


Re: Detecting File Damage & Inconsistencies

2020-11-13 Thread Simon Riggs
On Fri, 13 Nov 2020 at 00:50, tsunakawa.ta...@fujitsu.com
 wrote:
>
> From: Simon Riggs 
> > If a rogue user/process is suspected, this would allow you to identify
> > more easily the changes made by specific sessions/users.
>
> Isn't that kind of auditing a job of pgAudit or log_statement = mod?  Or, 
> does "more easily" mean that you find pgAudit complex to use and/or 
> log_statement's overhead is big?

Well, I designed pgaudit, so yes, I think pgaudit is useful.

However, pgaudit works at the statement level, not the data level. So
using pgaudit to locate data rows that have changed is fairly hard.

What I'm proposing is an option to add 16 bytes onto each COMMIT
record, which is considerably less than turning on full auditing in
pgaudit. This option would allow identifying data at the row level, so
you could for example find all rows changed by specific sessions.
Also, because it is stored in WAL it will show updates that might no
longer exist in the database because the changed row versions might
have been vacuumed away. So pgaudit will tell you that happened, but
having extra info in WAL is important also.

So thank you for the question because it has allowed me to explain why
it is useful and important.

-- 
Simon Riggshttp://www.EnterpriseDB.com/




RE: Detecting File Damage & Inconsistencies

2020-11-12 Thread tsunakawa.ta...@fujitsu.com
From: Simon Riggs 
> If a rogue user/process is suspected, this would allow you to identify
> more easily the changes made by specific sessions/users.

Isn't that kind of auditing a job of pgAudit or log_statement = mod?  Or, does 
"more easily" mean that you find pgAudit complex to use and/or log_statement's 
overhead is big?


Regards
Takayuki Tsunakawa



Re: Detecting File Damage & Inconsistencies

2020-11-11 Thread Simon Riggs
On Thu, 12 Nov 2020 at 06:42, tsunakawa.ta...@fujitsu.com
 wrote:
>
> From: Simon Riggs 
> > I would like to propose a few points that will help us detect file
> > damage, inconsistencies in files and track actions of users.
>
> Hello, Simon san.  Long time no see.  I'm happy to see you be back here 
> recently.

Thank you, happy to be back. It's good to have the time to contribute again.

> What kind of improvement do you expect?  What problems would this make 
> detectable?

If a rogue user/process is suspected, this would allow you to identify
more easily the changes made by specific sessions/users.

> > * 2-byte pid (from MyProcPid)
>
> pid is 4 bytes on Windows.  Isn't it also 4 byte on Linux when some kernel 
> parameter is set to a certain value?

4 bytes is no problem, thanks for pointing that out.

-- 
Simon Riggshttp://www.EnterpriseDB.com/




RE: Detecting File Damage & Inconsistencies

2020-11-11 Thread tsunakawa.ta...@fujitsu.com
From: Simon Riggs 
> I would like to propose a few points that will help us detect file
> damage, inconsistencies in files and track actions of users.

Hello, Simon san.  Long time no see.  I'm happy to see you be back here 
recently.

What kind of improvement do you expect?  What problems would this make 
detectable?


> * 2-byte pid (from MyProcPid)

pid is 4 bytes on Windows.  Isn't it also 4 byte on Linux when some kernel 
parameter is set to a certain value?


Regards
Takayuki Tsunakawa



Re: Detecting File Damage & Inconsistencies

2020-11-11 Thread Jose Luis Tallon

On 11/11/20 21:56, Simon Riggs wrote:

[ŝnip]

REINDEX VERIFY
After the new index is created, but before we drop the old index:
Check whether the two indexes match:
* checks whether the previous index had pointers to row versions that
don't exist
* checks whether the heap has rows that were not in the old index
This approach piggybacks on existing operations. AccessShareLock is
held on both indexes before the old one is dropped.


FWIW, as long as it's optional (due to the added runtime), it'd be a 
welcome feature.


Maybe something along the lines of:

    REINDEX (verify yes) 



Other ideas are welcome.


I still have nightmares from an specific customer case w/ shared storage 
(using VxFS) among two postmaster instances ---supposedly could never be 
active concurrently, not completely sure that it didn't actually 
happen--- and the corruption that we found there. I seem to remember 
that they even had scripts to remove the locking when switching over and 
back :S


I don't think Postgres can do much about this, but maybe someone can 
come up with a suitable countermeasure.



Just my .02€

Thanks,

    / J.L.