On 1/3/25 17:28, Mike Pattrick wrote:
> On Thu, Jan 2, 2025 at 3:42 PM Ilya Maximets <[email protected]> wrote:
>>
>> On 8/13/24 06:03, Mike Pattrick wrote:
>>> Previously ovs-lib would assume a database is valid if the file exists,
>>> however, it is possible for the database file to exist but be unopenable
>>> by ovsdb.
>>>
>>> Now existence checks are augmented with schema checksum validation.
>>> Databases with an invalid schema are removed.
>>>
>>> Reported-at: https://issues.redhat.com/browse/FDP-689
>>> Reported-by: Ihar Hrachyshka
>>> Signed-off-by: Mike Pattrick <[email protected]>
>>> ---
>>> v2:
>>> - Back up database before deleting it
>>> - Use the db-name command to check for validity of file
>>> - Added test to verify that valid clustered databases are detected as
>>> such
>>> ---
>>> tests/ovsdb-server.at | 59 +++++++++++++++++++++++++++++++++++++++++++
>>> utilities/ovs-lib.in | 35 ++++++++++++++++++++-----
>>> 2 files changed, 88 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/tests/ovsdb-server.at b/tests/ovsdb-server.at
>>> index ce6d32aee..18ede1cf9 100644
>>> --- a/tests/ovsdb-server.at
>>> +++ b/tests/ovsdb-server.at
>>> @@ -105,6 +105,65 @@ AT_CHECK([uuidfilt output], [0],
>>> ]], [])
>>> AT_CLEANUP
>>>
>>> +AT_SETUP([truncated database recreated])
>>
>> Maybe "database without valid schema is recreated" ?
>>
>>> +AT_KEYWORDS([ovsdb unix])
>>> +AT_SKIP_IF([test "$IS_WIN32" = "yes"])
>>> +ordinal_schema > schema
>>> +
>>> +dnl Loading ovs-lib resets our PATH, save a copy and then restore.
>>> +_PATH=$PATH
>>> +. ovs-lib
>>> +PATH=$_PATH
>>
>> This seems a little dangerous, though I'm not sure what's a better solution.
>> Do you know where exactly the PATH is getting reset?
>
> In ovs-lib just under the comment "Use the system's own
> implementations if it has any".
>
> Another option would be to rip that code out of ovs-lib. It's not
> clear to me how it would be better to reset the shell state to that
> specified in a file as opposed to however the caller has set up the
> shell. That code is over 10 years old and may have outlived its
> purpose. But I'm not familiar with why it was added in the first
> place.
>
>>
>>> +
>>> +dnl Check that DB is recreated when schema corrupted.
>>> +echo 'x' > db
>>> +AT_CHECK([upgrade_db db schema], [0], [stdout], [ignore])
>>> +AT_CHECK([grep -c "db does not exist\|Creating empty database db" stdout],
>>> [0], [2
>>> +])
>>> +AT_CHECK([validate_db db], [0])
>>> +
>>> +AT_DATA([txnfile], [[ovsdb-client transact unix:socket \
>>> +'["ordinals",
>>> + {"op": "insert",
>>> + "table": "ordinals",
>>> + "row": {"number": 1, "name": "one"}}]'
>>> +]])
>>> +AT_CHECK([ovsdb-server --remote=punix:socket db --run="sh txnfile"], [0],
>>> [stdout], [stderr])
>>> +
>>> +dnl Check that DB is not recreated on corrupted log. This is similar to
>>> the previous test but
>>> +dnl includes a mid-operation upgrade.
>>
>> This is already a multi-line comment. Might as well limit the line length.
>> And double spaces between sentences.
>>
>>> +echo 'xxx' >> db
>>> +AT_CHECK([upgrade_db db schema], [0], [], [ignore])
>>> +
>>> +dnl Validate that the db can now be used.
>>> +AT_DATA([txnfile], [[ovsdb-client transact unix:socket \
>>> +'["ordinals",
>>> + {"op": "select",
>>> + "table": "ordinals",
>>> + "where": []}]'
>>> +]])
>>> +AT_CHECK([ovsdb-server --remote=punix:socket db --run="sh txnfile"], [0],
>>> [stdout], [stderr])
>>> +AT_CHECK([grep 'syntax error: db: parse error.* in header line "xxx"'
>>> stderr], [0], [ignore])
>>
>> With 'grep -q' the ', [0], [ignore]' will not be needed.
>>
>>> +cat stdout >> output
>>
>> Should this be > instead of >> ?
>>
>>> +AT_CHECK([uuidfilt output], [0],
>>> +
>>> [[[{"rows":[{"_uuid":["uuid","<0>"],"_version":["uuid","<1>"],"name":"one","number":1}]}]
>>> +]], [])
>>> +
>>> +dnl Validate then create and join cluster.
>>> +echo 'x' > db
>>> +AT_CHECK([create_cluster db schema tcp:1.1.1.1:1111 1000], [0], [stdout],
>>> [])
>>> +AT_CHECK([grep -Ec 'Backing up database|Creating cluster database db'
>>> stdout], [0], [2
>>> +])
>>> +AT_CHECK([validate_db db], [0])
>>
>> '[0]' is not needed.
>>
>>> +
>>> +dnl Join a cluster with a corrupted db.
>>> +echo 'x' > db
>>> +AT_CHECK([join_cluster db schema tcp:1.1.1.1:1111 tcp:2.2.2.2:2222], [0],
>>> [stdout], [])
>>> +AT_CHECK([grep -Ec 'Backing up database|Joining db to cluster' stdout],
>>> [0], [2
>>> +])
>>> +AT_CHECK([validate_db db], [0])
>>
>> Ditto.
>>
>>> +AT_CLEANUP
>>> +
>>> AT_SETUP([truncating database log with bad transaction])
>>> AT_KEYWORDS([ovsdb server positive unix])
>>> AT_SKIP_IF([test "$IS_WIN32" = "yes"])
>>> diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
>>> index 7812a94ee..6128f0cae 100644
>>> --- a/utilities/ovs-lib.in
>>> +++ b/utilities/ovs-lib.in
>>> @@ -441,18 +441,41 @@ create_db () {
>>>
>>> backup_db () {
>>> # Back up the old version.
>>> - version=`ovsdb_tool db-version "$DB_FILE"`
>>> - cksum=`ovsdb_tool db-cksum "$DB_FILE" | awk '{print $1}'`
>>> - backup=$DB_FILE.backup$version-$cksum
>>> + if test ! -e "$DB_FILE"; then
>>> + return 0
>>> + elif ovsdb_tool db-is-standalone "$DB_FILE" 2>/dev/null; then
>>> + version=`ovsdb_tool db-version "$DB_FILE"`
>>> + cksum=`ovsdb_tool db-cksum "$DB_FILE" | awk '{print $1}'`
>>> + backup=$DB_FILE.backup$version-$cksum
>>> + else
>>> + # Support for clsutered databases
I missed this one on a first read:
Period at the end of the sentence.
>>> + backup=`mktemp -q $DB_FILE.XXXXXXXXXX`
>>> + fi
>>> action "Backing up database to $backup" cp "$DB_FILE" "$backup" ||
>>> return 1
>>> }
>>>
>>> +validate_db () {
>>> + # Returns 0 if $DB_FILE is present and at least has a db name.
>>> + # Returns 1 if $DB_FILE is not present.
>>> + DB_FILE="$1"
>>> +
>>> + if test ! -e "$DB_FILE"; then
>>> + return 1
>>> + elif ! ovsdb_tool db-name "$DB_FILE" >/dev/null 2>&1; then
>>> + backup_db "$DB_FILE"
>>> + rm -f "$DB_FILE"
>>> + return 1
>>> + fi
>>> +
>>> + return 0
>>> +}
>>> +
>>> upgrade_db () {
>>> DB_FILE="$1"
>>> DB_SCHEMA="$2"
>>>
>>> schemaver=`ovsdb_tool schema-version "$DB_SCHEMA"`
>>> - if test ! -e "$DB_FILE"; then
>>> + if ! validate_db "$DB_FILE"; then
>>
>> IMO, such a check complicates the understanding, especially because
>> validation has side effects.
>>
>> Maybe it'll be be better to just call 'validate_db' in the beginning
>> of the function and then keep all the other checks as-is?
>>
>> The same for the cluster functions below.
>>
>> In that case it might be good to print something like "XX is not a
>> valid database file" from the validate_db as well before backing up.
>>
>> WDYT?
>
> I can implement these changes. Do you have a preference regarding the
> PATH issue?
Yeah, I looked through different implementations and it's all a mess.
I do not see any of the functions files on RHEL9 (maybe I'm missing the
package they are in, but I couldn't find one).
Fedora implements action(), but logging functions are named differently:
echo_success / echo_failure(). And those are called from the action().
It means we'll use different logging functions when calling action() and
when calling log_failure_msg(), for example.
On Ubuntu, we have logging functions, but do not have action().
The file in Fedora also overwrites PATH, umask and some terminal settings.
Ubuntu one doesn't do that, AFAICT.
It seems we're not really using anything beside pretty logging from these
lsb functions. So, maybe it's better if we just remove the sourcing and
use our own implementations? Should be easier to use and more predictable.
Maybe slightly less pretty, but I'm not sure if we care.
Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev