Fixed some typos in attached v5-0002 patch. Please consider this patch for review.
On Fri, Dec 20, 2019 at 6:54 PM Suraj Kharage < suraj.khar...@enterprisedb.com> wrote: > Thank you for review comments. > > On Thu, Dec 19, 2019 at 2:54 AM Robert Haas <robertmh...@gmail.com> wrote: > >> On Tue, Dec 17, 2019 at 12:54 AM Suraj Kharage >> <suraj.khar...@enterprisedb.com> wrote: >> > I have implemented the simplehash in backup validator patch as Robert >> suggested. Please find attached 0002 patch for the same. >> > >> > kindly review and let me know your thoughts. >> >> +#define CHECKSUM_LENGTH 256 >> >> This seems wrong. Not all checksums are the same length, and none of >> the ones we're using are 256 bytes in length, and if we've got to have >> a constant someplace for the maximum checksum length, it should >> probably be in the new header file, not here. But I don't think we >> should need this in the first place; see comments below about how to >> revise the parsing of the manifest file. >> > > I agree. Removed. > > + char filetype[10]; >> >> A mysterious 10-byte field with no comments explaining what it >> means... and the same magic number 10 appears in at least one other >> place in the patch. >> > > with current logic, we don't need this anymore. > I have removed the filetype from the structure as we are not doing any > comparison anywhere. > > >> >> +typedef struct manifesthash_hash *hashtab; >> >> This declares a new *type* called hashtab, not a variable called >> hashtab. The new type is not used anywhere, but later, you have >> several variables of the same type that have this name. Just remove >> this: it's wrong and unused. >> >> > corrected. > > >> +static enum ChecksumAlgorithm checksum_type = MC_NONE; >> >> Remove "enum". Not needed, because you have a typedef for it in the >> header, and not per style. >> >> corrected. > > >> +static manifesthash_hash *create_manifest_hash(char >> manifest_path[MAXPGPATH]); >> >> Whitespace is wrong. The whole patch needs a visit from pgindent with >> a properly-updated typedefs.list. >> >> Also, you will struggle to find anywhere else in the code base where >> pass a character array as a function argument. I don't know why this >> isn't just char *. >> > > Corrected. > > >> >> + if(verify_backup) >> >> Whitespace wrong here, too. >> >> > Fixed > > >> >> It's also pretty unhelpful, here and elsewhere, to refer to "the hash >> table" as if there were only one, and as if the reader were supposed >> to know something about it when you haven't told them anything about >> it. >> >> + if (!entry->matched) >> + { >> + pg_log_info("missing file: %s", entry->filename); >> + } >> + >> >> The braces here are not project style. We usually omit braces when >> only a single line of code is present. >> > > fixed > > >> >> I think some work needs to be done to standardize and improve the >> messages that get produced here. You have: >> >> 1. missing file: %s >> 2. duplicate file present: %s >> 3. size changed for file: %s, original size: %d, current size: %zu >> 4. checksum difference for file: %s >> 5. extra file found: %s >> >> I suggest: >> >> 1. file \"%s\" is present in manifest but missing from the backup >> 2. file \"%s\" has multiple manifest entries >> (this one should probably be pg_log_error(), not pg_log_info(), as it >> represents a corrupt-manifest problem) >> 3. file \"%s" has size %lu in manifest but size %lu in backup >> 4. file \"%s" has checksum %s in manifest but checksum %s in backup >> 5. file \"%s" is present in backup but not in manifest >> > > Corrected. > > >> >> Your patch actually doesn't compile on my system, because for the >> third message above, it uses %zu to print the size. But %zu is for >> size_t, not off_t. I went looking for other places in the code where >> we print off_t; based on that, I think the right thing to do is to >> print it using %lu and write (unsigned long) st.st_size. >> > > Corrected. > > + char file_checksum[256]; >> + char header[1024]; >> >> More arbitrary constants. > > > >> >> + if (!file) >> + { >> + pg_log_error("could not open backup_manifest"); >> >> That's bad error reporting. See e.g. readfile() in initdb.c. >> > > Corrected. > > >> >> + if (fscanf(file, "%1023[^\n]\n", header) != 1) >> + { >> + pg_log_error("error while reading the header from >> backup_manifest"); >> >> That's also bad error reporting. It is only a slight step up from >> "ERROR: error". >> >> And we have another magic number (1023). >> > > With current logic, we don't need this anymore. > > >> >> + appendPQExpBufferStr(manifest, header); >> + appendPQExpBufferStr(manifest, "\n"); >> ... >> + appendPQExpBuffer(manifest, "File\t%s\t%d\t%s\t%s\n", filename, >> + filesize, mtime, checksum_with_type); >> >> This whole thing seems completely crazy to me. Basically, you're >> trying to use fscanf() to parse the file. But then, because fscanf() >> doesn't give you the original bytes back, you're trying to reassemble >> the data that you parsed to recover the original line, so that you can >> stuff it in the buffer and eventually checksum it. However, that's >> highly error-prone. You're basically duplicating server code, and thus >> risking getting out of sync in the server code, to work around a >> problem that is entirely self-inflicted, namely, deciding to use >> fscanf(). >> >> What I would recommend is: >> >> 1. Use open(), read(), close() rather than the fopen() family of >> functions. As we have discovered elsewhere, fread() doesn't promise to >> set errno, so we can't necessarily get reliable error-reporting out of >> it. >> >> 2. Before you start reading the file, create a buffer that's large >> enough to hold the whole thing, by using fstat() to figure out how big >> the file is. Read the whole file into that buffer. If you're not able >> to read the whole file -- i.e. open() or read() or close() fail -- >> then just error out and exit. >> >> 3. Now advance through the file line by line. Write a function that >> knows how to search forward for the next \r or \n but with checks to >> make sure it can't run off the end of the buffer, and use that to >> locate the end of each line so that you can walk forward. As you walk >> forward line by line, add the line you just processed to the checksum. >> That way, you only need a single pass over the data. Also, you can >> modify it in place. More on that below. >> >> 4. As you examine each line, start by examining the first word. You'll >> need a function that finds the first word by searching forward for a >> tab character, but not beyond the end of the line. The first word of >> the first line should be PostgreSQL-Backup-Manifest-Version and the >> second word should be 1. Then on each subsequent line check whether >> the first word is File or Manifest-Checksum or something else, >> erroring out in the last case. If it's Manifest-Checksum, verify that >> this is the last line of the file and that the checksum matches. If >> it's File, break the line into fields so you can add it to the hash >> table. You'll want a pointer to the filename and a pointer to the >> checksum, and you'll want to parse the size as an integer. Instead of >> allocating new memory for those fields, just overwrite the character >> that follows the field with a \0. There must be one - either \t or \n >> - so you shouldn't run off the end of the buffer. >> >> If you do this, a bunch of the fixed-size buffers you have right now >> go away. You don't need the variable filetype[10] any more, or >> checksum_with_type[CHECKSUM_LENGTH], or checksum[CHECKSUM_LENGTH], or >> the character arrays inside DataDirectoryFileInfo. Instead you can >> just have pointers into the buffer that contains the file. And you >> don't need this code to back up using fseek() and reread the lines, >> either. >> >> > Thanks for the suggestion. I tried to mimic your approach in the attached > v4-0002 patch. > Please let me know your thoughts on the same. > > Also read this article: >> >> https://stackoverflow.com/questions/2430303/disadvantages-of-scanf >> >> Note that the very first point in the article talks about the problem >> of overrunning the buffer, which you certainly have in the current >> code right here: >> >> + if (fscanf(file, "%s\t%s\t%d\t%23[^\t] %s\n", filetype, filename, >> >> filetype is declared as char[10], but %s could read arbitrarily much data. >> > > now with this revised logic, we don't use this anymore. > > >> >> + filename = (char*) pg_malloc(MAXPGPATH); >> >> pg_malloc returns void *, so no cast is required. >> >> > fixed. > > >> + if (strcmp(checksum_with_type, "-") == 0) >> + { >> + checksum_type = MC_NONE; >> + } >> + else >> + { >> + if (strncmp(checksum_with_type, "SHA256", 6) == 0) >> >> Use parse_checksum_algorithm. Right now you've invented a "common" >> function with 1 caller, but I explicitly suggested previously that you >> put it in common so that you could reuse it. >> > > while parsing the record, we get <checktype>:<checksum> as a string for > checksum. > parse_checksum_algorithm uses pg_strcasecmp() so we need to pass exact > string to that function. > with current logic, we can't add '\0' in between the line unless we parse > it completely. > So we may need to allocate another small buffer and copy only checksum > type in that and pass that to > parse_checksum_algorithm. I don't think of any other solution apart from > this. I might be missing something > here, please correct me if I am wrong. > > >> + if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == >> 0 || >> + strcmp(de->d_name, "pg_wal") == 0) >> + continue; >> >> Ignoring pg_wal at the top level might be OK, but this will ignore a >> pg_wal entry anywhere in the directory tree. >> >> + /* Skip backup manifest file. */ >> + if (strcmp(de->d_name, "backup_manifest") == 0) >> + return; >> >> Same problem. >> > > You are right. Added extra check for this. > > >> >> + filename = createPQExpBuffer(); >> + if (!filename) >> + { >> + pg_log_error("out of memory"); >> + exit(1); >> + } >> + >> + appendPQExpBuffer(filename, "%s%s", relative_path, de->d_name); >> >> Just use char filename[MAXPGPATH] and snprintf here, as you do >> elsewhere. It will be simpler and save memory. >> > Fixed. > > TAP test case patch needs some modification, Will do that and submit. > > -- > -- > > Thanks & Regards, > Suraj kharage, > EnterpriseDB Corporation, > The Postgres Database Company. > -- -- Thanks & Regards, Suraj kharage, EnterpriseDB Corporation, The Postgres Database Company.
v5-0001-Backup-manifest-with-file-names-sizes-timestamps-.patch
Description: Binary data
v5-0002-Implementation-of-backup-validator.patch
Description: Binary data