On 20 June 2012 17:41, Tom Lane <[email protected]> wrote:
> Peter Geoghegan <[email protected]> writes:
>> No, I'm suggesting it would probably be at least a bit of a win here
>> to cache the constant, and only have to do a strxfrm() + strcmp() per
>> comparison.
>
> Um, have you got any hard evidence to support that notion? The
> traditional advice is that strcoll is faster than using strxfrm unless
> the same strings are to be compared repeatedly. I'm not convinced that
> saving the strxfrm on just one side will swing the balance.
I've written a very small C++ program, which I've attached, that
basically proves that this can still make a fairly large difference -
I hope it's okay that that it's C++, but that allowed me to write the
program quickly, with no dependencies for anyone who cares to try this
out, other than a C++ compiler and the standard library.
It just inserts 500,000 elements (the same succession of psuedo-random
strings again and again) into a std::set, which is implemented using a
red-black tree on my machine. In one case, we just use strcmp() (there
is actually no tie-breaker). In the other case, the comparator
allocates and fills memory for a strxfrm blob when needed, caches it,
and uses strxfrm() to generate blobs for other elements into a
dedicated buffer which persists across comparisons.
It prints the duration of each run, and a running average for each
case until terminated. Here is what the output looks like on my
machine:
[peter@peterlaptop strxfrm_test]$ g++ -O2 set_test.cpp
[peter@peterlaptop strxfrm_test]$ ./a.out
Time elapsed with strcoll: 1.485290 seconds
Time elapsed with strxfrm optimization: 1.070211 seconds
Time elapsed with strcoll: 1.813725 seconds
Time elapsed with strxfrm optimization: 1.097950 seconds
Time elapsed with strcoll: 1.813381 seconds
Time elapsed with strxfrm optimization: 1.102769 seconds
Time elapsed with strcoll: 1.826708 seconds
Time elapsed with strxfrm optimization: 1.105093 seconds
Time elapsed with strcoll: 1.842156 seconds
Time elapsed with strxfrm optimization: 1.103198 seconds
*****strcoll average: 1.756252 strxfrm average: 1.095844*****
Time elapsed with strcoll: 1.834785 seconds
Time elapsed with strxfrm optimization: 1.105369 seconds
Time elapsed with strcoll: 1.817449 seconds
Time elapsed with strxfrm optimization: 1.110386 seconds
Time elapsed with strcoll: 1.812446 seconds
Time elapsed with strxfrm optimization: 1.101266 seconds
Time elapsed with strcoll: 1.813595 seconds
Time elapsed with strxfrm optimization: 1.099444 seconds
Time elapsed with strcoll: 1.814584 seconds
Time elapsed with strxfrm optimization: 1.099542 seconds
*****strcoll average: 1.787412 strxfrm average: 1.099523*****
Time elapsed with strcoll: 1.820218 seconds
Time elapsed with strxfrm optimization: 1.102984 seconds
Time elapsed with strcoll: 1.817549 seconds
Time elapsed with strxfrm optimization: 1.100526 seconds
Time elapsed with strcoll: 1.817020 seconds
Time elapsed with strxfrm optimization: 1.099273 seconds
Time elapsed with strcoll: 1.820983 seconds
Time elapsed with strxfrm optimization: 1.100501 seconds
Time elapsed with strcoll: 1.813270 seconds
Time elapsed with strxfrm optimization: 1.098904 seconds
*****strcoll average: 1.797544 strxfrm average: 1.099828*****
Time elapsed with strcoll: 1.815952 seconds
Time elapsed with strxfrm optimization: 1.102583 seconds
If you'd like to print the contents of the set, there are a couple of
lines you can uncomment. It should be straightforward to understand
the program, even if you don't know any C++.
Note that I still think that the compelling case for strxfrm() caching
is sorting tuples, not btree index traversal. All that I ask is that
you allow that on the whole, this will pay for the cost of verifying
equivalency within _bt_checkkeys() once per index-scan.
I am aware that a red-black tree isn't generally an excellent proxy
for how a btree will perform, but I don't think that that really
matters here.
--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
/*
* set_test.cpp:
*
* Measure the performance characteristics of using strcoll() as
* a string comparator rather than caching strxfrm() blobs.
*
* Author: Peter Geoghegan
*
* A std::set is an ascending container of unique elements. The implementation
* that I used for any numbers published was the one from Gnu libstdc++, on
* Fedora 16. That particular implementation uses a red-black tree, but the
* standard's requirement that common operations (search, insert, delete) take
* logarithmic time effectively requires the use of some kind of self-balancing
* binary search tree, which isn't a bad proxy for a btree for my purposes.
*/
#include <set>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <string>
#include <sys/time.h>
#define ORIG_STRING_SIZE 32
#define BLOB_STRING_SIZE 128
double
timeval_subtract(struct timeval *x, struct timeval *y)
{
struct timeval result;
/* Compute the time remaining to wait. tv_usec is certainly positive. */
result.tv_sec = x->tv_sec - y->tv_sec;
result.tv_usec = x->tv_usec - y->tv_usec;
/* return difference in seconds */
return result.tv_sec + ((double) result.tv_usec / 1000000);
}
void
reset_prng_seed()
{
srand(0);
}
/*
* Enter a psuedo-random sequence of characters into a buffer specified by buf
*
* This just places lower-case ascii characters into the buffer, so strings
* generated by this function should be relatively cheap to do a
* collation-aware sort with when LC_COLLATE is set to en_*.UTF-8. I've made a
* point of using en_US.UTF-8 in benchmarks.
*/
void
place_random_string(char *buf)
{
static std::string charset = "abcdefghijklmnopqrstuvwxyz";
for (int i = 0; i < ORIG_STRING_SIZE; i++)
buf[i] = charset[rand() % charset.length()];
}
class string_wrapper
{
public:
string_wrapper(bool use_strcoll);
string_wrapper(const string_wrapper &rhs);
virtual ~string_wrapper();
// operator< is our comparator, required by std::set
bool operator<(const string_wrapper &rhs) const;
// If you want to see the strings contents printed to stdout, call this
// function:
void print_contents() const;
private:
// Don't provide an asignment operator
const string_wrapper& operator=(const string_wrapper& rhs);
// mark these "mutable" so they can be modified in our comparator as part
// of the required lazy initialization (std::set in turn requires that it
// is a const member function). Not very idiomatic, but the easiest way to
// approximate how this might work in Postgres.
mutable char *orig_string;
mutable char *strxfrm_buf;
mutable char *strxfrm_rhs_buf;
bool use_strcoll;
};
typedef std::set<string_wrapper> string_set;
/*
* Default constructor
*/
string_wrapper::string_wrapper(bool use_strcoll):
use_strcoll(use_strcoll),
orig_string(new char[ORIG_STRING_SIZE]),
strxfrm_buf(NULL),
strxfrm_rhs_buf(NULL)
{
// Generate a random string in a buffer. This class conceptually manages
// that resource.
place_random_string(orig_string);
}
/*
* Copy constructor: Make a deep copy of the class
*/
string_wrapper::string_wrapper(const string_wrapper &rhs):
use_strcoll(rhs.use_strcoll),
orig_string(new char[ORIG_STRING_SIZE]),
strxfrm_buf(NULL),
strxfrm_rhs_buf(NULL)
{
// Only copy the original string
strcpy(orig_string, rhs.orig_string);
}
/*
* Boilerplate destructor
*/
string_wrapper::~string_wrapper()
{
delete [] orig_string;
if (strxfrm_buf)
{
delete [] strxfrm_buf;
delete [] strxfrm_rhs_buf;
}
}
/*
* Comparator required by std::set
*/
bool string_wrapper::operator<(const string_wrapper &rhs) const
{
if (use_strcoll)
{
/*
* I might have simulated the extra strcmp() tie-break overhead here,
* but that interface isn't supported.
*/
return strcoll(orig_string, rhs.orig_string) < 0;
}
else
{
// Do we already have a buffer allocated? We lazily allocated this on
// the first time through.
if (!strxfrm_buf)
{
strxfrm_buf = new char [BLOB_STRING_SIZE];
strxfrm_rhs_buf = new char [BLOB_STRING_SIZE];
// Set up our cached blob
strxfrm(strxfrm_buf, orig_string, BLOB_STRING_SIZE);
}
// Create temporary blob for rhs element as part of tree traversal
strxfrm(strxfrm_rhs_buf, rhs.orig_string, BLOB_STRING_SIZE);
return strcmp(strxfrm_buf, strxfrm_rhs_buf) < 0;
}
}
void
string_wrapper::print_contents() const
{
printf("String contents: %s\n", orig_string);
}
void
print_set_contents(const string_set &rhs)
{
// The map is sorted and iterable
for (string_set::const_iterator i = rhs.begin();
i != rhs.end(); ++i)
{
i->print_contents();
}
}
#define NUM_ELEMS 500000
int
main(int argc, const char *argv[])
{
double non_opt_tot = 0, opt_tot = 0;
int runs = 0;
for(;;)
{
struct timeval begin, end;
double secs_result;
// Just so there's no ambiguity about what memory is allocated when,
// allocate set dynamically:
string_set *non_opt = new string_set;
/* simple strcoll run */
gettimeofday(&begin, NULL);
for (int i = 0; i < NUM_ELEMS; ++i)
{
string_wrapper wrap(false);
non_opt->insert(wrap);
}
gettimeofday(&end, NULL);
secs_result = timeval_subtract(&end, &begin);
printf("Time elapsed with strcoll: %f seconds\n", secs_result);
non_opt_tot += secs_result;
// You may wish to uncomment this to see the sorted elements printed
//
//print_set_contents(*non_opt);
// This delete statement calls the set's destructor, which is turn calls
// each string's destructor, freeing all resources used.
delete non_opt;
// Actual string values should be completely deterministic for both sets of behaviours
reset_prng_seed();
string_set *opt = new string_set;
/* strxfrm run */
gettimeofday(&begin, NULL);
for (int i = 0; i < NUM_ELEMS; ++i)
{
string_wrapper wrap(true);
opt->insert(wrap);
}
gettimeofday(&end, NULL);
secs_result = timeval_subtract(&end, &begin);
printf("Time elapsed with strxfrm optimization: %f seconds\n", secs_result);
opt_tot += secs_result;
// You may wish to uncomment this to see the sorted elements printed
//
//print_set_contents(*opt);
delete opt;
++runs;
if (runs % 5 ==0)
printf("*****strcoll average: %f strxfrm average: %f*****\n", non_opt_tot / runs, opt_tot / runs);
}
return 0;
}
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers