Answering your first question: running TPC-C for 1 minute against a database with 64 warehouses (6~7 GB), with a buffer pool of 128 MB (around 1.8% of the database size) and a hit ratio of ~91%, I get a throughput of 45~50 transactions per second.
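Just to make the numbers concrete, here is a quick sanity check of those ratios (a sketch only; taking 6.5 GB as the midpoint of the 6~7 GB range is my assumption):

```python
buffer_pool_mb = 128
db_size_mb = 6.5 * 1024  # midpoint of the 6~7 GB range (assumption)

ratio = buffer_pool_mb / db_size_mb
print(f"buffer pool is {ratio:.1%} of the database")

# at 45-50 tps, a 1-minute run executes roughly:
print(45 * 60, "to", 50 * 60, "transactions")
```

So a 1-minute run covers on the order of only a few thousand transactions, which matters for how much of the database it can touch.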
I did some experiments and I got the following information about my TPC-C database and benchmark. The database is created with 64 warehouses.

Table       | Index                                | Data Size | Index Size | Total
------------+--------------------------------------+-----------+------------+--------
stock       | stock_pkey                           | 2209 MB   | 263 MB     | 2472 MB
order_line  | order_line_pkey                      | 2041 MB   | 678 MB     | 2719 MB
customer    | idx_customer_name                    | 1216 MB   | 146 MB     | 1420 MB
customer    | customer_pkey                        | 1216 MB   | 58 MB      | 1420 MB
history     |                                      | 164 MB    |            | 164 MB
oorder      | oorder_pkey                          | 134 MB    | 68 MB      | 362 MB
oorder      | idx_order                            | 134 MB    | 80 MB      | 362 MB
oorder      | oorder_o_w_id_o_d_id_o_c_id_o_id_key | 134 MB    | 80 MB      | 362 MB
new_order   | new_order_pkey                       | 27 MB     | 17 MB      | 45 MB
item        | item_pkey                            | 10168 kB  | 2208 kB    | 12 MB
district    | district_pkey                        | 776 kB    | 72 kB      | 880 kB
warehouse   | warehouse_pkey                       | 384 kB    | 16 kB      | 432 kB

By executing the TPC-C benchmark for 1 minute I get about 2.9 million buffer requests. The distribution of these requests over the relations and indexes is (in descending order):

customer                              1383399
stock_pkey                             442600
stock                                  321370
order_line                             255314
order_line_pkey                        156132
oorder                                  58665
oorder_pkey                             57895
customer_pkey                           44471
new_order_pkey                          39552
idx_customer_name                       28286
new_order                               25861
item_pkey                               11702
item                                    11606
district                                11389
district_pkey                            7575
warehouse                                5276
idx_order                                4072
oorder_o_w_id_o_d_id_o_c_id_o_id_key     2410
warehouse_pkey                           1998
history                                  1958

All this information seems normal to me. However, of the 2.9 million buffer requests over ~800k pages, only ~150k distinct pages are being requested. This behavior could be explained by the benchmark accessing only a small subset of the 64 warehouses instead of spreading its accesses normally over all 64. In other words, I think the execution time of the benchmark is irrelevant, assuming that the transactions follow a normal distribution regarding accesses to warehouses.
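The distinct-page count above can be derived from the trace data with a short script. This is only a sketch; it assumes the trace can be reduced to one `relation page_number` pair per line, which is my assumption and not necessarily the actual trace format:

```python
from collections import defaultdict

def distinct_pages(trace_lines):
    """Count total requests and distinct pages requested, per relation."""
    totals = defaultdict(int)       # relation -> number of buffer requests
    distinct = defaultdict(set)     # relation -> set of distinct page ids
    for line in trace_lines:
        rel, page = line.split()
        totals[rel] += 1
        distinct[rel].add(page)
    return totals, distinct

# toy trace: 5 requests touching only 3 distinct pages of 'customer'
trace = ["customer 1", "customer 1", "customer 2", "customer 3", "customer 2"]
totals, distinct = distinct_pages(trace)
print(totals["customer"], len(distinct["customer"]))  # 5 3
```

Comparing `len(distinct[rel])` against each relation's size in pages would also show which relations are unused or only lightly used.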
On Wed, Oct 15, 2014 at 7:41 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:

> On Wed, Oct 15, 2014 at 6:22 AM, Lucas Lersch <lucasler...@gmail.com> wrote:
>
>> So is it a possible normal behavior that running tpcc for 10min only
>> accesses 50% of the database? Furthermore, is there a guideline of
>> parameters for tpcc (# of warehouses, execution time, operations weight)?
>
> I'm not familiar with your benchmarking tool. With the one I am most
> familiar with, pgbench, if you run it against a database which is too big
> to fit in memory, it can take a very long time to touch each page once,
> because the constant random disk reads make it run very slowly. Maybe
> that is something to consider here--how many transactions were actually
> executed during your 10 min run?
>
> Also, the tool might build tables that are only used under certain run
> options. Perhaps you just aren't choosing the options which invoke usage
> of those tables. Since you have the trace data, it should be pretty easy
> to count how many distinct blocks are accessed from each relation, and
> compare that to the size of the relations to see which relations are unused
> or lightly used.
>
> Cheers,
>
> Jeff

-- 
Lucas Lersch