Re: [HACKERS] [GSOC] [Weekly report 2] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

2017-06-15 Thread Heikki Linnakangas

On 06/15/2017 08:51 AM, Mengxing Liu wrote:

My design is as follow:

For hash table, key is the pointer of SerializableXact; Value is the 
RWConflictData object.
Hashcode is generated based on the SerializableXact pointer.
So, given a SerializableXact, we can quickly find if it is conflict with 
another SerializableXact.

Every SerializableXact has two tables: inConflictsTab and outConflictsTab.
They are allocated and initialized when creating PredXactList (in the function 
InitPredicateLocks).
When a SerializableXact object is released, it will release all its RWConflict 
objects, the hash tables are empty again.
Thus They can be reused directly after releasing.

NOTE: I stored RWConflictData in hash tables, instead of RWConflict object 
allocated by RWConflictPool.
 After I remove other linked lists, the RWConflictPool can be omitted.


Sounds good!


My code is on the :
https://github.com/liumx10/postgresql/commit/3fd9a7488de5ae19ce2ce19eae5f303153a079ff


Once you've ironed out the obvious bugs, make sure to also post it as a 
patch to this mailing list. For the sake of the archives, and to make it 
clear that you're submitting this for inclusion in PostgreSQL, under the 
PostgreSQL license.


Couple of little things caught my eye at a quick glance:


 * Test whether a hash table is empty.
+ * I didn't find any function in dynamic hash supports the requirement.


You can do: hash_get_num_entries(hashp) == 0


sprintf(name, "PredXact inConflictsTab %d", i);


Better to use snprintf() instead. sprintf() is safe here, but many 
static analysis tools will complain on any sight of plain sprintf(), so 
better to just never use it.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [GSOC] [Weekly report 2] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

2017-06-14 Thread Mengxing Liu
 Hi, all. In the last week, I replaced linked list with hash table in 
SerializableXact. 
 I only replace inConflicts and outConflicts. The other linked lists, such as 
possibleUnsafeConflicts, I will modify them after other things work well.

There are still some bugs: the abort rate is much higher than before (from 
11.3% to 72%). I'll figure out what's wrong in the next few days.

My design is as follow:

For hash table, key is the pointer of SerializableXact; Value is the 
RWConflictData object. 
Hashcode is generated based on the SerializableXact pointer. 
So, given a SerializableXact, we can quickly find if it is conflict with 
another SerializableXact.

Every SerializableXact has two tables: inConflictsTab and outConflictsTab. 
They are allocated and initialized when creating PredXactList (in the function 
InitPredicateLocks). 
When a SerializableXact object is released, it will release all its RWConflict 
objects, the hash tables are empty again. 
Thus They can be reused directly after releasing. 

NOTE: I stored RWConflictData in hash tables, instead of RWConflict object 
allocated by RWConflictPool. 
 After I remove other linked lists, the RWConflictPool can be omitted.

My code is on the :
https://github.com/liumx10/postgresql/commit/3fd9a7488de5ae19ce2ce19eae5f303153a079ff

There are 3 main modifications in summary:
1) Initializing hash table. Related functions: InitPredicateLocks

2) Setting, releasing and checking RWConflicts. Related functions: 
RWConflictExists, SetRWConflict, ReleaseRWConflict

3) There are some functions scanning all RWConflict in a SerializableXact.
Related functions: ReleasePredicateLocks, ReleaseOneSerializableXact, 
CheckForSerializableConflictOut, 
 CheckForSerializableConflictIn, OnConflict_CheckForSerializationFailure, 
PreCommit_CheckForSerializationFailure




Sincerely
--
Mengxing Liu