Re: [patch] libitm: Add multi-lock, write-through TM method.

2012-02-14 Thread Torvald Riegel
On Mon, 2012-02-13 at 14:50 -0800, Richard Henderson wrote:
 On 02/13/2012 01:47 PM, Torvald Riegel wrote:
  +  // Location-to-orec mapping.  Stripes of 16B mapped to 2^19 orecs.
  +  static const gtm_word L2O_ORECS = 1  19;
  +  static const gtm_word L2O_SHIFT = 4;
 
 Is it just easier to say 16B or did we really want CACHELINE_SIZE?

The 16B stripes where quite good in past experiments with TinySTM, IIRC.
The problem with the simple hash function that we use currently is that
if you make #shifts too small, you acquire and check more locks (so
higher cache footprint too), and the lock array covers less space before
it wraps around.  However, if you make #shifts too large, you increase
false sharing.  Cacheline granularity could be too large already, unless
the user has put all independent variables on separate cache lines (but
this is often too much bloat, e.g., think about the layout of the nodes
of a tree...). The best choice here is more or less workload-dependent,
and still an open question.  In the past, I also experimented with less
simple hash functions, with mixed results.
Optimizing / tuning this hash function is on my performance todo list.

 Otherwise ok.

Committed (with the formatting fixed, and the 16B stripes kept as
before).



[patch] libitm: Add multi-lock, write-through TM method.

2012-02-13 Thread Torvald Riegel
This patch adds a new TM method, ml_wt, which uses an array of locks
with version numbers and runs a write-through algorithm with time-based
validations and snapshot time extensions.

patch1 adds xcalloc as a helper function for allocations (used in the
new TM method).

patch2 improves TM method reinitialization (helps ml_wt avoid
reallocation of the lock array) and adds a hook to TM methods so that
they can report back whether they can deal with the current runtime
situation (e.g., a the current number of threads).

patch3 is the actual TM method.

Tested on ppc64 with up to 64 threads with both microbenchmarks and
STAMP.  OK for trunk?
commit c0d1d1778b18f3dfc4a136e5a807c2fecbeb64e4
Author: Torvald Riegel trie...@redhat.com
Date:   Thu Feb 9 13:44:38 2012 +0100

libitm: Add xcalloc.

libitm/
* util.cc (GTM::xcalloc): New.
* common.h (GTM::xcalloc): Declare.

diff --git a/libitm/common.h b/libitm/common.h
index 14d0efb..b1ef2d4 100644
--- a/libitm/common.h
+++ b/libitm/common.h
@@ -54,6 +54,8 @@ namespace GTM HIDDEN {
 // cache lines that are not shared with any object used by another thread.
 extern void * xmalloc (size_t s, bool separate_cl = false)
   __attribute__((malloc, nothrow));
+extern void * xcalloc (size_t s, bool separate_cl = false)
+  __attribute__((malloc, nothrow));
 extern void * xrealloc (void *p, size_t s, bool separate_cl = false)
   __attribute__((malloc, nothrow));
 
diff --git a/libitm/util.cc b/libitm/util.cc
index afd37e4..48a1bf8 100644
--- a/libitm/util.cc
+++ b/libitm/util.cc
@@ -71,6 +71,18 @@ xmalloc (size_t size, bool separate_cl)
 }
 
 void *
+xcalloc (size_t size, bool separate_cl)
+{
+  // TODO Use posix_memalign if separate_cl is true, or some other allocation
+  // method that will avoid sharing cache lines with data used by other
+  // threads.
+  void *r = calloc (1, size);
+  if (r == 0)
+GTM_fatal (Out of memory allocating %lu bytes, (unsigned long) size);
+  return r;
+}
+
+void *
 xrealloc (void *old, size_t size, bool separate_cl)
 {
   // TODO Use posix_memalign if separate_cl is true, or some other allocation
commit 3b486db323b51ea87e1f64cd3abb9402f7c7307a
Author: Torvald Riegel trie...@redhat.com
Date:   Thu Feb 9 13:50:10 2012 +0100

libitm: Improve method reinit and choice.

libitm/
* dispatch.h (GTM::abi_dispatch::supports): New.
(GTM::method_group::reinit): New.
* retry.cc (GTM::gtm_thread::decide_retry_strategy): Use reinit().
(GTM::gtm_thread::number_of_threads_changed): Check that the method
supports the current situation.

diff --git a/libitm/dispatch.h b/libitm/dispatch.h
index dbf05e4..d059c49 100644
--- a/libitm/dispatch.h
+++ b/libitm/dispatch.h
@@ -245,6 +245,12 @@ struct method_group
   // Stop using any method from this group for now. This can be used to
   // destruct meta data as soon as this method group is not used anymore.
   virtual void fini() = 0;
+  // This can be overriden to implement more light-weight re-initialization.
+  virtual void reinit()
+  {
+fini();
+init();
+  }
 };
 
 
@@ -290,6 +296,10 @@ public:
   // method on begin of a nested transaction without committing or restarting
   // the parent method.
   virtual abi_dispatch* closed_nesting_alternative() { return 0; }
+  // Returns true iff this method group supports the current situation.
+  // NUMBER_OF_THREADS is the current number of threads that might execute
+  // transactions.
+  virtual bool supports(unsigned number_of_threads) { return true; }
 
   bool read_only () const { return m_read_only; }
   bool write_through() const { return m_write_through; }
diff --git a/libitm/retry.cc b/libitm/retry.cc
index decd773..6e05f5f 100644
--- a/libitm/retry.cc
+++ b/libitm/retry.cc
@@ -58,11 +58,8 @@ GTM::gtm_thread::decide_retry_strategy (gtm_restart_reason r)
  serial_lock.read_unlock(this);
  serial_lock.write_lock();
  if (disp-get_method_group() == default_dispatch-get_method_group())
-   {
- // Still the same method group.
- disp-get_method_group()-fini();
- disp-get_method_group()-init();
-   }
+   // Still the same method group.
+   disp-get_method_group()-reinit();
  serial_lock.write_unlock();
  serial_lock.read_lock(this);
  if (disp-get_method_group() != default_dispatch-get_method_group())
@@ -72,11 +69,8 @@ GTM::gtm_thread::decide_retry_strategy (gtm_restart_reason r)
}
}
   else
-   {
- // We are a serial transaction already, which makes things simple.
- disp-get_method_group()-fini();
- disp-get_method_group()-init();
-   }
+   // We are a serial transaction already, which makes things simple.
+   disp-get_method_group()-reinit();
 }
 
   bool retry_irr = (r == RESTART_SERIAL_IRR);
@@ -249,7 +243,7 @@ GTM::gtm_thread::number_of_threads_changed(unsigned 
previous, unsigned now)

Re: [patch] libitm: Add multi-lock, write-through TM method.

2012-02-13 Thread Richard Henderson
On 02/13/2012 01:47 PM, Torvald Riegel wrote:
 +  else {

Watch the formatting.

 +  // Location-to-orec mapping.  Stripes of 16B mapped to 2^19 orecs.
 +  static const gtm_word L2O_ORECS = 1  19;
 +  static const gtm_word L2O_SHIFT = 4;

Is it just easier to say 16B or did we really want CACHELINE_SIZE?

Otherwise ok.


r~