Re: [Mono-dev] Patch: Ternary ops in mini and general ATOMIC_CAS

2009-03-20 Thread Mark Probst
Hey Rodrigo,

Thanks for the suggestions!

 This change increases the JIT working set significantly. Have you thought
 about doing something like
 struct MonoCallInst?

Yes, I have, and I've started implementing it, but it requires quite a
bit of reworking of some of mini' passes, because we happen to change
the opcode of an inst now and then, which means that the number of
sregs can change, too.  And sometimes we write to the new sregs
before we even change the opcode.  After I figured that out I decided
to scrap the idea for the time being.

Mark
___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list


Re: [Mono-dev] Patch: Ternary ops in mini and general ATOMIC_CAS

2009-03-18 Thread Rodrigo Kumpera
I quite like this patch, it removes more code than it adds back. The results
should look clear
and if we end wishing optimizing a spot we can trade the loop for an inlined
function or macro. [1]

But there are a few things that I dislike about this patch.


--- a/mono/mini/cfold.c
+++ b/mono/mini/cfold.c
@@ -58,13 +58,18 @@ mono_is_power_of_two (guint32 val)
 res = (cast)arg1-inst_c0 op (cast)arg2-inst_c0;\
 break; \

+#define MONO_INST_NULLIFY_SREGS(dest) do {\
+(dest)-sreg1 = (dest)-sreg2 = (dest)-sreg3 = -1;\
+} while (0)
+
 #undef MONO_INST_NEW
 #define MONO_INST_NEW(cfg,dest,op) do {\
 (dest) = mono_mempool_alloc ((cfg)-mempool, sizeof (MonoInst));
\
 (dest)-inst_p0 = (dest)-inst_p1 = (dest)-next = NULL; \
 (dest)-opcode = (op);\
 (dest)-flags = 0; \
-(dest)-dreg = (dest)-sreg1 = (dest)-sreg2 = -1;  \
+(dest)-dreg = -1;\
+MONO_INST_NULLIFY_SREGS ((dest));\
 } while (0)

This is pure code duplication, kill it instead of patching.

@@ -138,6 +142,15 @@ ins_info[] = {
 #include mini-ops.h
 };
 #undef MINI_OP
+#undef MINI_OP3
+
+#define MINI_OP(a,b,dest,src1,src2) -1,
+#define MINI_OP3(a,b,dest,src1,src2,src3) -1,
+gint8 ins_sreg_counts[] = {
+#include mini-ops.h
+};
+#undef MINI_OP
+#undef MINI_OP3


This information can be calculated at compile time. All we get from running
mini_init_op_sreg_counts
is an increased private RSS with no advantage. This information should be
packaged in the ins_info array.


@@ -437,7 +450,7 @@ struct MonoInst {
 guint8  flags  : 5;

 /* used by the register allocator */
-gint32 dreg, sreg1, sreg2;
+gint32 dreg, sreg1, sreg2, sreg3;


This change increases the JIT working set significantly. Have you thought
about doing something like
struct MonoCallInst?

It would allow us to have 3 sregs and only pay the size price when needed.
It would be trivial to have
a function to detect such instructions by expanding mini-ops.h to avoid
hitting the ins_info array.

Besides that, I'm eager to have this patch in so I can exploit it on
Mono.Simd :)


[1] Doing something like like:
for (i = 0; i  sreg_count (ins); ++i)
  z

to

static inline foo() { z } //OR

#define FOO() do { z; } while (0)

if (has_sreg1)
  foo ();
if (has_sreg2)
  foo ();
if (unlikely (has_sreg3)
 foo ();



2009/3/13 Mark Probst mark.pro...@gmail.com

 On Mon, Feb 23, 2009 at 10:10 PM, Zoltan Varga var...@gmail.com wrote:
  All this to add support for exactly one rarely used instruction, CAS.

 As per Paolo's request I timed a --compile-all of
 System.Windows.Forms.dll (which I chose instead of mscorlib.dll
 because it's larger) with and without the patch.  It turns out with
 the patch we're about 10% slower.  With the attached revised patch,
 which uses macros for all the getter functions, the difference is
 barely 3% (2.3166s vs 2.2522s, on x86).

 Mark

 ___
 Mono-devel-list mailing list
 Mono-devel-list@lists.ximian.com
 http://lists.ximian.com/mailman/listinfo/mono-devel-list


___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list


Re: [Mono-dev] Patch: Ternary ops in mini and general ATOMIC_CAS

2009-02-23 Thread Zoltan Varga
Hi,

  My problems with this patch, and with adding support for ternary ops
in general:
- it increases the size of MonoInst by 4/8 bytes.
- it slows down every phase of the JIT by changing linear code into loops,
ie. instead of
process sreg1
process sreg2
it is now a loop with an unknown upper bound.

All this to add support for exactly one rarely used instruction, CAS.

  Zoltan

On Mon, Feb 23, 2009 at 8:26 PM, Mark Probst mark.pro...@gmail.com wrote:
 Hey,

 These patches implement ternary ops in mini and a general ATOMIC_CAS
 (for x86, AMD64 and PPC(64)).

 Please review, especially my register allocator changes.  The global
 register allocator is not adapted, yet.

 Mark


___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list