Package: release.debian.org
Severity: normal
User: release.debian....@packages.debian.org
Usertags: unblock

Hello release team,

Please unblock package mlucas.

This upload fixes RC bug #860662 <https://bugs.debian.org/860662> by
splitting the big self-test into smaller per-exponent runs. It also backports
an upstream fix for undefined behaviour.
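
The core of the change to scripts/self_test.test is to replace the single
'mlucas -s m' invocation with a loop that runs a short self-test on each
medium exponent separately. A minimal sketch of the approach (the full
exponent list and the MLUCAS_PATH setup are in the patch below):

    # Assumes MLUCAS_PATH is set and exported as in the script's preamble
    # (it may be empty, in which case mlucas is taken from PATH).
    exponent_ls='20000047 22442237 24878401'  # abbreviated; full list in patch

    # Run 100 iterations per exponent, so no single run exhausts system
    # resources the way one big 'mlucas -s m' run can.
    for exponent in $exponent_ls
    do
        "$MLUCAS_PATH"mlucas -m "$exponent" -iters 100
    done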

The diff is attached below:

diff -Nru mlucas-14.1/debian/changelog mlucas-14.1/debian/changelog
--- mlucas-14.1/debian/changelog	2015-08-27 22:42:36.000000000 +0800
+++ mlucas-14.1/debian/changelog	2017-04-24 16:16:28.000000000 +0800
@@ -1,3 +1,11 @@
+mlucas (14.1-2) unstable; urgency=medium
+
+  * RC bug fix release (Closes: #860662), split big test into smaller ones
+    to avoid exhausting system resources.
+  * Backport fix for undefined behavior from upstream.
+
+ -- Alex Vong <alexvong1...@gmail.com>  Mon, 24 Apr 2017 16:16:28 +0800
+
 mlucas (14.1-1) unstable; urgency=low
 
   * Initial release (Closes: #786656)
diff -Nru mlucas-14.1/debian/patches/0001-fixes-undefined-behaviour.patch mlucas-14.1/debian/patches/0001-fixes-undefined-behaviour.patch
--- mlucas-14.1/debian/patches/0001-fixes-undefined-behaviour.patch	1970-01-01 08:00:00.000000000 +0800
+++ mlucas-14.1/debian/patches/0001-fixes-undefined-behaviour.patch	2017-04-24 16:16:28.000000000 +0800
@@ -0,0 +1,657 @@
+From f4c2fb2f7f771bf696d277140d267f6f03577f49 Mon Sep 17 00:00:00 2001
+From: Alex Vong <alexvong1...@gmail.com>
+Date: Wed, 27 Jul 2016 19:52:35 +0800
+Subject: [PATCH] Fixes undefined behaviour.
+
+Description: This fixes undefined behaviour (array index out of bounds) in
+ the fermat test code reported by gcc's
+ `-Waggressive-loop-optimizations'.
+Forwarded: yes
+Author: Ernst W. Mayer <ewma...@aol.com>
+
+* src/radix1008_main_carry_loop.h: Fix undefined behaviour.
+* src/radix1024_main_carry_loop.h: Likewise.
+* src/radix128_main_carry_loop.h: Likewise.
+* src/radix224_main_carry_loop.h: Likewise.
+* src/radix240_main_carry_loop.h: Likewise.
+* src/radix256_main_carry_loop.h: Likewise.
+* src/radix32_main_carry_loop.h: Likewise.
+* src/radix4032_main_carry_loop.h: Likewise.
+* src/radix56_main_carry_loop.h: Likewise.
+* src/radix60_main_carry_loop.h: Likewise.
+* src/radix64_main_carry_loop.h: Likewise.
+* src/radix960_main_carry_loop.h: Likewise.
+---
+ src/radix1008_main_carry_loop.h | 21 ++++++++++-----------
+ src/radix1024_main_carry_loop.h |  6 +++---
+ src/radix128_main_carry_loop.h  | 10 +++++-----
+ src/radix224_main_carry_loop.h  | 17 ++++++++---------
+ src/radix240_main_carry_loop.h  | 19 ++++++++++---------
+ src/radix256_main_carry_loop.h  | 10 +++++-----
+ src/radix32_main_carry_loop.h   |  6 +++---
+ src/radix4032_main_carry_loop.h | 21 ++++++++++-----------
+ src/radix56_main_carry_loop.h   | 10 +++++-----
+ src/radix60_main_carry_loop.h   | 12 ++++++------
+ src/radix64_main_carry_loop.h   |  6 +++---
+ src/radix960_main_carry_loop.h  | 22 ++++++++++++++--------
+ 12 files changed, 82 insertions(+), 78 deletions(-)
+
+diff --git a/src/radix1008_main_carry_loop.h b/src/radix1008_main_carry_loop.h
+index 25cdc2c..525d29c 100644
+--- a/src/radix1008_main_carry_loop.h
++++ b/src/radix1008_main_carry_loop.h
+@@ -389,14 +389,14 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 		// icycle[ic],icycle[ic+1],icycle[ic+2],icycle[ic+3], jcycle[ic],kcycle[ic],lcycle[ic] of the non-looped version with
+ 		// icycle[ic],icycle[jc],icycle[kc],icycle[lc], jcycle[ic],kcycle[ic],lcycle[ic] :
+ 		ic = 0; jc = 1; kc = 2; lc = 3;
+-		while(tm0 < isrt2,two)	// Can't use l for loop index here since need it for byte offset in carry macro call
++		while(tm0 < two)	// Can't use l for loop index here since need it for byte offset in carry macro call
+ 		{																/* vvvvvvvvvvvvvvv [1,2,3]*ODD_RADIX; assumed << l2_sz_vd on input: */
+ 			//See "Sep 2014" note in 32-bit SSE2 version of this code below
+ 			k1 = icycle[ic];	k5 = jcycle[ic];	k6 = kcycle[ic];	k7 = lcycle[ic];
+ 			k2 = icycle[jc];
+ 			k3 = icycle[kc];
+ 			k4 = icycle[lc];
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+ 																		/* vvvvvvvvvvvvvvv [1,2,3]*ODD_RADIX; assumed << l2_sz_vd on input: */
+ 			SSE2_fermat_carry_norm_errcheck_X4_hiacc(tm0,tmp,l,tm1,0x1f80, 0x7e0,0xfc0,0x17a0, half_arr,sign_mask,k1,k2,k3,k4,k5,k6,k7, add0,p1,p2,p3);
+ 			tm0 += 8; tm1++; tmp += 8; l -= 0xc0;
+@@ -417,7 +417,7 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 			k2 = icycle[jc];
+ 			k3 = icycle[kc];
+ 			k4 = icycle[lc];
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+ 																		/* vvvvvvvvvvvvvvv [1,2,3]*ODD_RADIX; assumed << l2_sz_vd on input: */
+ 			SSE2_fermat_carry_norm_errcheck_X4_loacc(tm0,tmp,tm1,0x1f80, 0x7e0,0xfc0,0x17a0, half_arr,sign_mask,k1,k2,k3,k4,k5,k6,k7, add0,p1,p2,p3);
+ 			tm0 += 8; tm1++;
+@@ -447,15 +447,15 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 		ic = 0; jc = 1;
+ 		tm1 = s1p00; tmp = cy_r;	// <*** Again rely on contiguity of cy_r,i here ***
+ 		l = ODD_RADIX;	// Need to stick this #def into an intvar to work around [error: invalid lvalue in asm input for constraint 'm']
+-		while(tm1 < isrt2) {
++		while((int)(tmp-cy_r) < RADIX) {
+ 			//See "Sep 2014" note in 32-bit SSE2 version of this code below
+ 			k1 = icycle[ic];
+ 			k2 = jcycle[ic];
+ 			k3 = icycle[jc];
+ 			k4 = jcycle[jc];
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+-			tm2 += (-((int)(tm1-cy_r)&0x1)) & p2;	// Base-addr incr by extra p2 on odd-index passes
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tmp-cy_r)>>2]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 += (-((int)((tmp-cy_r)>>1)&0x1)) & p2;	// Base-addr incr by extra p2 on odd-index passes
+ 			SSE2_fermat_carry_norm_errcheck_X2(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,l,half_arr,sign_mask,add1,add2,k1,k2,k3,k4, tm2,p1);
+ 			tm1 += 4; tmp += 2;
+ 			MOD_ADD32(ic, 2, ODD_RADIX, ic);
+@@ -468,16 +468,15 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 		ic = 0;	// ic = idx into [i|j]cycle mini-arrays, gets incremented (mod ODD_RADIX) between macro calls
+ 		tm1 = s1p00; tmp = cy_r;	// <*** Again rely on contiguity of cy_r,i here ***
+ 		l = ODD_RADIX << 4;	// 32-bit version needs preshifted << 4 input value
+-		while(tm1 < isrt2) {
++		while((int)(tmp-cy_r) < RADIX) {
+ 			//Sep 2014: Even with reduced-register version of the 32-bit Fermat-mod carry macro,
+ 			// GCC runs out of registers on this one, without some playing-around-with-alternate code-sequences ...
+ 			// Pulling the array-refs out of the carry-macro call like so solves the problem:
+ 			k1 = icycle[ic];
+ 			k2 = jcycle[ic];
+-			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+-			tm2 += (-(l&0x10)) & p2;
+-			tm2 += (-(l&0x01)) & p1;	// Added offset cycles among p0,1,2,3
++			// Each SSE2 carry macro call also processes 1 prefetch of main-array data
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tmp-cy_r)>>2]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 += p1*((int)(tmp-cy_r)&0x3);	// Added offset cycles among p0,1,2,3
+ 			SSE2_fermat_carry_norm_errcheck(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,l,half_arr,sign_mask,add1,add2,k1,k2, tm2);
+ 			tm1 += 2; tmp++;
+ 			MOD_ADD32(ic, 1, ODD_RADIX, ic);
+diff --git a/src/radix1024_main_carry_loop.h b/src/radix1024_main_carry_loop.h
+index 6b2e8ae..d43b47c 100644
+--- a/src/radix1024_main_carry_loop.h
++++ b/src/radix1024_main_carry_loop.h
+@@ -384,7 +384,7 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 	  #if (OS_BITS == 32)
+ 		for(l = 0; l < RADIX; l++) {	// RADIX loop passes
+ 			// Each SSE2 carry macro call also processes 1 prefetch of main-array data
+-			add0 = a + j1 + pfetch_dist + poff[l];	// poff[] = p0,4,8,...
++			add0 = a + j1 + pfetch_dist + poff[l>>2];	// poff[] = p0,4,8,...
+ 			add0 += (-(l&0x10)) & p2;
+ 			add0 += (-(l&0x01)) & p1;
+ 			SSE2_fermat_carry_norm_pow2_errcheck   (tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,half_arr,sign_mask,add1,add2, add0);
+@@ -393,7 +393,7 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 	  #else	// 64-bit SSE2
+ 		for(l = 0; l < RADIX>>1; l++) {	// RADIX/2 loop passes
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			add0 = a + j1 + pfetch_dist + poff[l];	// poff[] = p0,4,8,...
++			add0 = a + j1 + pfetch_dist + poff[l>>1];	// poff[] = p0,4,8,...
+ 			add0 += (-(l&0x1)) & p2;	// Base-addr incr by extra p2 on odd-index passes
+ 			SSE2_fermat_carry_norm_pow2_errcheck_X2(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,half_arr,sign_mask,add1,add2, add0,p2);
+ 			tm1 += 4; tmp += 2;
+@@ -427,7 +427,7 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 			SSE2_RADIX_64_DIF( FALSE, thr_id,
+ 				4,	// set = trailz(N) - trailz(64)
+ 				// Input pointer; no offsets array in pow2-radix case:
+-				s1p00 + (jt<<1), 0x0,
++				(double *)(s1p00 + (jt<<1)), 0x0,
+ 				// Intermediates-storage pointer:
+ 				vd00,
+ 				// Outputs: Base address plus index offsets:
+diff --git a/src/radix128_main_carry_loop.h b/src/radix128_main_carry_loop.h
+index ff92238..24cb836 100644
+--- a/src/radix128_main_carry_loop.h
++++ b/src/radix128_main_carry_loop.h
+@@ -571,7 +571,7 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 	  #if (OS_BITS == 32)
+ 		for(l = 0; l < RADIX; l++) {	// RADIX loop passes
+ 			// Each SSE2 carry macro call also processes 1 prefetch of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[l];	// poff[] = p0,4,8,...
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[l>>2]);	// poff[] = p0,4,8,...
+ 			tm2 += (-(l&0x10)) & p02;
+ 			tm2 += (-(l&0x01)) & p01;
+ 			SSE2_fermat_carry_norm_pow2_errcheck   (tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,half_arr,sign_mask,add1,add2, tm2);
+@@ -580,7 +580,7 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 	  #else	// 64-bit SSE2
+ 		for(l = 0; l < RADIX>>1; l++) {	// RADIX/2 loop passes
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[l];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[l>>1]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+ 			tm2 += (-(l&0x1)) & p02;	// Base-addr incr by extra p2 on odd-index passes
+ 			SSE2_fermat_carry_norm_pow2_errcheck_X2(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,half_arr,sign_mask,add1,add2, tm2,p01);
+ 			tm1 += 4; tmp += 2;
+@@ -592,7 +592,7 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 		// Can't use l as loop index here, since it gets used in the Fermat-mod carry macro (as are k1,k2);
+ 		ntmp = 0; addr = cy_r; addi = cy_i;
+ 		for(m = 0; m < RADIX>>2; m++) {
+-			jt = j1 + poff[m]; jp = j2 + poff[m];	// poff[] = p04,08,...,60
++			jt = j1 + poff[m]; jp = j2 + poff[m];	// poff[] = p04,08,...
+ 			fermat_carry_norm_pow2_errcheck(a[jt    ],a[jp    ],*addr,*addi,ntmp,NRTM1,NRT_BITS);	ntmp += NDIVR; ++addr; ++addi;
+ 			fermat_carry_norm_pow2_errcheck(a[jt+p01],a[jp+p01],*addr,*addi,ntmp,NRTM1,NRT_BITS);	ntmp += NDIVR; ++addr; ++addi;
+ 			fermat_carry_norm_pow2_errcheck(a[jt+p02],a[jp+p02],*addr,*addi,ntmp,NRTM1,NRT_BITS);	ntmp += NDIVR; ++addr; ++addi;
+@@ -634,8 +634,8 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 		k1 = reverse(l,8)<<1;
+ 		tm2 = s1p00 + k1;
+ 	#if (OS_BITS == 32)
+-								 add1 = (vec_dbl*)tm1+ 2; add2 = (vec_dbl*)tm1+ 4; add3 = (vec_dbl*)tm1+ 6; add4 = (vec_dbl*)tm1+ 8; add5 = (vec_dbl*)tm1+10; add6 = (vec_dbl*)tm1+12; add7 = (vec_dbl*)tm1+14;
+-		add8 = (vec_dbl*)tm1+16; add9 = (vec_dbl*)tm1+18; adda = (vec_dbl*)tm1+20; addb = (vec_dbl*)tm1+22; addc = (vec_dbl*)tm1+24; addd = (vec_dbl*)tm1+26; adde = (vec_dbl*)tm1+28; addf = (vec_dbl*)tm1+30;
++								  add1 = (double*)(tm1+ 2); add2 = (double*)(tm1+ 4); add3 = (double*)(tm1+ 6); add4 = (double*)(tm1+ 8); add5 = (double*)(tm1+10); add6 = (double*)(tm1+12); add7 = (double*)(tm1+14);
++		add8 = (double*)(tm1+16); add9 = (double*)(tm1+18); adda = (double*)(tm1+20); addb = (double*)(tm1+22); addc = (double*)(tm1+24); addd = (double*)(tm1+26); adde = (double*)(tm1+28); addf = (double*)(tm1+30);
+ 		SSE2_RADIX16_DIF_0TWIDDLE  (tm2,OFF1,OFF2,OFF3,OFF4, tmp,two, tm1,add1,add2,add3,add4,add5,add6,add7,add8,add9,adda,addb,addc,addd,adde,addf);
+ 	#else
+ 		SSE2_RADIX16_DIF_0TWIDDLE_B(tm2,OFF1,OFF2,OFF3,OFF4, tmp,two, tm1);
+diff --git a/src/radix224_main_carry_loop.h b/src/radix224_main_carry_loop.h
+index 1ad55e5..ead8a83 100644
+--- a/src/radix224_main_carry_loop.h
++++ b/src/radix224_main_carry_loop.h
+@@ -398,7 +398,7 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 			k3 = icycle[kc];
+ 			k4 = icycle[lc];
+ 			// Each AVX carry macro call also processes 4 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+ 																		/* vvvvvvvvvvvvvvv [1,2,3]*ODD_RADIX; assumed << l2_sz_vd on input: */
+ 			SSE2_fermat_carry_norm_errcheck_X4_hiacc(tm0,tmp,l,tm1,0x700, 0xe0,0x1c0,0x2a0, half_arr,sign_mask,k1,k2,k3,k4,k5,k6,k7, tm2,p1,p2,p3);
+ 			tm0 += 8; tm1++; tmp += 8; l -= 0xc0;
+@@ -420,7 +420,7 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 			k3 = icycle[kc];
+ 			k4 = icycle[lc];
+ 			// Each AVX carry macro call also processes 4 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+ 																		/* vvvvvvvvvvvvvvv [1,2,3]*ODD_RADIX; assumed << l2_sz_vd on input: */
+ 			SSE2_fermat_carry_norm_errcheck_X4_loacc(tm0,tmp,tm1,0x700, 0xe0,0x1c0,0x2a0, half_arr,sign_mask,k1,k2,k3,k4,k5,k6,k7, tm2,p1,p2,p3);
+ 			tm0 += 8; tm1++;
+@@ -448,15 +448,15 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 		ic = 0; jc = 1;
+ 		tm1 = s1p00; tmp = cy_r;	// <*** Again rely on contiguity of cy_r,i here ***
+ 		l = ODD_RADIX;	// Need to stick this #def into an intvar to work around [error: invalid lvalue in asm input for constraint 'm']
+-		while(tm1 < isrt2) {
++		while((int)(tmp-cy_r) < RADIX) {
+ 			//See "Sep 2014" note in 32-bit SSE2 version of this code below
+ 			k1 = icycle[ic];
+ 			k2 = jcycle[ic];
+ 			int k3 = icycle[jc];
+ 			int k4 = jcycle[jc];
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+-			tm2 += (-((int)(tm1-cy_r)&0x1)) & p2;	// Base-addr incr by extra p2 on odd-index passes
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tmp-cy_r)>>2]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 += (-((int)((tmp-cy_r)>>1)&0x1)) & p2;	// Base-addr incr by extra p2 on odd-index passes
+ 			SSE2_fermat_carry_norm_errcheck_X2(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,l,half_arr,sign_mask,add1,add2,k1,k2,k3,k4, tm2,p1);
+ 			tm1 += 4; tmp += 2;
+ 			MOD_ADD32(ic, 2, ODD_RADIX, ic);
+@@ -470,16 +470,15 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 		tm1 = s1p00; tmp = cy_r;	// <*** Again rely on contiguity of cy_r,i here ***
+ 		// Need to stick this #def into an intvar to work around [error: invalid lvalue in asm input for constraint 'm']
+ 		l = ODD_RADIX << 4;	// 32-bit version needs preshifted << 4 input value
+-		while(tm1 < isrt2) {
++		while((int)(tmp-cy_r) < RADIX) {
+ 			//Sep 2014: Even with reduced-register version of the 32-bit Fermat-mod carry macro,
+ 			// GCC runs out of registers on this one, without some playing-around-with-alternate code-sequences ...
+ 			// Pulling the array-refs out of the carry-macro call like so solves the problem:
+ 			k1 = icycle[ic];
+ 			k2 = jcycle[ic];
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+-			tm2 += (-(l&0x10)) & p2;
+-			tm2 += (-(l&0x01)) & p1;	// Added offset cycles among p0,1,2,3
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tmp-cy_r)>>2]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 += p1*((int)(tmp-cy_r)&0x3);	// Added offset cycles among p0,1,2,3
+ 			SSE2_fermat_carry_norm_errcheck(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,l,half_arr,sign_mask,add1,add2,k1,k2, tm2);
+ 			tm1 += 2; tmp++;
+ 			MOD_ADD32(ic, 1, ODD_RADIX, ic);
+diff --git a/src/radix240_main_carry_loop.h b/src/radix240_main_carry_loop.h
+index 2278d29..6f8e0f4 100644
+--- a/src/radix240_main_carry_loop.h
++++ b/src/radix240_main_carry_loop.h
+@@ -608,14 +608,15 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 		// icycle[ic],icycle[jc],icycle[kc],icycle[lc], jcycle[ic],kcycle[ic],lcycle[ic] :
+ 		ic = 0; jc = 1; kc = 2; lc = 3;
+ 		while(tm0 < s1pef)	// Can't use l for loop index here since need it for byte offset in carry macro call
+-		{
++		{	// NB: (int)(tmp-cy_r) < RADIX (as used for SSE2 build) no good here, since just 1 vec_dbl increment
++			// per 4 Re+Im-carries; but (int)(tmp-cy_r) < (RADIX>>1) would work
+ 			//See "Sep 2014" note in 32-bit SSE2 version of this code below
+ 			k1 = icycle[ic];	k5 = jcycle[ic];	k6 = kcycle[ic];	k7 = lcycle[ic];
+ 			k2 = icycle[jc];
+ 			k3 = icycle[kc];
+ 			k4 = icycle[lc];
+ 			// Each AVX carry macro call also processes 4 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+ 																		/* vvvvvvvvvvvvvvv [1,2,3]*ODD_RADIX; assumed << l2_sz_vd on input: */
+ 			SSE2_fermat_carry_norm_errcheck_X4_hiacc(tm0,tmp,l,tm1,0x780, 0x1e0,0x3c0,0x5a0, half_arr,sign_mask,k1,k2,k3,k4,k5,k6,k7, tm2,p1,p2,p3);
+ 			tm0 += 8; tm1++; tmp += 8; l -= 0xc0;
+@@ -691,7 +692,7 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 				k3 = icycle[kc];
+ 				k4 = icycle[lc];
+ 				// Each AVX carry macro call also processes 4 prefetches of main-array data
+-				tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++				tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+ 																			/* vvvvvvvvvvvvvvv [1,2,3]*ODD_RADIX; assumed << l2_sz_vd on input: */
+ 				SSE2_fermat_carry_norm_errcheck_X4_loacc(tm0,tmp,tm1,0x780, 0x1e0,0x3c0,0x5a0, half_arr,sign_mask,k1,k2,k3,k4,k5,k6,k7, tm2,p1,p2,p3);
+ 				tm0 += 8; tm1++;
+@@ -722,15 +723,15 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 		ic = 0; jc = 1;
+ 		tm1 = s1p00; tmp = cy_r;	// <*** Again rely on contiguity of cy_r,i here ***
+ 		l = ODD_RADIX;	// Need to stick this #def into an intvar to work around [error: invalid lvalue in asm input for constraint 'm']
+-		while(tm1 < s1pef) {
++		while((int)(tmp-cy_r) < RADIX) {
+ 			//See "Sep 2014" note in 32-bit SSE2 version of this code below
+ 			k1 = icycle[ic];
+ 			k2 = jcycle[ic];
+ 			int k3 = icycle[jc];
+ 			int k4 = jcycle[jc];
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+-			tm2 += (-((int)(tm1-cy_r)&0x1)) & p2;	// Base-addr incr by extra p2 on odd-index passes
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tmp-cy_r)>>2]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 += (-((int)((tmp-cy_r)>>1)&0x1)) & p2;	// Base-addr incr by extra p2 on odd-index passes
+ 			SSE2_fermat_carry_norm_errcheck_X2(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,l,half_arr,sign_mask,add1,add2,k1,k2,k3,k4, tm2,p1);
+ 			tm1 += 4; tmp += 2;
+ 			MOD_ADD32(ic, 2, ODD_RADIX, ic);
+@@ -744,15 +745,15 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 		tm1 = s1p00; tmp = cy_r;	// <*** Again rely on contiguity of cy_r,i here ***
+ 		// Need to stick this #def into an intvar to work around [error: invalid lvalue in asm input for constraint 'm']
+ 		l = ODD_RADIX << 4;	// 32-bit version needs preshifted << 4 input value
+-		while(tm1 <= s1pef) {
++		while((int)(tmp-cy_r) < RADIX) {
+ 			//Sep 2014: Even with reduced-register version of the 32-bit Fermat-mod carry macro,
+ 			// GCC runs out of registers on this one, without some playing-around-with-alternate code-sequences ...
+ 			// Pulling the array-refs out of the carry-macro call like so solves the problem:
+ 			k1 = icycle[ic];
+ 			k2 = jcycle[ic];
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+-			tm2 += plo[(int)(tm1-cy_r)&0x3];	// Added offset cycles among p0,1,2,3
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tmp-cy_r)>>2]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 += p1*((int)(tmp-cy_r)&0x3);	// Added offset cycles among p0,1,2,3
+ 			SSE2_fermat_carry_norm_errcheck(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,l,half_arr,sign_mask,add1,add2,k1,k2, tm2);
+ 			tm1 += 2; tmp++;
+ 			MOD_ADD32(ic, 1, ODD_RADIX, ic);
+diff --git a/src/radix256_main_carry_loop.h b/src/radix256_main_carry_loop.h
+index d439f24..aff7f38 100644
+--- a/src/radix256_main_carry_loop.h
++++ b/src/radix256_main_carry_loop.h
+@@ -558,7 +558,7 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 	  #if (OS_BITS == 32)
+ 		for(l = 0; l < RADIX; l++) {	// RADIX loop passes
+ 			// Each SSE2 carry macro call also processes 1 prefetch of main-array data
+-			add0 = a + j1 + pfetch_dist + poff[l];	// poff[] = p0,4,8,...
++			add0 = a + j1 + pfetch_dist + poff[l>>2];	// poff[] = p0,4,8,...
+ 			add0 += (-(l&0x10)) & p02;
+ 			add0 += (-(l&0x01)) & p01;
+ 			SSE2_fermat_carry_norm_pow2_errcheck   (tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,half_arr,sign_mask,add1,add2, add0);
+@@ -567,7 +567,7 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 	  #else	// 64-bit SSE2
+ 		for(l = 0; l < RADIX>>1; l++) {	// RADIX/2 loop passes
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			add0 = a + j1 + pfetch_dist + poff[l];	// poff[] = p0,4,8,...
++			add0 = a + j1 + pfetch_dist + poff[l>>1];	// poff[] = p0,4,8,...
+ 			add0 += (-(l&0x1)) & p02;	// Base-addr incr by extra p2 on odd-index passes
+ 			SSE2_fermat_carry_norm_pow2_errcheck_X2(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,half_arr,sign_mask,add1,add2, add0,p01);
+ 			tm1 += 4; tmp += 2;
+@@ -579,7 +579,7 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 		// Can't use l as loop index here, since it gets used in the Fermat-mod carry macro (as are k1,k2):
+ 		ntmp = 0; addr = cy_r; addi = cy_i;
+ 		for(m = 0; m < RADIX>>2; m++) {
+-			jt = j1 + poff[m]; jp = j2 + poff[m];	// poff[] = p04,08,...,60
++			jt = j1 + poff[m]; jp = j2 + poff[m];	// poff[] = p04,08,...
+ 			fermat_carry_norm_pow2_errcheck(a[jt    ],a[jp    ],*addr,*addi,ntmp,NRTM1,NRT_BITS);	ntmp += NDIVR; ++addr; ++addi;
+ 			fermat_carry_norm_pow2_errcheck(a[jt+p01],a[jp+p01],*addr,*addi,ntmp,NRTM1,NRT_BITS);	ntmp += NDIVR; ++addr; ++addi;
+ 			fermat_carry_norm_pow2_errcheck(a[jt+p02],a[jp+p02],*addr,*addi,ntmp,NRTM1,NRT_BITS);	ntmp += NDIVR; ++addr; ++addi;
+@@ -629,8 +629,8 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 		k1 = reverse(l,16)<<1;
+ 		tm2 = s1p00 + k1;
+ 	#if (OS_BITS == 32)
+-								 add1 = (vec_dbl*)tmp+ 2; add2 = (vec_dbl*)tmp+ 4; add3 = (vec_dbl*)tmp+ 6; add4 = (vec_dbl*)tmp+ 8; add5 = (vec_dbl*)tmp+10; add6 = (vec_dbl*)tmp+12; add7 = (vec_dbl*)tmp+14;
+-		add8 = (vec_dbl*)tmp+16; add9 = (vec_dbl*)tmp+18; adda = (vec_dbl*)tmp+20; addb = (vec_dbl*)tmp+22; addc = (vec_dbl*)tmp+24; addd = (vec_dbl*)tmp+26; adde = (vec_dbl*)tmp+28; addf = (vec_dbl*)tmp+30;
++								  add1 = (double*)(tmp+ 2); add2 = (double*)(tmp+ 4); add3 = (double*)(tmp+ 6); add4 = (double*)(tmp+ 8); add5 = (double*)(tmp+10); add6 = (double*)(tmp+12); add7 = (double*)(tmp+14);
++		add8 = (double*)(tmp+16); add9 = (double*)(tmp+18); adda = (double*)(tmp+20); addb = (double*)(tmp+22); addc = (double*)(tmp+24); addd = (double*)(tmp+26); adde = (double*)(tmp+28); addf = (double*)(tmp+30);
+ 		SSE2_RADIX16_DIF_0TWIDDLE  (tm2,OFF1,OFF2,OFF3,OFF4, isrt2,two, tmp,add1,add2,add3,add4,add5,add6,add7,add8,add9,adda,addb,addc,addd,adde,addf);
+ 	#else
+ 		SSE2_RADIX16_DIF_0TWIDDLE_B(tm2,OFF1,OFF2,OFF3,OFF4, isrt2,two, tmp);
+diff --git a/src/radix32_main_carry_loop.h b/src/radix32_main_carry_loop.h
+index 5337009..3f0d0a0 100644
+--- a/src/radix32_main_carry_loop.h
++++ b/src/radix32_main_carry_loop.h
+@@ -291,7 +291,7 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 	  #if (OS_BITS == 32)
+ 		for(l = 0; l < RADIX; l++) {	// RADIX loop passes
+ 			// Each SSE2 carry macro call also processes 1 prefetch of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[l];	// poff[] = p0,4,8,...
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[l>>2]);	// poff[] = p0,4,8,...
+ 			tm2 += (-(l&0x10)) & p02;
+ 			tm2 += (-(l&0x01)) & p01;
+ 			SSE2_fermat_carry_norm_pow2_errcheck   (tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,half_arr,sign_mask,add1,add2, tm2);
+@@ -300,7 +300,7 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 	  #else	// 64-bit SSE2
+ 		for(l = 0; l < RADIX>>1; l++) {	// RADIX/2 loop passes
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[l];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[l>>1]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+ 			tm2 += (-(l&0x1)) & p02;	// Base-addr incr by extra p2 on odd-index passes
+ 			SSE2_fermat_carry_norm_pow2_errcheck_X2(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,half_arr,sign_mask,add1,add2, tm2,p01);
+ 			tm1 += 4; tmp += 2;
+@@ -312,7 +312,7 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 		// Can't use l as loop index here, since it gets used in the Fermat-mod carry macro (as are k1,k2);
+ 		ntmp = 0; addr = cy_r; addi = cy_i;
+ 		for(m = 0; m < RADIX>>2; m++) {
+-			jt = j1 + poff[m]; jp = j2 + poff[m];	// poff[] = p04,08,...,60
++			jt = j1 + poff[m]; jp = j2 + poff[m];	// poff[] = p04,08,...
+ 			fermat_carry_norm_pow2_errcheck(a[jt    ],a[jp    ],*addr,*addi,ntmp,NRTM1,NRT_BITS);	ntmp += NDIVR; ++addr; ++addi;
+ 			fermat_carry_norm_pow2_errcheck(a[jt+p01],a[jp+p01],*addr,*addi,ntmp,NRTM1,NRT_BITS);	ntmp += NDIVR; ++addr; ++addi;
+ 			fermat_carry_norm_pow2_errcheck(a[jt+p02],a[jp+p02],*addr,*addi,ntmp,NRTM1,NRT_BITS);	ntmp += NDIVR; ++addr; ++addi;
+diff --git a/src/radix4032_main_carry_loop.h b/src/radix4032_main_carry_loop.h
+index 3e68bb2..ac02d50 100644
+--- a/src/radix4032_main_carry_loop.h
++++ b/src/radix4032_main_carry_loop.h
+@@ -371,7 +371,7 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 			k2 = icycle[jc];
+ 			k3 = icycle[kc];
+ 			k4 = icycle[lc];
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+ 																		/* vvvvvvvvvvvvvvv [1,2,3]*ODD_RADIX; assumed << l2_sz_vd on input: */
+ 			SSE2_fermat_carry_norm_errcheck_X4_hiacc(tm0,tmp,l,tm1,0x7e00, 0x1f80,0x3f00,0x5e80, half_arr,sign_mask,k1,k2,k3,k4,k5,k6,k7, tm2,p1,p2,p3);
+ 			tm0 += 8; tm1++; tmp += 8; l -= 0xc0;
+@@ -386,14 +386,13 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 		tm0 = s1p00; tmp = base_negacyclic_root;	// tmp *not* incremented between macro calls in loacc version
+ 		tm1 = cy_r; // tm2 = cy_i;	*** replace with literal-byte-offset in macro call to save a reg
+ 		ic = 0; jc = 1; kc = 2; lc = 3;
+-		for(l = 0; l < RADIX>>2; l++)	// RADIX/4 loop passes
+-		{
++		for(l = 0; l < RADIX>>2; l++) {	// RADIX/4 loop passes
+ 			//See "Sep 2014" note in 32-bit SSE2 version of this code below
+ 			k1 = icycle[ic];	k5 = jcycle[ic];	k6 = kcycle[ic];	k7 = lcycle[ic];
+ 			k2 = icycle[jc];
+ 			k3 = icycle[kc];
+ 			k4 = icycle[lc];
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+ 																		/* vvvvvvvvvvvvvvv [1,2,3]*ODD_RADIX; assumed << l2_sz_vd on input: */
+ 			SSE2_fermat_carry_norm_errcheck_X4_loacc(tm0,tmp,tm1,0x7e00, 0x1f80,0x3f00,0x5e80, half_arr,sign_mask,k1,k2,k3,k4,k5,k6,k7, tm2,p1,p2,p3);
+ 			tm0 += 8; tm1++;
+@@ -423,15 +422,15 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 		ic = 0; jc = 1;
+ 		tm1 = s1p00; tmp = cy_r;	// <*** Again rely on contiguity of cy_r,i here ***
+ 		l = ODD_RADIX;	// Need to stick this #def into an intvar to work around [error: invalid lvalue in asm input for constraint 'm']
+-		while(tm1 < cy_r) {
++		while((int)(tmp-cy_r) < RADIX) {
+ 			//See "Sep 2014" note in 32-bit SSE2 version of this code below
+ 			k1 = icycle[ic];
+ 			k2 = jcycle[ic];
+ 			int k3 = icycle[jc];
+ 			int k4 = jcycle[jc];
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+-			tm2 += (-((int)(tm1-cy_r)&0x1)) & p2;	// Base-addr incr by extra p2 on odd-index passes
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tmp-cy_r)>>2]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 += (-((int)((tmp-cy_r)>>1)&0x1)) & p2;	// Base-addr incr by extra p2 on odd-index passes
+ 			SSE2_fermat_carry_norm_errcheck_X2(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,l,half_arr,sign_mask,add1,add2,k1,k2,k3,k4, tm2,p1);
+ 			tm1 += 4; tmp += 2;
+ 			MOD_ADD32(ic, 2, ODD_RADIX, ic);
+@@ -444,15 +443,15 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 		ic = 0;	// ic = idx into [i|j]cycle mini-arrays, gets incremented (mod ODD_RADIX) between macro calls
+ 		tm1 = s1p00; tmp = cy_r;	// <*** Again rely on contiguity of cy_r,i here ***
+ 		l = ODD_RADIX << 4;	// 32-bit version needs preshifted << 4 input value
+-		while(tm1 < cy_r) {
++		while((int)(tmp-cy_r) < RADIX) {
+ 			//Sep 2014: Even with reduced-register version of the 32-bit Fermat-mod carry macro,
+ 			// GCC runs out of registers on this one, without some playing-around-with-alternate code-sequences ...
+ 			// Pulling the array-refs out of the carry-macro call like so solves the problem:
+ 			k1 = icycle[ic];
+ 			k2 = jcycle[ic];
+ 			// Each SSE2 carry macro call also processes 1 prefetch of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+-			tm2 += plo[(int)(tm1-cy_r)&0x3];	// Added offset cycles among p0,1,2,3
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tmp-cy_r)>>2]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 += p1*((int)(tmp-cy_r)&0x3);	// Added offset cycles among p0,1,2,3
+ 			SSE2_fermat_carry_norm_errcheck(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,l,half_arr,sign_mask,add1,add2,k1,k2, tm2);
+ 			tm1 += 2; tmp++;
+ 			MOD_ADD32(ic, 1, ODD_RADIX, ic);
+@@ -531,7 +530,7 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 			// the leading pow2-shift arg = trailz(N) - trailz(64) = 0:
+ 			SSE2_RADIX_64_DIF( FALSE, thr_id,
+ 				0,
+-				tmp,t_offsets,
++				(double *)tmp,t_offsets,
+ 				s1p00,	// tmp-storage
+ 				a+jt,io_offsets
+ 			); tmp += 2;
+diff --git a/src/radix56_main_carry_loop.h b/src/radix56_main_carry_loop.h
+index 7e6ba9f..6e395fa 100644
+--- a/src/radix56_main_carry_loop.h
++++ b/src/radix56_main_carry_loop.h
+@@ -434,7 +434,7 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 			k3 = icycle[kc];
+ 			k4 = icycle[lc];
+ 			// Each AVX carry macro call also processes 4 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+ 																		/* vvvvvvvvvvvvvvv [1,2,3]*ODD_RADIX; assumed << l2_sz_vd on input: */
+ 			SSE2_fermat_carry_norm_errcheck_X4_loacc(tm0,tmp,tm1,0x1c0, 0xe0,0x1c0,0x2a0, half_arr,sign_mask,k1,k2,k3,k4,k5,k6,k7, tm2,p01,p02,p03);
+ 			tm0 += 8; tm1++;
+@@ -469,8 +469,8 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 			int k3 = icycle[jc];
+ 			int k4 = jcycle[jc];
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+-			tm2 += (-((int)(tm1-cy_r)&0x1)) & p02;	// Base-addr incr by extra p2 on odd-index passes
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tmp-cy_r)>>2]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 += (-((int)((tmp-cy_r)>>1)&0x1)) & p02;	// Base-addr incr by extra p2 on odd-index passes
+ 			SSE2_fermat_carry_norm_errcheck_X2(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,l,half_arr,sign_mask,add1,add2,k1,k2,k3,k4, tm2,p01);
+ 			tm1 += 4; tmp += 2;
+ 			MOD_ADD32(ic, 2, ODD_RADIX, ic);
+@@ -491,8 +491,8 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 			k1 = icycle[ic];
+ 			k2 = jcycle[ic];
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+-			tm2 += p01*((int)(tm1-cy_r)&0x3);	// Added offset cycles among p0,1,2,3
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tmp-cy_r)>>2]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 += p01*((int)(tmp-cy_r)&0x3);	// Added offset cycles among p0,1,2,3
+ 			SSE2_fermat_carry_norm_errcheck(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,l,half_arr,sign_mask,add1,add2,k1,k2, tm2);
+ 			tm1 += 2; tmp++;
+ 			MOD_ADD32(ic, 1, ODD_RADIX, ic);
+diff --git a/src/radix60_main_carry_loop.h b/src/radix60_main_carry_loop.h
+index 187ec3f..d4ad69b 100644
+--- a/src/radix60_main_carry_loop.h
++++ b/src/radix60_main_carry_loop.h
+@@ -424,7 +424,7 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 			k3 = icycle[kc];
+ 			k4 = icycle[lc];
+ 			// Each AVX carry macro call also processes 4 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+ 																		/* vvvvvvvvvvvvvvv [1,2,3]*ODD_RADIX; assumed << l2_sz_vd on input: */
+ 			SSE2_fermat_carry_norm_errcheck_X4_hiacc(tm0,tmp,l,tm1,0x1e0, 0x1e0,0x3c0,0x5a0, half_arr,sign_mask,k1,k2,k3,k4,k5,k6,k7, tm2,p01,p02,p03);
+ 			tm0 += 8; tm1++; tmp += 8; l -= 0xc0;
+@@ -446,7 +446,7 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 			k3 = icycle[kc];
+ 			k4 = icycle[lc];
+ 			// Each AVX carry macro call also processes 4 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+ 																		/* vvvvvvvvvvvvvvv [1,2,3]*ODD_RADIX; assumed << l2_sz_vd on input: */
+ 			SSE2_fermat_carry_norm_errcheck_X4_loacc(tm0,tmp,tm1,0x1e0, 0x1e0,0x3c0,0x5a0, half_arr,sign_mask,k1,k2,k3,k4,k5,k6,k7, tm2,p01,p02,p03);
+ 			tm0 += 8; tm1++;
+@@ -483,8 +483,8 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 			int k3 = icycle[jc];
+ 			int k4 = jcycle[jc];
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+-			tm2 += (-((int)(tm1-cy_r)&0x1)) & p02;	// Base-addr incr by extra p2 on odd-index passes
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tmp-cy_r)>>2]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 += (-((int)((tmp-cy_r)>>1)&0x1)) & p02;	// Base-addr incr by extra p2 on odd-index passes
+ 			SSE2_fermat_carry_norm_errcheck_X2(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,l,half_arr,sign_mask,add1,add2,k1,k2,k3,k4, tm2,p01);
+ 			tm1 += 4; tmp += 2;
+ 			MOD_ADD32(ic, 2, ODD_RADIX, ic);
+@@ -505,8 +505,8 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 			k1 = icycle[ic];
+ 			k2 = jcycle[ic];
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+-			tm2 += p01*((int)(tm1-cy_r)&0x3);	// Added offset cycles among p0,1,2,3
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tmp-cy_r)>>2]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 += p01*((int)(tmp-cy_r)&0x3);	// Added offset cycles among p0,1,2,3
+ 			SSE2_fermat_carry_norm_errcheck(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,l,half_arr,sign_mask,add1,add2,k1,k2, tm2);
+ 			tm1 += 2; tmp++;
+ 			MOD_ADD32(ic, 1, ODD_RADIX, ic);
+diff --git a/src/radix64_main_carry_loop.h b/src/radix64_main_carry_loop.h
+index ce3e4af..57bea3d 100644
+--- a/src/radix64_main_carry_loop.h
++++ b/src/radix64_main_carry_loop.h
+@@ -464,7 +464,7 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 	  #if (OS_BITS == 32)
+ 		for(l = 0; l < RADIX; l++) {	// RADIX loop passes
+ 			// Each SSE2 carry macro call also processes 1 prefetch of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[l];	// poff[] = p0,4,8,...
++			tm2 = a + j1 + pfetch_dist + poff[l>>2];	// poff[] = p0,4,8,...
+ 			tm2 += (-(l&0x10)) & p02;
+ 			tm2 += (-(l&0x01)) & p01;
+ 			SSE2_fermat_carry_norm_pow2_errcheck   (tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,half_arr,sign_mask,add1,add2, tm2);
+@@ -473,7 +473,7 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 	  #else	// 64-bit SSE2
+ 		for(l = 0; l < RADIX>>1; l++) {	// RADIX/2 loop passes
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[l];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 = a + j1 + pfetch_dist + poff[l>>1];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+ 			tm2 += (-(l&0x1)) & p02;	// Base-addr incr by extra p2 on odd-index passes
+ 			SSE2_fermat_carry_norm_pow2_errcheck_X2(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,half_arr,sign_mask,add1,add2, tm2,p01);
+ 			tm1 += 4; tmp += 2;
+@@ -485,7 +485,7 @@ normally be getting dispatched to [radix] separate blocks of the A-array, we nee
+ 		// Can't use l as loop index here, since it gets used in the Fermat-mod carry macro (as are k1,k2);
+ 		ntmp = 0; addr = cy_r; addi = cy_i;
+ 		for(m = 0; m < RADIX>>2; m++) {
+-			jt = j1 + poff[m]; jp = j2 + poff[m];	// poff[] = p04,08,...,60
++			jt = j1 + poff[m]; jp = j2 + poff[m];	// poff[] = p04,08,...
+ 			fermat_carry_norm_pow2_errcheck(a[jt    ],a[jp    ],*addr,*addi,ntmp,NRTM1,NRT_BITS);	ntmp += NDIVR; ++addr; ++addi;
+ 			fermat_carry_norm_pow2_errcheck(a[jt+p01],a[jp+p01],*addr,*addi,ntmp,NRTM1,NRT_BITS);	ntmp += NDIVR; ++addr; ++addi;
+ 			fermat_carry_norm_pow2_errcheck(a[jt+p02],a[jp+p02],*addr,*addi,ntmp,NRTM1,NRT_BITS);	ntmp += NDIVR; ++addr; ++addi;
+diff --git a/src/radix960_main_carry_loop.h b/src/radix960_main_carry_loop.h
+index cb4cc15..f900a77 100644
+--- a/src/radix960_main_carry_loop.h
++++ b/src/radix960_main_carry_loop.h
+@@ -589,16 +589,22 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 		// Oct 2014: Try getting most of the LOACC speedup with better accuracy by breaking the complex-roots-of-(-1)
+ 		// chaining into 2 or more equal-sized subchains, each starting with 'fresh' (unchained) complex roots:
+ 		#if (LOACC == 0)
++			#warning LOACC = 0
+ 			#define NFOLD (const int)0
+ 		#elif (LOACC == 1)
++			#warning LOACC = 1
+ 			#define NFOLD (const int)1
+ 		#elif (LOACC == 2)
++			#warning LOACC = 2
+ 			#define NFOLD (const int)2
+ 		#elif (LOACC == 3)
++			#warning LOACC = 3
+ 			#define NFOLD (const int)3
+ 		#elif (LOACC == 4)
++			#warning LOACC = 4
+ 			#define NFOLD (const int)4
+ 		#elif (LOACC == 5)
++			#warning LOACC = 5
+ 			#define NFOLD (const int)5
+ 		#else
+ 			#error If LOACC defined for build of radix960_ditN_cy_dif1.c, must be given value 0,1,2,3,4 or 5!
+@@ -650,7 +656,7 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 				k3 = icycle[kc];
+ 				k4 = icycle[lc];
+ 				// Each AVX carry macro call also processes 4 prefetches of main-array data
+-				tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++				tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+ 																			/* vvvvvvvvvvvvvvv [1,2,3]*ODD_RADIX; assumed << l2_sz_vd on input: */
+ 				SSE2_fermat_carry_norm_errcheck_X4_loacc(tm0,tmp,tm1,0x1e00, 0x1e0,0x3c0,0x5a0, half_arr,sign_mask,k1,k2,k3,k4,k5,k6,k7, tm2,p1,p2,p3);
+ 				tm0 += 8; tm1++;
+@@ -681,15 +687,15 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 		ic = 0; jc = 1;
+ 		tm1 = s1p00; tmp = cy_r;	// <*** Again rely on contiguity of cy_r,i here ***
+ 		l = ODD_RADIX;	// Need to stick this #def into an intvar to work around [error: invalid lvalue in asm input for constraint 'm']
+-		while(tm1 < x00) {
++		while((int)(tmp-cy_r) < RADIX) {
+ 			//See "Sep 2014" note in 32-bit SSE2 version of this code below
+ 			k1 = icycle[ic];
+ 			k2 = jcycle[ic];
+ 			k3 = icycle[jc];
+ 			k4 = jcycle[jc];
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+-			tm2 += (-((int)(tm1-cy_r)&0x1)) & p2;	// Base-addr incr by extra p2 on odd-index passes
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tmp-cy_r)>>2]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 += (-((int)((tmp-cy_r)>>1)&0x1)) & p2;	// Base-addr incr by extra p2 on odd-index passes
+ 			SSE2_fermat_carry_norm_errcheck_X2(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,l,half_arr,sign_mask,add1,add2,k1,k2,k3,k4, tm2,p1);
+ 			tm1 += 4; tmp += 2;
+ 			MOD_ADD32(ic, 2, ODD_RADIX, ic);
+@@ -703,15 +709,15 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 		tm1 = s1p00; tmp = cy_r;	// <*** Again rely on contiguity of cy_r,i here ***
+ 		// Need to stick this #def into an intvar to work around [error: invalid lvalue in asm input for constraint 'm']
+ 		l = ODD_RADIX << 4;	// 32-bit version needs preshifted << 4 input value
+-		while(tm1 < x00) {
++		while((int)(tmp-cy_r) < RADIX) {
+ 			//Sep 2014: Even with reduced-register version of the 32-bit Fermat-mod carry macro,
+ 			// GCC runs out of registers on this one, without some playing-around-with-alternate code-sequences ...
+ 			// Pulling the array-refs out of the carry-macro call like so solves the problem:
+ 			k1 = icycle[ic];
+ 			k2 = jcycle[ic];
+ 			// Each SSE2 carry macro call also processes 2 prefetches of main-array data
+-			tm2 = a + j1 + pfetch_dist + poff[(int)(tm1-cy_r)];	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
+-			tm2 += plo[(int)(tm1-cy_r)&0x3];	// Added offset cycles among p0,1,2,3
++			tm2 = (vec_dbl *)(a + j1 + pfetch_dist + poff[(int)(tmp-cy_r)>>2]);	// poff[] = p0,4,8,...; (tm1-cy_r) acts as a linear loop index running from 0,...,RADIX-1 here.
++			tm2 += p1*((int)(tmp-cy_r)&0x3);	// Added offset cycles among p0,1,2,3
+ 			SSE2_fermat_carry_norm_errcheck(tm1,tmp,NRT_BITS,NRTM1,idx_offset,idx_incr,l,half_arr,sign_mask,add1,add2,k1,k2, tm2);
+ 			tm1 += 2; tmp++;
+ 			MOD_ADD32(ic, 1, ODD_RADIX, ic);
+@@ -982,7 +988,7 @@ for(k=1; k <= khi; k++)	/* Do n/(radix(1)*nwt) outer loop executions...	*/
+ 		// the leading pow2-shift arg = trailz(N) - trailz(64) = 0:
+ 			SSE2_RADIX_64_DIF( FALSE, thr_id,
+ 				0,
+-				tmp,t_offsets,
++				(double *)tmp,t_offsets,
+ 				s1p00,	// tmp-storage
+ 				a+jt,dif_o_offsets
+ 			); tmp += 2;
+-- 
+2.12.2
+
diff -Nru mlucas-14.1/debian/patches/0001-Split-big-test-into-smaller-ones.patch mlucas-14.1/debian/patches/0001-Split-big-test-into-smaller-ones.patch
--- mlucas-14.1/debian/patches/0001-Split-big-test-into-smaller-ones.patch	1970-01-01 08:00:00.000000000 +0800
+++ mlucas-14.1/debian/patches/0001-Split-big-test-into-smaller-ones.patch	2017-04-24 16:16:28.000000000 +0800
@@ -0,0 +1,33 @@
+From 35e426b2718af92558df61718f405c69e03bf10d Mon Sep 17 00:00:00 2001
+From: Alex Vong <alexvong1...@gmail.com>
+Date: Mon, 24 Apr 2017 14:09:01 +0800
+Subject: [PATCH] Split big test into smaller ones.
+
+Description: Split big test into smaller ones to avoid exhausting
+ system resources. This fix is inspired by that of
+ https://bugs.debian.org/860664.
+Bug-Debian: https://bugs.debian.org/860662
+Forwarded: yes
+Author: Alex Vong <alexvong1...@gmail.com>
+
+* scripts/self_test.test: Split big test.
+---
+ scripts/self_test.test | 10 ++++++++--
+ 1 file changed, 8 insertions(+), 2 deletions(-)
+
+--- a/scripts/self_test.test
++++ b/scripts/self_test.test
+@@ -29,5 +29,11 @@
+ # Export MLUCAS_PATH so that mlucas.cfg stays in the build directory
+ export MLUCAS_PATH
+ 
+-# Do self-test
+-exec "$MLUCAS_PATH"mlucas -s m
++# List of `medium' exponents
++exponent_ls='20000047 22442237 24878401 27309229 29735137 32156581 34573867 36987271 39397201 44207087 49005071 53792327 58569809 63338459 68098843 72851621 77597293 87068977 96517019 105943723 115351063 124740697 134113933 143472073'
++
++# Run self-test on `medium' exponents
++for exponent in $exponent_ls
++do
++    "$MLUCAS_PATH"mlucas -m "$exponent" -iters 100
++done
diff -Nru mlucas-14.1/debian/patches/series mlucas-14.1/debian/patches/series
--- mlucas-14.1/debian/patches/series	2015-08-28 03:58:09.000000000 +0800
+++ mlucas-14.1/debian/patches/series	2017-04-24 16:16:28.000000000 +0800
@@ -1 +1,3 @@
 0001-Add-copyright-info-of-generated-files.patch
+0001-Split-big-test-into-smaller-ones.patch
+0001-fixes-undefined-behaviour.patch
diff -Nru mlucas-14.1/debian/README.Debian mlucas-14.1/debian/README.Debian
--- mlucas-14.1/debian/README.Debian	2015-08-27 22:53:38.000000000 +0800
+++ mlucas-14.1/debian/README.Debian	2017-04-24 16:16:28.000000000 +0800
@@ -13,6 +13,14 @@
 flag. However, the parser will not reject unsupported arguments. Using
 unsupported arguments for -iters flag may trigger strange behaviour.
 
+On systems with limited resources, the self-test for medium exponents
+'mlucas -s m' may fail with 'pthread_create:: Cannot allocate memory'. See
+<https://bugs.debian.org/860662> for details. The current fix is to run the
+self-test on each exponent one at a time instead. However, this is
+unsatisfactory since it does not prevent the user from running the self-test
+for medium exponents and getting an error.
+
 See BUGS section in mlucas(1) for details.
 
+ -- Alex Vong <alexvong1...@gmail.com>  Thu, 27 Aug 2017 22:04:58 +0800
  -- Alex Vong <alexvong1...@gmail.com>  Thu, 27 Aug 2015 22:04:58 +0800
Feel free to ask for more details.

Cheers,
Alex

unblock mlucas/14.1-2

-- System Information:
Debian Release: 9.0
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 4.9.0-2-amd64 (SMP w/2 CPU cores)
Locale: LANG=zh_TW.UTF-8, LC_CTYPE=zh_TW.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
