Re: [Discuss-gnuradio] volk float32->int8 kernel and thus float_to_char block round or floor, depending on VOLK machine

CEL Sat, 09 Jun 2018 12:26:20 -0700

Hi Paul,

I agree with everything you say. Float to char should behave *exactly*
like float to int and short. Will fix it in that way. Will also have a
truncating version, maybe. I've just added 3 lines of code to the
current implementation to demonstrate it's trivial to change the
rounding mode. X87 FPUs are darn mighty things.


This brings me to another aspect: as far as I can tell from this height,
VOLK code absolutely neglects to configure the FPU to a known state. If
the calling thread decides to do a

_MM_SET_ROUNDING_MODE(_MM_ROUND_TOWARD_ZERO);

before calling volk_32f_s32f_convert_8i, all the kernels behave like
the truncate-to-0 generic kernel. That can't be what we want, nor is it
documented.
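
A small sketch of that effect, assuming an x86 target with SSE2. The function names `convert4_sse2` and `convert4_after_caller_truncates` are hypothetical stand-ins, not VOLK kernels; only the intrinsics and the `_MM_*_ROUNDING_MODE` macros are real.

```c
#include <emmintrin.h> /* SSE2: _mm_cvtps_epi32, _mm_storeu_si128 */
#include <stdint.h>
#include <xmmintrin.h> /* SSE: _mm_loadu_ps, _MM_SET_ROUNDING_MODE */

/* Stand-in for a SIMD kernel body: converts four floats to int32.
 * It never touches MXCSR, so it rounds however the caller left it. */
void convert4_sse2(int32_t out[4], const float in[4])
{
    const __m128i r = _mm_cvtps_epi32(_mm_loadu_ps(in));
    _mm_storeu_si128((__m128i*)out, r);
}

/* A caller that switches its thread to truncation before the call. */
void convert4_after_caller_truncates(int32_t out[4], const float in[4])
{
    const unsigned int saved = _MM_GET_ROUNDING_MODE();
    _MM_SET_ROUNDING_MODE(_MM_ROUND_TOWARD_ZERO);
    convert4_sse2(out, in);
    _MM_SET_ROUNDING_MODE(saved); /* put the thread back as we found it */
}
```

Under the default mode, 0.6f converts to 1 through `convert4_sse2`; through the second function the very same kernel code yields 0.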

So, that's a bigger issue with VOLK: on one hand, we'd want to set that
SIMD FPU control register (MXCSR) on every entry into a VOLK kernel
that uses rounding, underflowing or overflowing SIMD intrinsics, to have
defined behaviour. Since we are nice programmers, we don't want to have
surprising side effects, so we'd restore it to its original state on
exit... I see a pipeline flush and a performance bottleneck right there,
but the x86-64 calling convention does expect the callee to preserve the
MXCSR control bits.
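
The save-and-restore idea could be sketched roughly like this, again assuming x86 with SSE2. `convert_32i_pinned` is a hypothetical name, not an existing VOLK kernel, and whether the two extra MXCSR accesses per call are affordable would have to be measured.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2: _mm_cvtps_epi32, _mm_storeu_si128 */
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h> /* SSE: _mm_cvtss_si32, _MM_SET_ROUNDING_MODE */

/* Hypothetical kernel wrapper: pin the MXCSR rounding mode to
 * round-to-nearest for the duration of the call, then hand the
 * caller's mode back unchanged. */
void convert_32i_pinned(int32_t* out, const float* in, size_t n)
{
    assert(out != NULL && in != NULL);

    const unsigned int saved = _MM_GET_ROUNDING_MODE();
    _MM_SET_ROUNDING_MODE(_MM_ROUND_NEAREST);

    size_t i = 0;
    /* Main loop: four samples per iteration. */
    for (; i + 4 <= n; i += 4) {
        const __m128i r = _mm_cvtps_epi32(_mm_loadu_ps(in + i));
        _mm_storeu_si128((__m128i*)(out + i), r);
    }
    /* Tail: the scalar SSE conversion also honours MXCSR, so the
     * leftover samples round the same way as the vector part. */
    for (; i < n; i++) {
        out[i] = _mm_cvtss_si32(_mm_set_ss(in[i]));
    }

    _MM_SET_ROUNDING_MODE(saved); /* no surprising side effect */
}
```

Even if the caller runs with `_MM_ROUND_TOWARD_ZERO`, the output is rounded to nearest, and the caller still finds its own mode in place afterwards.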

Best regards,
Marcus
On Sat, 2018-06-09 at 20:18 +0200, Paul Boven wrote:
> Hi Marcus,
> 
> I would prefer that when going from float to int, every 'bin' should 
> have equal size. So I can think of two ways to do that:
> 
> 1) zero corresponds to [-0.5 : 0.49999999]
> 
> or
> 
> 2) zero corresponds to [0.0 : 0.999999]
> 
> whereas the 'generic' optimization does
> 
> 3) zero corresponds to [-0.999999 : 0.999999]
> 
> The second was actually the behaviour I was expecting, and I was 
> pleasantly surprised when GnuRadio seemed to do the first - but then 
> occasionally it doesn't.
> 
> I just did a quick test in python3, and there, the range of int(x) for
> zero runs [-0.999999 : 0.999999], so I'm expecting most programming
> languages to behave that way.
> 
> So, I guess a programmer would expect the behaviour as in the third 
> case. Someone who is converting radio signals might want either the 
> first or second case, as otherwise you end up with some interesting 
> non-linearities.
> 
> The gnuplot helpfile states: "The `int(x)` function returns the integer
> part of its argument, truncated toward zero."
> 
> But gnuplot also provides functions like 'floor(x)' and 'ceil(x)'.
> 
> So the real question is still, do we want the behaviour of int(x), or
> the behaviour that an analog to digital converter would have?
> 
> Finally, I'd say we want the behaviour to be the same for Int, Short and
> Char. So I ran a few more tests.
> 
> With Volk enabled, Float to Int, Short and Char treat [-0.5 : 0.499999]
> as zero, with the occasional glitch for Char.
> 
> volk_32f_s32f_convert_16i a_avx u_avx
> volk_32f_s32f_convert_32i a_avx u_sse2
> 
> With these conversions for Int and Short also switched to
> 'generic generic', I get the same results:
> 
> Short zero: [-0.5 : 0.499999]
> Int zero:   [-0.5 : 0.499999]
> 
> So, assuming I carried out these tests correctly, the odd one out seems
> to be the generic case for float to char conversion.
> 
> Note that Volk in the 16 bit and 32 bit case uses a function called
> 'rintf' in the conversions. From its manpage:
> 
> "(...) round their argument to an integer value in floating-point
> format".
> 
> So I say this boils down to a bug in the 'generic' float to char case.
> 
> This is not a fix to make lightly, because somebody is going to have 
> their flowchart break because of this. Then again, in the current 
> situation the outcome depends on which optimizations your machine 
> happens to have available, so that's also quite bad.
> 
> Possibly there is also an optimization opportunity of never using 
> 'generic' even at the block edge when acceleration is available by 
> choosing the right size of blocks, but that's probably a small gain.
> 
> Regards, Paul Boven.
> 
> 
> 
> On 06/09/2018 07:18 PM, Müller, Marcus (CEL) wrote:
> > Hi Paul,
> > 
> > yes, this seems to be the case where the "naive" C implementation
> > behaves differently from all the SIMD ones:
> > 
> > As far as I know – but I'm desperately looking for any standards
> > document that specifies that – doing a
> > 
> > int8_t val = (int8_t) 8.8f;
> > 
> > will always lead to 8, whereas
> > 
> > int8_t val = (int8_t) -8.8f;
> > 
> > would always lead to -8.
> > 
> > Now, for the conversion operations used in the SIMD kernels, it depends
> > on specific flags being set in FPU-control registers (MXCSR, it seems).
> > Ummmm. No one set these to give identical results as the native C
> > conversion above, so if I set the tolerance in the kernel unit test to
> > 0 instead of 1 (which it always should have been), I get a whole lot of
> > failures. Great.
> > 
> > Normally, we'd stick with what the generic version of a kernel
> > gives us.
> > 
> > I'd do that here, too. But: that would lead to a non-rounding
> > behaviour... I'm breaking someone's porcelain here, no matter what I
> > do.
> > 
> > Any ideas?
> > 
> > Best regards,
> > Marcus
> > 
> > On Sat, 2018-06-09 at 18:24 +0200, Paul Boven wrote:
> > > Hi Marcus,
> > > 
> > > Just reran the test after editing volk_config, and the result is
> > > somewhat surprising:
> > > 
> > > Every float in [-1:1] now converts to zero. Every float in [1:2] now
> > > converts to 1. Whereas it should be [-0.5:0.5] and [0.5:1.5].
> > > 
> > > It seems that most of the time, the u_sse2 converter is used, but at the
> > > end of each multiple of 8192 bytes, a few are done with the 'generic'
> > > converter - that would match perfectly with the observed behaviour.
> > > 
> > > It was also pointed out to me (on irc, unfortunately no longer in my
> > > history) that it is strange that for some acceleration types, there is a
> > > cast to int16_t instead of int8_t at the end of the routine, e.g.:
> > > 
> > > https://github.com/gnuradio/volk/blob/master/kernels/volk/volk_32f_s32f_convert_8i.h#L200
> > > 
> > > I had not really looked into that before because having run the
> > > volk_profile seemed to make no difference.
> > > 
> > > Regards, Paul Boven.
> > > 
> > > On 06/09/2018 06:08 PM, Müller, Marcus (CEL) wrote:
> > > > > I can reproduce these, but do the errors disappear for you if you
> > > > > replace "u_sse2 u_sse2" with "generic generic" on that line?
> > > > 
> > > > 
> > > > Best regards,
> > > > Marcus
> > > > On Sat, 2018-06-09 at 18:04 +0200, Paul Boven wrote:
> > > > > Hi Marcus,
> > > > > 
> > > > > > This machine did not yet have a volk_config when I ran these tests.
> > > > > > 
> > > > > > I have since run volk_profile and rebooted, and the Float->Char
> > > > > > quantization bug still occurs.
> > > > > 
> > > > > $ volk-config-info --machine
> > > > > avx2_64_mmx_orc
> > > > > 
> > > > > $ grep volk_32f_s32f_convert_8i .volk/volk_config
> > > > > volk_32f_s32f_convert_8i u_sse2 u_sse2
> > > > > 
> > > > > Regards, Paul Boven.
> > > > > 
> > > > > On 06/09/2018 05:30 PM, Müller, Marcus (CEL) wrote:
> > > > > > Hi Paul,
> > > > > > 
> > > > > > > hm, OK, considering the actual conversion is done in VOLK, can you
> > > > > > > tell us
> > > > > > > 
> > > > > > > * whether ~/.volk/volk_config exists (and if so, its contents
> > > > > > >   regarding volk_32f_s32f_convert_8i)
> > > > > > > * what the output of `volk-config-info --machine` is?
> > > > > > 
> > > > > > Thanks,
> > > > > > Marcus
> > > > > > 
> > > > > > On Sat, 2018-06-09 at 17:13 +0200, Paul Boven wrote:
> > > > > > > Hi everyone,
> > > > > > > 
> > > > > > > > I'm trying to perform 2 bit sampling of an RF signal. In one approach,
> > > > > > > > I'm using a float->char block, and noticed that around zero, a number
> > > > > > > > of float inputs become quantized in a bin that's one off from the
> > > > > > > > correct value. The ones that are wrong are always off by one, with
> > > > > > > > their quantization error always in the direction of zero.
> > > > > > > > 
> > > > > > > > The problem can be demonstrated with a really simple flowchart, using
> > > > > > > > the following blocks:
> > > > > > > > 
> > > > > > > > * Noise Source (Noise Type: Gaussian, Amplitude: 1, Seed: 0, Output
> > > > > > > >   type: Float)
> > > > > > > > * Throttle
> > > > > > > > The throttle is then connected to two blocks:
> > > > > > > > * A file-sink (Type Float) and a
> > > > > > > > * 'Float to Char' block.
> > > > > > > > * The float to char block is again connected to a File Sink, now of
> > > > > > > >   type Char.
> > > > > > > > 
> > > > > > > > As an example, I've plotted all the samples that quantized as zero.
> > > > > > > > These should fall in the range [-0.5:0.5], but occasionally we also get
> > > > > > > > hits that lie within [-1:1]. These mishaps are rare (about one in 2000).
> > > > > > > > It also shows that they only occur at multiples of 8192 samples, and
> > > > > > > > zooming in reveals that they always happen shortly before the next
> > > > > > > > multiple of 8192, not after.
> > > > > > > > 
> > > > > > > > For other values than 0, the same applies, but the misquantizations are
> > > > > > > > only in one direction, ending up in a lower bin if the input is
> > > > > > > > positive, or in a higher bin if the input is negative. Again, the
> > > > > > > > misquantizations only occur in half the bin: For a value of 1, the float
> > > > > > > > value should be in [0.5:1.5], but occasionally a value in [1.5:2] also
> > > > > > > > ends up being quantized as 1.
> > > > > > > > 
> > > > > > > > This seems to me a bug that is somehow related to internal block
> > > > > > > > boundaries, but I'm not familiar enough with the internals of GnuRadio
> > > > > > > > to figure out just what's going wrong.
> > > > > > > > 
> > > > > > > > The problem does NOT occur when converting to Short or Int.
> > > > > > > > 
> > > > > > > > This is using GnuRadio 3.7.11 (as packaged with Ubuntu 18.04).
> > > > > > > 
> > > > > > > Regards, Paul Boven.
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > _______________________________________________
> > > > > > > Discuss-gnuradio mailing list
> > > > > > > [email protected]
> > > > > > > https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
> 
>
