0458 ! Where I left off yesterday, after giving complex phase to my square wave, the matrix looks better, but there are still components that aren't conjugates.
(Pdb) p inverse_mat[:,6] array([-0.0625-0.0625j, 0.0625-0.0625j, -0.0625+0.0625j, -0.0625-0.0625j, 0.0625-0.0625j, -0.0625+0.0625j, -0.0625-0.0625j, 0.0625+0.0625j, -0.0625-0.0625j, 0.0625-0.0625j, -0.0625+0.0625j, -0.0625-0.0625j, 0.0625-0.0625j, -0.0625+0.0625j, -0.0625-0.0625j, 0.0625+0.0625j]) (Pdb) p freqs array([ 0. , 0.0625, 0.125 , 0.1875, 0.25 , 0.3125, 0.375 , 0.4375, -0.5 , -0.4375, -0.375 , -0.3125, -0.25 , -0.1875, -0.125 , -0.0625]) It looks to me like the first component that isn't a conjugate when negated is frequency 0.25 . 0.25 makes 0.0625-0.0625j, whereas -0.25 makes 0.0625-0.0625j: the two values are the same. I've likely made this situation by having a function that doesn't produce the property. The 0-offset sample shows it much easier: (Pdb) p inverse_mat[:,0] array([-0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j]) It looks like it has no conjugates at all. (Pdb) p create_freq2time(time_count, freqs, SINE)[:,0] array([0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j]) (Pdb) p create_freq2time(time_count, freqs, SQUARE)[:,0] array([-0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j]) My square wave never outputs a 0 value, so it is impossible for it to match its complex conjugate at time=0, which has no negative. This likely happens again at time=1, time=2, etc. (Pdb) p complex_wavelet(SQUARE, np.array([0,0,1,-1,2,-2])) array([-1.-1.j, -1.-1.j, -1.-1.j, -1.-1.j, -1.-1.j, -1.-1.j]) They're all equal, with no conjugates at all! Thinking about this a little. - i could change the wave so its minimum is 0, but then it would not make negative conjugates unless i also negated the whole wave when negative - i could make the wave a step wave that touches 0, 1, and -1, but then it wouldn't be square any more - i could also interpret the whole transform a little differently, like i did to make the matrix supporting off-frequencies in the first place, to facilitate this I'm thinking that negating the wave when it is negative would match the behavior of sine. I'm accumulating the properties of the cosine wave that let it get replaced by other functions. - even i.e. f(x) == f(-x) - zero at 90 degrees f(0.25) == 0 - negative at 180 degrees f(0.5) == -f(0) Given my current complex wavelet function is taking the modulus of x to be near zero, if I made an odd-function square wave that had a base of 0 at 0 and hit +1 at t>0 and -1 at t<0, this could be considered equivalent to a 3-valued step wave, in some way, since x would loop, or I could stop looping x. 0545 Instead of a square wave, I'm just using STEP = lambda x: COSINE(x).round() . This significantly reduces the changes I need to comprehend. I'm thinking what's notable for the real-domain forward matrix, is the frequencies need to be reconstructible from only half of them. At the moment, that means that the second half _of the actual frequency data_ must be the reversed complex conjugates of the first half. A simpler approach to this could have been to not use complex numbers at all, and instead break data into two separate values. Then it would be obvious how to engage real-domain data: you'd just remove the complex data. 0557 The assertion is still failing, but the frequencies look correct: (Pdb) p (randvec @ create_time2freq(freqs=fftfreq(complex = True, repetition_samples = 16), wavelet=STEP)).round(2) array([ 9.13-0.j , -1.17+0.32j, 0.24-0.7j , -0.43-1.37j, 0.6 -0.62j, 0.16-0.9j , 0.8 +1.07j, -0.22+0.21j, -0.32+0.j , -0.22-0.21j, 0.8 -1.07j, 0.16+0.9j , 0.6 +0.62j, -0.43+1.37j, 0.24+0.7j , -1.17-0.32j]) (Pdb) p (randvec @ unextracting_ft).round(2) array([ 9.13-0.j , -1.17+0.32j, 0.24-0.7j , -0.43-1.37j, 0.6 -0.62j, 0.16-0.9j , 0.8 +1.07j, -0.22+0.21j, -0.32-0.j ]) The problem now must be in the inverse transform matrix, which is simpler. For some reason the real frequency data is still retaining imaginary components: (Pdb) p ((randvec @ unextracting_ft) @ extracting_ift).round(2) array([0.55-0.25j, 0.4 -0.21j, 0.34+0.43j, 0.74+0.28j, 0.5 -0.12j, 0.59+0.16j, 0.61-0.27j, 0.64-0.24j, 0.96+0.19j, 0.64-0.24j, 0.61-0.27j, 0.59+0.16j, 0.5 -0.12j, 0.74+0.28j, 0.34+0.43j, 0.4 -0.21j]) (Pdb) p randvec.round(2) array([0.55, 0.72, 0.6 , 0.54, 0.42, 0.65, 0.44, 0.89, 0.96, 0.38, 0.79, 0.53, 0.57, 0.93, 0.07, 0.09]) 0614 I'm here; it's a small and closed system: 126 if not is_complex: 127 has_nyquist = bool(freqs[-1] == 0.5) 128 head_idx = 1 if has_dc_offset else 0 129 tail_idx = len(freqs) - (1 if has_nyquist else 0) 130 131 # todo: remove test data 132 full_mat = complex_wavelet(wavelet, np.outer(np.concatenate([freqs, -freqs[head_idx:tail_idx][::-1]]), offsets)) 133 time_data = np.random.random(full_mat.shape[1]) 134 extended_freq_data = time_data @ np.linalg.inv(full_mat) 135 freq_data = extended_freq_data[:len(freqs)] (Pdb) 136 import pdb; pdb.set_trace() 137 138 mat[head_idx:tail_idx,:] += complex_wavelet(wavelet, np.outer(-freqs[head_idx:tail_idx], offsets)) 139 -> mat = mat.real full_mat @ extended_freq_data == time_data correctly. However, mat @ freq_data adds imaginary components to the output. They are no longer canceling. I don't yet understand why this is, given the same parts (-freqs[head_idx,tail_idx]) are summed into the output. 0622 So, like before, I imagine a thing to do is drilling down into the matrix construction. I'll consider sample 6. 0.46147936225293185 (Pdb) p full_mat[:,6].round(3) array([ 1. -0.j , 0.223+0.975j, -0.901+0.434j, -0.623-0.782j, 0.623-0.782j, 0.901+0.434j, -0.223+0.975j, -1. +0.j , -0.223-0.975j, 0.901-0.434j, 0.623+0.782j, -0.623+0.782j, -0.901-0.434j, 0.223-0.975j, 1. -0.j , 0.223+0.975j, -0.901+0.434j, -0.623-0.782j, 0.623-0.782j, 0.901+0.434j, -0.223+0.975j, -1. +0.j , -0.223-0.975j, 0.901-0.434j, 0.623+0.782j, -0.623+0.782j, -0.901-0.434j, 0.223-0.975j]) (Pdb) p (extended_freq_data @ full_mat[:,6]).round(3) (0.461-0j) (Pdb) p time_data[6].round(3) 0.461 0635 So. Every item in that column #6, is multiplied by every frequency, and they are summed. As far as I know, that's what the code is already reproducing, so there is something I'm conceiving wrongly going on here (as is often the case). I'll start by checking the top portion. The mat that's failing is made with these 2 lines: 125 mat = complex_wavelet(wavelet, np.outer(freqs, offsets)) 138 mat[head_idx:tail_idx,:] += complex_wavelet(wavelet, np.outer(-freqs[head_idx:tail_idx], offsets)) I'll check the top portion first. 0643 I failed to negate the nyquist frequency and found that the output differed because of this: (Pdb) p (extended_freq_data[:len(freqs)] @ full_mat[:len(freqs),6]).round(3) (0.458+0.004j) (Pdb) p (extended_freq_data[:len(freqs)] @ np.outer(freqs, offsets)[:len(freqs),6]).round(3) (-0.494-0.42j) (Pdb) p (extended_freq_data[:len(freqs)-1] @ np.outer(freqs, offsets)[:len(freqs)-1,6]).round(3) (-0.209-0.42j) (Pdb) p (extended_freq_data[:len(freqs)-1] @ np.outer(freqs, offsets)[:len(freqs)-1,6]).round(3) (-0.209-0.42j) Inspecting the source code, it looks like I am also failing to negate the nyquist frequency there, too. I tried inverting it and it still does not pass. 0648
# AGPL-3 Karl Semich 2022 import numpy as np # At the moment, wavelets must be: # - Zero at zero f(0) == 0 # - Even functions f(x) == f(-x) # - Odd functions when slid by 90 degrees f(0.25 - x) == f(0.25 + x) # Presently x is passed in the modular range [0,1) but this is arbitrary, just remove it if helpful COSINE = lambda x: np.cos(2 * np.pi * x) STEP = lambda x: COSINE(x).round() def complex_wavelet(wavelet, x): return wavelet(x % 1) + wavelet((x - 0.25) % 1) * 1j def fftfreq(freq_count = None, sample_rate = None, min_freq = None, max_freq = None, dc_offset = True, complex = True, sample_time = None, repetition_time = None, repetition_samples = None, freq_sample_rate = None, freq_sample_time = None): ''' Calculates and returns the frequency bins used to convert between the time and frequency domains with a discrete Fourier transform. With no optional arguments, this function should be equivalent to numpy.fft.fftfreq . To specify frequencies in terms of repetitions / sample_count, set sample_rate to sample_count. If frequencies were in Hz (cycles/sec), this implies that sample_count should be considered to last 1 second, so the frequency becomes equal to the cycle count. Parameters: - freq_count: the number of frequency bins to generate - sample_rate: the time-domain sample rate (default: 1 sample/time_unit) - min_freq: the minimum frequency the signal contains - max_fraq: the maximum frequency the signal contains - dc_offset: whether to include a DC offset component (0 Hz) - complex: whether to generate negative frequencies - sample_time: sample_rate as the duration of a single sample - repetition_time: min_freq as a period time, possibly of a subsignal - repetition_samples: min_freq as a period size in samples - freq_sample_rate: convert to or from a different frequency-domain sample rate - freq_sample_time: freq_sample_rate as the duration of a single sample Returns a vector of sinusoid time scalings that can be used to perform or analyse a discrete Fourier transform. Multiply this vector by the time-domain sample rate to get the frequencies. ''' assert not sample_time or not sample_rate assert not freq_sample_time or not freq_sample_rate assert (min_freq, repetition_time, repetition_samples).count(None) >= 2 assert freq_count or min_freq or repetition_time or repetition_samples if sample_time is not None: sample_rate = 1 / sample_time if freq_sample_time is not None: freq_sample_rate = 1 / freq_sample_time sample_rate = sample_rate or freq_sample_rate or 1 freq_sample_rate = freq_sample_rate or sample_rate or 1 if repetition_time: min_freq = 1 / repetition_time elif repetition_samples: min_freq = freq_sample_rate / repetition_samples # here freq_count adopts the complex + DC offset value, for calculations # it might be clearer to use e.g. sample_count, at least consolidate the cases with lower if freq_count is None: freq_count = int(np.ceil(freq_sample_rate / min_freq)) else: if not dc_offset: freq_count += 1 if not complex: freq_count = int((freq_count - 1) * 2) if not min_freq: min_freq = freq_sample_rate / freq_count if not max_freq: #max_freq = freq_sample_rate / 2 max_freq = freq_count * min_freq / 2 if freq_count % 2 != 0: #max_freq -= freq_sample_rate / (2 * freq_count) max_freq -= min_freq / 2 min_freq /= sample_rate max_freq /= sample_rate if freq_count % 2 == 0: if complex: neg_freqs = np.linspace(-max_freq, -min_freq, num=freq_count // 2, endpoint=True) pos_freqs = -neg_freqs[:0:-1] else: pos_freqs = np.linspace(min_freq, max_freq, num=freq_count // 2, endpoint=True) neg_freqs = pos_freqs[:0] else: pos_freqs = np.linspace(min_freq, max_freq, num=freq_count // 2, endpoint=True) neg_freqs = -pos_freqs[::-1] if complex else pos_freqs[:0] return np.concatenate([ np.array([0] if dc_offset else []), pos_freqs, neg_freqs ]) def create_freq2time(time_count = None, freqs = None, wavelet = COSINE): ''' Creates a matrix that will perform an inverse discrete Fourier transform when it post-multiplies a vector of complex frequency magnitudes. Example time_data = spectrum @ create_freq2time(len(spectrum)) Parameters: - time_count: size of the output vector, defaults to the frequency bincount - freqs: frequency bins to convert, defaults to traditional FFT frequencies Returns: - an inverse discrete Fourier matrix of shape (len(freqs), time_count) ''' assert (time_count is not None) or (freqs is not None) if freqs is None: freqs = fftfreq(time_count) is_complex = (freqs < 0).any() has_dc_offset = bool(freqs[0] == 0) freq_count = len(freqs) if not has_dc_offset: freq_count += 1 if not is_complex: freq_count = (freq_count - 1) * 2 if time_count is None: time_count = freq_count offsets = np.arange(time_count) if not is_complex: has_nyquist = bool(freqs[-1] == 0.5) head_idx = 1 if has_dc_offset else 0 tail_idx = len(freqs) - (1 if has_nyquist else 0) mat = complex_wavelet(wavelet, np.outer(np.concatenate([freqs[:tail_idx], -freqs[tail_idx:]]), offsets)) # todo: remove test data full_mat = complex_wavelet(wavelet, np.outer(np.concatenate([freqs[:tail_idx], -freqs[tail_idx:], -freqs[head_idx:tail_idx][::-1]]), offsets)) time_data = np.random.random(full_mat.shape[1]) extended_freq_data = time_data @ np.linalg.inv(full_mat) freq_data = extended_freq_data[:len(freqs)] import pdb; pdb.set_trace() mat[head_idx:tail_idx,:] += complex_wavelet(wavelet, np.outer(-freqs[head_idx:tail_idx], offsets)) mat = mat.real else: mat = complex_wavelet(wavelet, np.outer(freqs, offsets)) return mat / freq_count # scaled to match numpy convention def create_time2freq(time_count = None, freqs = None, wavelet = COSINE): ''' Creates a matrix that will perform a forward discrete Fourier transform when it post-multiplies a vector of time series data. If time_count is too small or large, the minimal least squares solution over all the data passed will be produced. This function is equivalent to calling .pinv() on the return value of create_freq2time. If the return value is single-use, it is more efficient and accurate to use numpy.linalg.lstsq . Example spectrum = time_data @ create_time2freq(len(time_data)) Parameters: - time_count: size of the input vector, defaults to the frequency bincount - freqs: frequency bins to produce, defaults to traditional FFT frequencies Returns: - a discrete Fourier matrix of shape (time_count, len(freqs)) ''' assert (time_count is not None) or (freqs is not None) if freqs is None: freqs = fftfreq(time_count) is_complex = (freqs < 0).any() # TODO: provide for DC offsets and nyquists not at index 0 by excluding items in (0,0.5) has_dc_offset = bool(freqs[0] == 0) has_nyquist = bool(freqs[-1] == 0.5) head_idx = 1 if has_dc_offset else 0 tail_idx = len(freqs)-1 if has_nyquist else len(freqs) neg_start_idx = len(freqs) if not is_complex: freqs = np.concatenate([ freqs[:head_idx], freqs[head_idx:tail_idx], -freqs[tail_idx:neg_start_idx], -freqs[head_idx:tail_idx][::-1] ]) inverse_mat = create_freq2time(time_count, freqs, wavelet) forward_mat = np.linalg.pinv(inverse_mat) if not is_complex: forward_mat = np.concatenate([ forward_mat[:,:tail_idx], forward_mat[:,tail_idx:neg_start_idx].conj() ], axis=1) return forward_mat def peak_pair_idcs(freq_data, dc_offset=True, sum=True): dc_offset = int(dc_offset) freq_heights = abs(freq_data) # squares and sums the components if sum: while len(freq_heights).shape > 1: freq_heights = freq_height.sum(axis=0) paired_heights = freq_heights[...,dc_offset:-1] + freq_heights[...,dc_offset+1:] peak_idx = paired_heights.argmax(axis=-1, keepdims=True) + dc_offset return np.concatenate(peak_idx, peak_idx + 1, axis=-1) def improve_peak(time_data, min_freq, max_freq): freqs = fftfreq(time_data.shape[-1], min_freq = min_freq, max_freq = max_freq) freq_data = time_data @ np.linalg.inv(create_freq2time(freqs)) left, right = peak_pair_idcs(freq_data) return freq_data[left], freq_data[right] def test(): np.random.seed(0) randvec = np.random.random(16) freqs16 = fftfreq(16) ift16 = create_freq2time(16, freqs=freqs16) ft16 = create_time2freq(16) randvec2time = randvec @ ift16 randvec2freq = randvec @ ft16 randvec2ifft = np.fft.ifft(randvec) randvec2fft = np.fft.fft(randvec) assert np.allclose(freqs16, np.fft.fftfreq(16)) assert np.allclose(randvec2ifft, randvec2time) assert np.allclose(randvec2fft, randvec2freq) assert np.allclose(randvec2ifft, np.linalg.solve(ft16.T, randvec)) assert np.allclose(randvec2fft, np.linalg.solve(ift16.T, randvec)) rfreqs15t = fftfreq(repetition_samples=15, complex=False) rfreqs15f = fftfreq(15, complex=False) irft15 = create_freq2time(freqs=rfreqs15f) rft15 = create_time2freq(15, freqs=rfreqs15t) randvec2rtime15 = randvec[:15] @ irft15 randvec2rfreq15 = randvec[:15] @ rft15 randvec2irfft = np.fft.irfft(randvec[:15]) randvec2rfft = np.fft.rfft(randvec[:15]) assert np.allclose(rfreqs15t, np.fft.rfftfreq(15)) assert np.allclose(rfreqs15f, np.fft.rfftfreq(28)) assert np.allclose(randvec2rtime15, randvec2irfft) assert np.allclose(randvec2rfreq15, randvec2rfft) rfreqs16t = fftfreq(repetition_samples=16, complex=False) rfreqs16f = fftfreq(16, complex=False) irft16 = create_freq2time(freqs=rfreqs16f) rft16 = create_time2freq(16, freqs=rfreqs16t) randvec2rtime16 = randvec[:16] @ irft16 randvec2rfreq16 = randvec[:16] @ rft16 randvec2irfft = np.fft.irfft(randvec[:16]) randvec2rfft = np.fft.rfft(randvec[:16]) assert np.allclose(rfreqs16t, np.fft.rfftfreq(16)) assert np.allclose(rfreqs16f, np.fft.rfftfreq(30)) assert np.allclose(randvec2rtime16, randvec2irfft) assert np.allclose(randvec2rfreq16, randvec2rfft) # [ 0, 1, 2, 3, 4, 3, 2, 1] #irft9_16 = create_freq2time(16, fftfreq(9, complex = False)) #rft16_30 = create_time2freq(16, fftfreq(30, complex = False)) #randvec2 #assert np.allclose((randvec @ rft16) @ irft16, randvec) # sample data at a differing rate time_rate = np.random.random() * 2 freq_rate = 1.0 freqs = np.fft.fftfreq(len(randvec)) rescaling_freqs = fftfreq(len(randvec), freq_sample_rate = freq_rate, sample_rate = time_rate) rescaling_ift = create_freq2time(freqs = rescaling_freqs) rescaling_ft = create_time2freq(freqs = rescaling_freqs) rescaled_time_data = np.array([ np.mean([ randvec[freqidx] * np.exp(2j * np.pi * freqs[freqidx] * sampleidx / time_rate) for freqidx in range(len(randvec)) ]) for sampleidx in range(len(randvec)) ]) assert np.allclose(rescaled_time_data, randvec @ rescaling_ift) assert np.allclose(rescaled_time_data, np.linalg.solve(rescaling_ft.T, randvec)) unscaled_freq_data = rescaled_time_data @ rescaling_ft unscaled_time_data = unscaled_freq_data @ ift16 assert np.allclose(unscaled_freq_data, randvec) assert np.allclose(unscaled_time_data, randvec2time) assert np.allclose(np.linalg.solve(rescaling_ift.T, rescaled_time_data), randvec) # extract a repeating wave longvec = np.empty(len(randvec)*100) short_count = np.random.random() * 4 + 1 short_duration = len(longvec) / short_count for idx in range(len(longvec)): longvec[idx] = randvec[int((idx / len(longvec) * short_duration) % len(randvec))] shortspace_freqs = fftfreq(None, complex = False, dc_offset = True, repetition_samples = len(randvec)) longspace_freqs = fftfreq(len(shortspace_freqs), complex = False, dc_offset = True, repetition_samples = short_duration) assert np.allclose(longspace_freqs, shortspace_freqs / (short_duration / len(randvec))) inserting_ift = create_freq2time(len(longvec), longspace_freqs, wavelet = STEP) extracting_ft = create_time2freq(len(longvec), longspace_freqs, wavelet = STEP) extracting_ift = create_freq2time(freqs = shortspace_freqs, wavelet = STEP) unextracting_ft = create_time2freq(freqs = shortspace_freqs, wavelet = STEP) inserting_spectrum = randvec @ unextracting_ft assert np.allclose(inserting_spectrum @ extracting_ift, randvec) assert np.allclose(longvec, inserting_spectrum @ inserting_ift) extracted_freqs = longvec @ extracting_ft extracted_randvec = extracted_freqs @ extracting_ift assert np.allclose(extracted_randvec, randvec) if __name__ == '__main__': test()