On 7/15/19 12:35 PM, Chip Wachob wrote:
> Oscar and Mats,
>
> Thank you for your comments and taking time to look at the snips.
>
> Yes, I think I had commented that the avg+trigger was = triggervolts in
> my original post.
>
> I did find that there was an intermediary process which I had forgotten
> to comment out that was adversely affecting the data in one instance and
> not the other.  So it WAS a case of becoming code blind.  But I didn't
> give y'all all of the code so you would not have known that.  My apologies.
>
> Mats, I'd like to get a better handle on your suggestions about
> improving the code.  Turns out, I've got another couple of 4GByte files
> to sift through, and they are less 'friendly' when it comes to
> determining the start and stop points.  So, I have to basically redo
> about half of my code and I'd like to improve on my Python coding skills.
>
> Unfortunately, I have gaps in my coding time, and I end up forgetting
> the details of a particular language, especially a new language to me,
> Python.
>
> I'll admit that my 'C' background keeps me thinking of these data sets
> as arrays.. in fact they are lists, eg:
>
> [
>     [t0, v0],
>     [t1, v1],
>     [t2, v2],
>     .
>     .
>     .
>     [tn, vn]
> ]
>
> Time and volts are floats and need to be converted from the csv file
> entries.
>
> I'm not sure that I follow the "unpack" assignment in your example of:
>
> for row in TrigWind:
>     time, voltage = row  # unpack
>
> I think I 'see' what is happening, but when I read up on unpacking, I
> see that referring to using the * and ** when passing arguments to a
> function...

That's a different aspect of unpacking.  This one is sequence unpacking,
sometimes called tuple (or sequence) assignment.  In the official Python
docs it is described in the latter part of this section:

https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences

> I tried it anyhow, with this being an example of my source data:
>
> "Record Length",2000002,"Points",-0.005640001706,1.6363
> "Sample Interval",5e-09,s,-0.005639996706,1.65291
> "Trigger Point",1128000,"Samples",-0.005639991706,1.65291
> "Trigger Time",0.341197,s,-0.005639986706,1.60309
> ,,,-0.005639981706,1.60309
> "Horizontal Offset",-0.00564,s,-0.005639976706,1.6363
> ,,,-0.005639971706,1.65291
> ,,,-0.005639966706,1.65291
> ,,,-0.005639961706,1.6363
> .
> .
> .
>
> Note that I want the items in the third and fourth column of the csv
> file for my time and voltage.
>
> When I tried to use the unpack, they all came over as strings.  I can't
> seem to convert them selectively..

That's what the csv module does, unless you tell it not to.  Maybe this
will help:

https://docs.python.org/3/library/csv.html#csv.reader

There's an option (quoting=csv.QUOTE_NONNUMERIC) to convert unquoted
values to floats, and leave quoted values alone as strings, which would
seem to match your data above fairly well, apart from the unquoted "s"
unit fields, which it would also try to convert.

> Desc1, Val1, Desc2, TimeVal, VoltVal = row
>
> TimeVal and VoltVal return type of str, which makes sense.
>
> Must I go through yet another iteration of scanning TimeVal and VoltVal
> and converting them using float() by saving them to another array?
>
>
> Thanks for your patience.
>
> Chip
>
>
> On Sat, Jul 13, 2019 at 9:36 AM Mats Wichmann <m...@wichmann.us
> <mailto:m...@wichmann.us>> wrote:
>
>     On 7/11/19 8:15 AM, Chip Wachob wrote:
>
>     kinda restating what Oscar said, he came to the same conclusions, I'm
>     just being a lot more wordy:
>
>
>     > So, here's where it gets interesting.  And, I'm presuming that someone out
>     > there knows exactly what is going on and can help me get past this hurdle.
>
>     Well, each snippet has some "magic" variables (from our point of view,
>     since we don't see where they are set up):
>
>     1: if(voltage > (avg + triglevel)
>
>     2: if((voltage > triggervolts)
>
>     since the value you're comparing voltage to gates when you decide
>     there's a transition, and thus what gets added to the transition list
>     you're building, and the list size comes out different, and you claim
>     the data are the same, then guess where a process of elimination
>     suggests the difference is coming from?
>
>     ===
>
>     Stylistic comment, I know this wasn't your question.
>
>     > for row in range (len(TrigWind)):
>
>     Don't do this.  It's not a coding error giving you wrong results, but
>     it's not efficient and makes for harder-to-read code.  You already have
>     an iterable in TrigWind.  You then find the size of the iterable and use
>     that size to generate a range object, which you then iterate over,
>     producing index values which you use to index into the original
>     iterable.  Why not skip all that?  Just do
>
>     for row in TrigWind:
>
>     now row is actually a row, as the variable name suggests, rather than an
>     index you use to go retrieve the row.
>
>     Further, the "row" entries in TrigWind are lists (or tuples, or some
>     other indexable iterable, we can't tell), which means you end up
>     indexing into two things - into the "array" to get the row, then into
>     the row to get the individual values.  It's nicer if you unpack the rows
>     into variables so they can have meaningful names - indeed you already do
>     that with one of them.  Lets you avoid code snips like "x[7][1]"
>
>     Conceptually then, you can take this:
>
>     for row in range(len(TrigWind)):
>         voltage = float(TrigWind[row][1])
>         ...
>         edgearray.append([float(TrigWind[row][0]),
>                           float(TrigWind[row][1])])
>         ...
>
>     and change to this:
>
>     for row in TrigWind:
>         time, voltage = row  # unpack
>         ....
>         edgearray.append([float(time), float(voltage)])
>
>     or even more compactly you can unpack directly at the top:
>
>     for time, voltage in TrigWind:
>         ...
>         edgearray.append([float(time), float(voltage)])
>         ...
>
>     Now I left an issue to resolve with conversion - voltage is not
>     converted before its use in the not-shown comparisons.  Does it need to
>     be?  Every usage of the values from the individual rows here uses them
>     immediately after converting them to float.  It's usually better not to
>     convert all over the place, and since the creation of TrigWind is under
>     your own control, you should do that at the point the data enters the
>     program - that is as TrigWind is created; then you just consume data
>     from it in its intended form.  But if not, just convert voltage before
>     using, as your original code does.  You don't then need to convert
>     voltage a second time in the list append statements.
>
>     for time, voltage in TrigWind:
>         voltage = float(voltage)
>         ...
>         edgearray.append([float(time), voltage])
>         ...
>
>
>     _______________________________________________
>     Tutor maillist  -  Tutor@python.org <mailto:Tutor@python.org>
>     To unsubscribe or change subscription options:
>     https://mail.python.org/mailman/listinfo/tutor
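
On the question above about converting TimeVal and VoltVal: here is a
minimal sketch of converting once, as the rows are read, so nothing
downstream needs float() again.  It assumes every row has exactly five
columns, as in the sample data, with time and voltage last.  The helper
name load_trigger_window and the inline SAMPLE text are made up for
illustration.  It converts explicitly rather than relying on
quoting=csv.QUOTE_NONNUMERIC, since that option tries to convert every
unquoted field and the sample has an unquoted s in the units column.

```python
import csv
import io

# A few rows copied from the sample data quoted earlier in this thread.
SAMPLE = '''"Record Length",2000002,"Points",-0.005640001706,1.6363
"Sample Interval",5e-09,s,-0.005639996706,1.65291
,,,-0.005639981706,1.60309
'''


def load_trigger_window(lines):
    """Return [[time, volts], ...] as floats (hypothetical helper name)."""
    trig_wind = []
    for row in csv.reader(lines):
        # Sequence unpacking: five columns in, five names out.
        # Assumes exactly five columns per row; anything else raises ValueError.
        _desc1, _val1, _desc2, time_text, volt_text = row
        # Convert once, at the point the data enters the program.
        trig_wind.append([float(time_text), float(volt_text)])
    return trig_wind


trig_wind = load_trigger_window(io.StringIO(SAMPLE))

# Later loops can unpack each row directly and work with floats.
for time, voltage in trig_wind:
    print(time, voltage)
```

For a real file, pass an open file object instead of the StringIO, opened
with newline='' as the csv docs recommend.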