On Tue, Feb 18, 2020 at 3:54 PM Waldek Kozaczuk <[email protected]> wrote:
> This -
> https://stackoverflow.com/questions/27476079/why-does-2to3-change-mydict-keys-to-listmydict-keys
> - might explain these "list"-related changes by 2to3.
> Yet it doesn't explain the change

- it explains why it is *not* actually needed ;-) I know why it's needed in
some specific cases where a list is required and .keys() is no longer a
list. But it's not needed here (in a for loop) and in most other locations
where this list() was added :-(

> On Tuesday, February 18, 2020 at 8:42:47 AM UTC-5, Waldek Kozaczuk wrote:
>>
>> Hey,
>>
>> Thanks for the review.
>>
>> Let me first explain what my process was. At first, I just tried to
>> submit Matt's changes with a proper signoff. But then I decided to test
>> them a bit and discovered some things were breaking. Now I do not think it
>> was because of any problems in his original script, but mostly due to the
>> fact that his changes were intertwined with other scripts and I had to
>> change those. Also, modules/openjdk8-from-host/module.py (it was not there
>> yet when Matt made his changes) did not work. And finally, trace.py somehow
>> came along. So here we go.
>>
>> As far as the mechanics go, I retained most of Matt's patch as is, and I
>> believe he used the "future" module. I used the 2to3 tool (more
>> specifically 2to3-2.7 in my case) first and then manually fixed any
>> problems I discovered. The most painful to fix were trace.py and all the
>> related ones (scripts/osv/*.py), which I tested quite thoroughly by
>> running trace.py (I learned more about it, which is great, but probably
>> way more than I wanted to know at this point ;-)). I also tested all the
>> other scripts more implicitly by running scripts/build with some
>> variations of image= (java is good as it exercises a lot) and fs=.
>>
>> But you are right that I could not thoroughly test loader.py (I tried
>> many commands, but not all). Shall we exclude it from this patch?
>>
>> Lastly, I am probably not the right person to do this upgrade exercise.
>> I do not use Python daily, so I am not an expert. The lack of a compiler
>> (aka 'complainer') did not make me very confident, especially with those
>> larger scripts. But I did not want Matt's work to go to waste. So here we
>> go :-)
>>
>> On Tuesday, February 18, 2020 at 5:34:58 AM UTC-5, Nadav Har'El wrote:
>>
>> Thanks. I commented with a few concerns below. I'm only really worried
>> about loader.py - the gdb script, which was already supposed to be working
>> for both Python 2 and Python 3 (although we probably didn't test it
>> recently) - and I'm worried these changes are breaking it rather than
>> improving it. It is also difficult to test, because these changes touch
>> all sorts of "osv" commands in gdb which I guess you didn't test
>> individually.
>>
>> Shall we leave it out?
>>
>>
>> I ask that you please at least try to run the affected scripts in
>> Python 3 before "fixing" them at all. In particular, some scripts that had
>> "/usr/bin/python" (and not /usr/bin/python2) at the top already worked
>> correctly for Python 3 (and Python 2), because some people already had
>> Python 3 as their default python (although it is quite likely that we
>> haven't tested this in a while, so things broke).
>>
>> In most cases, I first ran 2to3 and then applied manual changes if I
>> found any problems (like the bytes-vs-str type of issues - really painful
>> to find and fix). Indeed, 2to3 is somewhat suspicious, as it sometimes put
>> extra parentheses where there were some already (I manually removed those
>> changes). Not sure about the list-related changes - all done by 2to3.
>>
>>
>> --
>> Nadav Har'El
>> [email protected]
>>
>>
>> On Mon, Feb 17, 2020 at 8:22 AM Waldemar Kozaczuk <[email protected]>
>> wrote:
>>
>> --- a/scripts/loader.py
>> +++ b/scripts/loader.py
>>
>>
>> Please make sure to test this file a bit, because it isn't part of the
>> standard build or run.
>> This is the script loaded automatically when you use gdb OSv.
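The bytes-vs-str mismatch mentioned above is usually the painful part of these ports: in Python 3, subprocess pipes and binary file reads return bytes, and string operations on them fail until you decode. A minimal standalone illustration (the echo command is used here just for demonstration, it is not from the patch):

```python
import subprocess

# Python 3 returns bytes from check_output(); mixing bytes with str
# raises TypeError instead of silently working as it did in Python 2.
out = subprocess.check_output(['echo', 'hello'])
assert isinstance(out, bytes)

text = out.decode('utf-8')   # the explicit .decode() the patches add
assert 'hello' in text
```

This is why so many hunks below append `.decode()` or `.decode('utf-8')` to `read()` and `check_output()` results.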
>>
>> Also, I was always under the impression that this script already worked
>> on both Python 2 and Python 3, because we had no idea what gdb runs for
>> us (the "#!" at the top of this script isn't relevant, right?).
>>
>> Shall we change at least this '#!/usr/bin/python2' to '#!/usr/bin/python'
>> then?
>>
>>
>> @@ -1,4 +1,4 @@
>> -#!/usr/bin/python2
>> +#!/usr/bin/python3
>>
>> import gdb
>> import re
>> @@ -37,8 +37,8 @@ def phys_cast(addr, type):
>>
>> def values(_dict):
>> if hasattr(_dict, 'viewvalues'):
>> - return _dict.viewvalues()
>> - return _dict.values()
>> + return _dict.values()
>>
>>
>> This doesn't look right - you check if there is a viewvalues() function
>> and then call values()?
>> Maybe the whole "hasattr" check isn't needed on Python 2 any more?
>>
>> + return list(_dict.values())
>>
>>
>> I wonder why you need to convert it to a list - what's wrong with a
>> generator, which isn't officially a list? Did this solve a real problem,
>> or did some automatic converter suggest it?
>>
>> Changed by 2to3.
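For reference, a dual-version compatibility helper of the kind being discussed would keep both branches, so that neither interpreter builds a throwaway list; this is a sketch of the presumably intended shape, not necessarily what loader.py must end up with:

```python
def values(_dict):
    # Python 2: dict.values() copies into a list; viewvalues() avoids that.
    if hasattr(_dict, 'viewvalues'):
        return _dict.viewvalues()
    # Python 3: values() is already a lazy view, so no list() wrapper needed.
    return _dict.values()

cpu_list = {0: 'cpu0', 1: 'cpu1'}
assert sorted(values(cpu_list)) == ['cpu0', 'cpu1']
```

On Python 3 the returned object is a `dict_values` view, not a list, which is fine for iteration and for `sorted()`.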
>> >> >> >> def read_vector(v): >> impl = v['_M_impl'] >> @@ -426,19 +426,19 @@ class osv_zfs(gdb.Command): >> >> print ("\n:: ARC SIZES ::") >> print ("\tCurrent size: %d (%d MB)" % >> - (arc_size, arc_size / 1024 / 1024)) >> + (arc_size, arc_size // 1024 // 1024)) >> print ("\tTarget size: %d (%d MB)" % >> - (arc_target_size, arc_target_size / 1024 / 1024)) >> + (arc_target_size, arc_target_size // 1024 // 1024)) >> print ("\tMin target size: %d (%d MB)" % >> - (arc_min_size, arc_min_size / 1024 / 1024)) >> + (arc_min_size, arc_min_size // 1024 // 1024)) >> print ("\tMax target size: %d (%d MB)" % >> - (arc_max_size, arc_max_size / 1024 / 1024)) >> + (arc_max_size, arc_max_size // 1024 // 1024)) >> >> print ("\n:: ARC SIZE BREAKDOWN ::") >> print ("\tMost recently used cache size: %d (%d MB) (%.2f%%)" % >> - (arc_mru_size, arc_mru_size / 1024 / 1024, arc_mru_perc)) >> + (arc_mru_size, arc_mru_size // 1024 // 1024, >> arc_mru_perc)) >> print ("\tMost frequently used cache size: %d (%d MB) (%.2f%%)" % >> - (arc_mfu_size, arc_mfu_size / 1024 / 1024, arc_mfu_perc)) >> + (arc_mfu_size, arc_mfu_size // 1024 // 1024, >> arc_mfu_perc)) >> >> # Cache efficiency >> arc_hits = get_stat_by_name(arc_stats_struct, arc_stats_cast, >> 'arcstat_hits') >> @@ -618,7 +618,7 @@ class osv_mmap(gdb.Command): >> end = ulong(vma['_range']['_end']) >> flags = flagstr(ulong(vma['_flags'])) >> perm = permstr(ulong(vma['_perm'])) >> - size = '{:<16}'.format('[%s kB]' % (ulong(end - start)/1024)) >> + size = '{:<16}'.format('[%s kB]' % (ulong(end - >> start)//1024)) >> >> if 'F' in flags: >> file_vma = >> vma.cast(gdb.lookup_type('mmu::file_vma').pointer()) >> @@ -648,7 +648,7 @@ class osv_vma_find(gdb.Command): >> if start <= addr and end > addr: >> flags = flagstr(ulong(vma['_flags'])) >> perm = permstr(ulong(vma['_perm'])) >> - size = '{:<16}'.format('[%s kB]' % (ulong(end - >> start)/1024)) >> + size = '{:<16}'.format('[%s kB]' % (ulong(end - >> start)//1024)) >> print('0x%016x -> vma 
0x%016x' % (addr, vma_addr)) >> print('0x%016x 0x%016x %s flags=%s perm=%s' % >> (start, end, size, flags, perm)) >> break >> @@ -671,7 +671,7 @@ def ulong(x): >> def to_int(gdb_value): >> if hasattr(globals()['__builtins__'], 'long'): >> # For GDB with python2 >> - return long(gdb_value) >> + return int(gdb_value) >> >> >> Again, this change is wrong, was it automated? >> The whole point of this code was to support *both* python 2 >> and python 3, and now we go breaking the old python 2 code, >> but not even removing the if :-) >> >> return int(gdb_value) >> >> class osv_syms(gdb.Command): >> @@ -751,7 +751,7 @@ def get_base_class_offset(gdb_type, base_class_name): >> name_pattern = re.escape(base_class_name) + "(<.*>)?$" >> for field in gdb_type.fields(): >> if field.is_base_class and re.match(name_pattern, field.name): >> - return field.bitpos / 8 >> + return field.bitpos // 8 >> >> def derived_from(type, base_class): >> return len([x for x in type.fields() >> @@ -808,11 +808,8 @@ class intrusive_list: >> yield node_ptr.cast(self.node_type.pointer()).dereference() >> hook = hook['next_'] >> >> - def __nonzero__(self): >> - return self.root['next_'] != self.root.address >> - >> >> >> Any reason to remove this? (I don't know why it was needed, but why >> remove?) >> >> Changed by 2to3. >> >> >> def __bool__(self): >> - return self.__nonzero__() >> + return self.root['next_'] != self.root.address >> >> class vmstate(object): >> def __init__(self): >> @@ -832,7 +829,7 @@ class vmstate(object): >> self.cpu_list = cpu_list >> >> def load_thread_list(self): >> - threads = map(gdb.Value.dereference, >> unordered_map(gdb.lookup_global_symbol('sched::thread_map').value())) >> + threads = list(map(gdb.Value.dereference, >> unordered_map(gdb.lookup_global_symbol('sched::thread_map').value()))) >> >> >> I don't understand why this change is needed... sorted() should work just >> fine on an interable, it doesn't need a list as far as I know. 
>> Did something now work without this change? >> >> Changed by 2to3. >> >> >> self.thread_list = sorted(threads, key=lambda x: int(x["_id"])) >> >> def cpu_from_thread(self, thread): >> @@ -896,7 +893,7 @@ def show_thread_timers(t): >> gdb.write(' timers:') >> for timer in timer_list: >> expired = '*' if timer['_state'] == timer_state_expired else >> '' >> - expiration = int(timer['_time']['__d']['__r']) / 1.0e9 >> + expiration = int(timer['_time']['__d']['__r']) // 1.0e9 >> gdb.write(' %11.9f%s' % (expiration, expired)) >> gdb.write('\n') >> >> @@ -911,7 +908,7 @@ class ResolvedFrame: >> self.frame = frame >> self.file_name = file_name >> self.line = line >> - self.func_name = func_name >> + self.__name__ = func_name >> >> >> Why rename this field?? >> >> Changed by 2to3. >> >> >> >> def traverse_resolved_frames(frame): >> while frame: >> @@ -989,14 +986,14 @@ class osv_info_threads(gdb.Command): >> function_whitelist = [sched_thread_join] >> >> def is_interesting(resolved_frame): >> - is_whitelisted = resolved_frame.func_name in >> function_whitelist >> + is_whitelisted = resolved_frame.__name__ in >> function_whitelist >> is_blacklisted = >> os.path.basename(resolved_frame.file_name) in file_blacklist >> return is_whitelisted or not is_blacklisted >> >> fr = find_or_give_last(is_interesting, >> traverse_resolved_frames(newest_frame)) >> >> if fr: >> - location = '%s at %s:%s' % (fr.func_name, >> strip_dotdot(fr.file_name), fr.line) >> + location = '%s at %s:%s' % (fr.__name__, >> strip_dotdot(fr.file_name), fr.line) >> else: >> location = '??' >> >> @@ -1009,7 +1006,7 @@ class osv_info_threads(gdb.Command): >> ) >> ) >> >> - if fr and fr.func_name == sched_thread_join: >> + if fr and fr.__name__ == sched_thread_join: >> gdb.write("\tjoining on %s\n" % >> fr.frame.read_var("this")) >> >> show_thread_timers(t) >> @@ -1176,6 +1173,7 @@ def all_traces(): >> max_trace = ulong(trace_buffer['_size']) >> >> if not trace_log_base: >> + print('!!! 
Could not find any trace data! Make sure >> "--trace" option matches some tracepoints.') >> raise StopIteration >> >> trace_log = inf.read_memory(trace_log_base, max_trace) >> @@ -1214,7 +1212,7 @@ def all_traces(): >> unpacker.align_up(8) >> yield Trace(tp, Thread(thread, thread_name), time, cpu, >> data, backtrace=backtrace) >> >> - iters = map(lambda cpu: one_cpu_trace(cpu), values(state.cpu_list)) >> + iters = [one_cpu_trace(cpu) for cpu in values(state.cpu_list)] >> >> return heapq.merge(*iters) >> >> def save_traces_to_file(filename): >> @@ -1281,7 +1279,7 @@ def show_leak(): >> gdb.flush() >> allocs = [] >> for i in range(size_allocations): >> - newpercent = '%2d%%' % round(100.0*i/(size_allocations-1)) >> + newpercent = '%2d%%' % round(100.0*i//(size_allocations-1)) >> >> >> It seems like *all* divisions in this file were converted to integer >> divisions. Are you sure this was the intent in all of them? >> Here in particular, it's pretty clear by the 100.0 that floating point >> division was intended! >> >> Changed by 2to3. But you are right this does not look correct. 
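To make the concern concrete: in Python 3, / is always true division and // floors the result even with float operands, so 2to3's blanket rewrite changes the value whenever the operands are not exact multiples. A small illustration (the numbers are chosen arbitrarily):

```python
# Percentage case: floor-then-round loses the intended rounding.
i, total = 2, 4
pct_true = round(100.0 * i / (total - 1))    # round(66.66...) -> 67
pct_floor = round(100.0 * i // (total - 1))  # round(66.0)     -> 66
assert (pct_true, pct_floor) == (67, 66)

# Timer-expiration case: // 1.0e9 floors nanoseconds to whole seconds,
# while the original / 1.0e9 kept the fractional part.
assert 1500000000 / 1.0e9 == 1.5
assert 1500000000 // 1.0e9 == 1.0
```

The // rewrites are only correct where integer division was genuinely intended (byte counts to MB, bitpos // 8, and similar).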
>> >> >> >> >> if newpercent != percent: >> percent = newpercent >> gdb.write('\b\b\b%s' % newpercent) >> @@ -1343,10 +1341,10 @@ def show_leak(): >> allocations=cur_n, >> minsize=cur_min_size, >> maxsize=cur_max_size, >> - avgsize=cur_total_size/cur_n, >> + avgsize=cur_total_size//cur_n, >> >> minbirth=cur_first_seq, >> maxbirth=cur_last_seq, >> - avgbirth=cur_total_seq/cur_n, >> + avgbirth=cur_total_seq//cur_n, >> callchain=callchain) >> records.append(r) >> cur_n = 0 >> @@ -1538,7 +1536,7 @@ class osv_percpu(gdb.Command): >> gdb.write('%s\n'%e) >> return >> percpu_addr = percpu.address >> - for cpu in vmstate().cpu_list.values(): >> + for cpu in list(vmstate().cpu_list.values()): >> gdb.write("CPU %d:\n" % cpu.id) >> base = cpu.obj['percpu_base'] >> addr = base+to_int(percpu_addr) >> >> >> >> >> diff --git a/scripts/module.py b/scripts/module.py >> index 548dd0c7..c253968c 100755 >> --- a/scripts/module.py >> +++ b/scripts/module.py >> @@ -1,4 +1,4 @@ >> -#!/usr/bin/python >> +#!/usr/bin/python3 >> >> import re >> import os >> @@ -226,7 +226,7 @@ def build(args): >> else: >> print(prefix) >> >> - for module, run_config_name in modules_to_run.items(): >> + for module, run_config_name in list(modules_to_run.items()): >> >> >> This smells like an automated change - I don't see any reason why this >> list is needed! >> >> Can you please review all these changes? Better yet, just run the code >> with python3 and see if anything *really* doesn't work... >> >> Will do. 
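As noted, wrapping in list() is only required when the dictionary is mutated while iterating; a plain read-only loop works on the .items() view directly. A sketch of the distinction (the dict contents here are made up for illustration):

```python
modules_to_run = {'mod_a': 'default', 'mod_b': 'custom'}

# Read-only iteration works on the view directly; the list() that 2to3
# inserts here is harmless but unnecessary.
names = [module for module, run_config in modules_to_run.items()]
assert names == ['mod_a', 'mod_b']

# list() matters only when the dict is mutated during iteration; without
# the copy, Python 3 raises "RuntimeError: dictionary changed size".
for module in list(modules_to_run):
    if module == 'mod_b':
        del modules_to_run[module]
assert list(modules_to_run) == ['mod_a']
```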
>> >> >> >> >> diff --git a/scripts/osv/prof.py b/scripts/osv/prof.py >> index 95db15b8..465a223f 100644 >> --- a/scripts/osv/prof.py >> +++ b/scripts/osv/prof.py >> @@ -51,7 +51,7 @@ time_units = [ >> ] >> >> def parse_time_as_nanos(text, default_unit='ns'): >> - for level, name in sorted(time_units, key=lambda (level, name): >> -len(name)): >> + for level, name in sorted(time_units, key=lambda level_name: >> -len(level_name[1])): >> >> >> I don't understand the changes in this file, they don't look like simple >> syntax changes... Did you review them? >> >> 2to3. I may have not tested trace.py with min_duration argument. >> >> >> if text.endswith(name): >> return float(text.rstrip(name)) * level >> for level, name in time_units: >> @@ -60,7 +60,7 @@ def parse_time_as_nanos(text, default_unit='ns'): >> raise Exception('Unknown unit: ' + default_unit) >> >> def format_time(time, format="%.2f %s"): >> - for level, name in sorted(time_units, key=lambda (level, name): >> -level): >> + for level, name in sorted(time_units, key=lambda level_name1: >> -level_name1[0]): >> if time >= level: >> return format % (float(time) / level, name) >> return str(time) >> @@ -207,10 +207,16 @@ class timed_trace_producer(object): >> self.last_time = None >> >> def __call__(self, sample): >> + if not sample.time: >> + return >> + >> if not sample.cpu in self.earliest_trace_per_cpu: >> self.earliest_trace_per_cpu[sample.cpu] = sample >> >> - self.last_time = max(self.last_time, sample.time) >> + if not self.last_time: >> + self.last_time = sample.time >> + else: >> + self.last_time = max(self.last_time, sample.time) >> >> matcher = self.matcher_by_name.get(sample.name, None) >> if not matcher: >> @@ -239,7 +245,7 @@ class timed_trace_producer(object): >> return trace.TimedTrace(entry_trace, duration) >> >> def finish(self): >> - for sample in self.open_samples.itervalues(): >> + for sample in self.open_samples.values(): >> duration = self.last_time - sample.time >> yield 
trace.TimedTrace(sample, duration) >> >> @@ -275,7 +281,7 @@ def get_idle_profile(traces): >> >> def trim_samples(cpu, end_time): >> if cpu.idle: >> - for w in cpu.waits.values(): >> + for w in list(cpu.waits.values()): >> begin = max(w.time, cpu.idle.time) >> yield ProfSample(begin, w.cpu, w.thread, w.backtrace, >> resident_time=end_time - begin) >> >> @@ -295,7 +301,7 @@ def get_idle_profile(traces): >> >> last = t >> >> - for cpu in cpus.values(): >> + for cpu in list(cpus.values()): >> for s in trim_samples(cpu, t.time): >> yield s >> >> @@ -402,7 +408,7 @@ def print_profile(samples, symbol_resolver, >> caller_oriented=False, >> if not order: >> order = lambda node: (-node.resident_time, -node.hit_count) >> >> - for group, tree_root in sorted(groups.iteritems(), key=lambda >> (thread, node): order(node)): >> + for group, tree_root in sorted(iter(groups.items()), key=lambda >> thread_node: order(thread_node[1])): >> collapse_similar(tree_root) >> >> if max_levels: >> +++ b/scripts/osv/trace.py >> @@ -65,7 +65,12 @@ class TimeRange(object): >> return self.end - self.begin >> >> def intersection(self, other): >> - begin = max(self.begin, other.begin) >> + if not self.begin: >> + begin = other.begin >> + elif not other.begin: >> + begin = self.begin >> + else: >> + begin = max(self.begin, other.begin) >> >> if self.end is None: >> end = other.end >> @@ -143,11 +148,11 @@ class Trace: >> class TimedTrace: >> def __init__(self, trace, duration=None): >> self.trace = trace >> - self.duration = duration >> + self.duration_ = duration >> >> @property >> def duration(self): >> - return self.duration >> + return self.duration_ >> >> @property >> def time(self): >> @@ -183,6 +188,8 @@ def do_split_format(format_str): >> >> _split_cache = {} >> def split_format(format_str): >> + if not format_str: >> + return [] >> result = _split_cache.get(format_str, None) >> if not result: >> result = list(do_split_format(format_str)) >> @@ -190,7 +197,7 @@ def split_format(format_str): 
>> return result >> >> formatters = { >> - '*': lambda bytes: '{' + ' '.join('%02x' % ord(b) for b in bytes) + >> '}' >> + '*': lambda bytes: '{' + ' '.join('%02x' % b for b in bytes) + '}' >> >> >> Hmm, doesn't ord(b) work in Python3 any more? >> >> >> Python 2: >> >>> f = lambda bytes: '{' + ' '.join('%02x' % ord(b) for b in bytes) + '}' >> >>> f(b'test') >> '{74 65 73 74}' >> >> Python 3: >> >>> f = lambda bytes: '{' + ' '.join('%02x' % ord(b) for b in bytes) + '}' >> >>> f(b'test') >> Traceback (most recent call last): >> File "<stdin>", line 1, in <module> >> File "<stdin>", line 1, in <lambda> >> File "<stdin>", line 1, in <genexpr> >> TypeError: ord() expected string of length 1, but int found >> >> f2 = lambda bytes: '{' + ' '.join('%02x' % b for b in bytes) + '}' >> >>> f2(b'test') >> '{74 65 73 74}' >> >> The ord still works but in the end this lambda breaks. >> >> >> } >> >> def get_alignment_of(fmt): >> @@ -238,16 +245,15 @@ class SlidingUnpacker: >> size = struct.calcsize(fmt) >> val, = struct.unpack_from(fmt, >> self.buffer[self.offset:self.offset+size]) >> self.offset += size >> - values.append(val) >> + if fmt.startswith('50p'): >> + values.append(val.decode('utf-8')) >> + else: >> + values.append(val) >> >> return tuple(values) >> >> - def __nonzero__(self): >> - return self.offset < len(self.buffer) >> - >> - # Python3 >> def __bool__(self): >> - return self.__nonzero__() >> + return self.offset < len(self.buffer) >> >> class WritingPacker: >> def __init__(self, writer): >> @@ -270,7 +276,10 @@ class WritingPacker: >> if fmt == '*': >> self.pack_blob(arg) >> else: >> - self.writer(struct.pack(fmt, arg)) >> + if fmt == '50p': >> + self.writer(struct.pack(fmt, arg.encode('utf-8'))) >> + else: >> + self.writer(struct.pack(fmt, arg)) >> self.offset += struct.calcsize(fmt) >> >> def pack_blob(self, arg): >> @@ -298,7 +307,7 @@ class TraceDumpReaderBase : >> self.endian = '<' >> self.file = open(filename, 'rb') >> try: >> - tag = self.file.read(4) >> + 
tag = self.file.read(4).decode() >> if tag == "OSVT": >> endian = '>' >> elif tag != "TVSO": >> @@ -347,7 +356,7 @@ class TraceDumpReaderBase : >> >> def readString(self): >> len = self.read('H') >> - return self.file.read(len) >> + return self.file.read(len).decode() >> >> class TraceDumpReader(TraceDumpReaderBase) : >> def __init__(self, filename): >> @@ -378,7 +387,7 @@ class TraceDumpReader(TraceDumpReaderBase) : >> sig = "" >> for j in range(0, n_args): >> arg_name = self.readString() >> - arg_sig = self.file.read(1) >> + arg_sig = self.file.read(1).decode() >> if arg_sig == 'p': >> arg_sig = '50p' >> sig += arg_sig >> @@ -405,7 +414,7 @@ class TraceDumpReader(TraceDumpReaderBase) : >> >> backtrace = None >> if flags & 1: >> - backtrace = filter(None, unpacker.unpack('Q' * >> self.backtrace_len)) >> + backtrace = [_f for _f in unpacker.unpack('Q' * >> self.backtrace_len) if _f] >> >> data = unpacker.unpack(tp.signature) >> unpacker.align_up(8) >> @@ -414,7 +423,7 @@ class TraceDumpReader(TraceDumpReaderBase) : >> yield last_trace >> >> def traces(self): >> - iters = map(lambda data: self.oneTrace(data), self.trace_buffers) >> + iters = [self.oneTrace(data) for data in self.trace_buffers] >> return heapq.merge(*iters) >> >> >> @@ -523,7 +532,7 @@ def read(buffer_view): >> >> while unpacker: >> tp_key, thread_ptr, thread_name, time, cpu = >> unpacker.unpack('QQ16sQI') >> - thread_name = thread_name.rstrip('\0') >> + thread_name = thread_name.rstrip(b'\0').decode('utf-8') >> tp = tracepoints[tp_key] >> >> backtrace = [] >> @@ -551,7 +560,7 @@ def write(traces, writer): >> trace.time, trace.cpu) >> >> if trace.backtrace: >> - for frame in filter(None, trace.backtrace): >> + for frame in [_f for _f in trace.backtrace if _f]: >> packer.pack('Q', frame) >> packer.pack('Q', 0) >> >> diff --git a/scripts/osv/tree.py b/scripts/osv/tree.py >> index 594b00e2..86345157 100644 >> --- a/scripts/osv/tree.py >> +++ b/scripts/osv/tree.py >> @@ -18,11 +18,11 @@ class 
TreeNode(object): >> >> def squash_child(self): >> assert self.has_only_one_child() >> - self.children_by_key = >> next(self.children_by_key.itervalues()).children_by_key >> + self.children_by_key = >> next(iter(self.children_by_key.values())).children_by_key >> >> @property >> def children(self): >> - return self.children_by_key.itervalues() >> + return iter(self.children_by_key.values()) >> >> def has_only_one_child(self): >> return len(self.children_by_key) == 1 >> diff --git a/scripts/run.py b/scripts/run.py >> index 9ab8d86f..f4452345 100755 >> --- a/scripts/run.py >> +++ b/scripts/run.py >> @@ -1,5 +1,5 @@ >> -#!/usr/bin/env python >> -from __future__ import print_function >> +#!/usr/bin/env python3 >> + >> import subprocess >> import sys >> import argparse >> diff --git a/scripts/setup.py b/scripts/setup.py >> index 2a7f1c18..a58132ab 100755 >> --- a/scripts/setup.py >> +++ b/scripts/setup.py >> @@ -1,8 +1,8 @@ >> -#!/usr/bin/python2 >> +#!/usr/bin/python3 >> >> # set up a development environment for OSv. Run as root. >> >> -import sys, platform, argparse >> +import sys, distro, argparse >> import subprocess >> >> standard_ec2_packages = ['python-pip', 'wget'] >> @@ -319,11 +319,11 @@ parser.add_argument("-t", "--test", >> action="store_true", >> help="install packages required by testing tools") >> cmdargs = parser.parse_args() >> >> -(name, version, id) = platform.linux_distribution() >> +(name, version, id) = distro.linux_distribution() >> >> for distro in distros: >> if type(distro.name) == type([]): >> - dname = filter(lambda n: name.startswith(n), distro.name) >> + dname = [n for n in distro.name if name.startswith(n)] >> if len(dname): >> distro.name = dname[0] >> else: >> @@ -349,5 +349,5 @@ for distro in distros: >> print ('Your distribution %s version %s is not supported by this >> script' % (name, version)) >> sys.exit(1) >> >> -print 'Your distribution is not supported by this script.' 
>> +print('Your distribution is not supported by this script.') >> sys.exit(2) >> diff --git a/scripts/test.py b/scripts/test.py >> index 02eb4b55..0f9c07c1 100755 >> --- a/scripts/test.py >> +++ b/scripts/test.py >> @@ -1,4 +1,4 @@ >> -#!/usr/bin/env python >> +#!/usr/bin/env python3 >> import atexit >> import subprocess >> import argparse >> @@ -81,7 +81,7 @@ def is_not_skipped(test): >> return test.name not in blacklist >> >> def run_tests_in_single_instance(): >> - run(filter(lambda test: not isinstance(test, TestRunnerTest), tests)) >> + run([test for test in tests if not isinstance(test, TestRunnerTest)]) >> >> blacklist_tests = ' '.join(blacklist) >> args = run_py_args + ["-s", "-e", "/testrunner.so -b %s" % >> (blacklist_tests)] >> @@ -103,7 +103,7 @@ def pluralize(word, count): >> >> def make_export_and_conf(): >> export_dir = tempfile.mkdtemp(prefix='share') >> - os.chmod(export_dir, 0777) >> + os.chmod(export_dir, 0o777) >> (conf_fd, conf_path) = tempfile.mkstemp(prefix='export') >> conf = os.fdopen(conf_fd, "w") >> conf.write("%s 127.0.0.1(insecure,rw)\n" % export_dir) >> @@ -155,12 +155,12 @@ def run_tests(): >> "/tst-nfs.so --server 192.168.122.1 --share %s" % >> export_dir) ] >> >> - line = proc.stdout.readline() >> + line = proc.stdout.readline().decode() >> while line: >> print(line) >> if "/tmp" in line: >> break >> - line = proc.stdout.readline() >> + line = proc.stdout.readline().decode() >> >> >> run(tests_to_run) >> diff --git a/scripts/tests/test_app.py b/scripts/tests/test_app.py >> index 27112ff7..1738e755 100755 >> --- a/scripts/tests/test_app.py >> +++ b/scripts/tests/test_app.py >> @@ -1,7 +1,6 @@ >> -#!/usr/bin/python >> +#!/usr/bin/python3 >> from testing import * >> import argparse >> -import subprocess >> from time import sleep >> >> def run(command, hypervisor_name, image_path=None, line=None, >> guest_port=None, host_port=None, input_lines=[], kill_app=False): >> diff --git a/scripts/tests/test_app_with_test_script.py >> 
b/scripts/tests/test_app_with_test_script.py >> index 2ab1b731..8a2c7295 100755 >> --- a/scripts/tests/test_app_with_test_script.py >> +++ b/scripts/tests/test_app_with_test_script.py >> @@ -1,4 +1,4 @@ >> -#!/usr/bin/python >> +#!/usr/bin/python3 >> from testing import * >> import argparse >> import runpy >> diff --git a/scripts/tests/test_http_app_with_curl_and_ab.py >> b/scripts/tests/test_http_app_with_curl_and_ab.py >> index 067fcc83..9d613d01 100755 >> --- a/scripts/tests/test_http_app_with_curl_and_ab.py >> +++ b/scripts/tests/test_http_app_with_curl_and_ab.py >> @@ -1,11 +1,11 @@ >> -#!/usr/bin/python >> +#!/usr/bin/python3 >> from testing import * >> import argparse >> import subprocess >> from time import sleep >> >> def check_with_curl(url, expected_http_line): >> - output = subprocess.check_output(["curl", "-s", url]) >> + output = subprocess.check_output(["curl", "-s", url]).decode('utf-8') >> print(output) >> if expected_http_line not in output: >> print("FAILED curl: wrong output") >> @@ -39,9 +39,9 @@ def run(command, hypervisor_name, host_port, >> guest_port, http_path, expected_htt >> check_with_curl(app_url, expected_http_line) >> >> if no_keep_alive: >> - output = subprocess.check_output(["ab", "-l", "-c", >> str(concurrency), "-n", str(count), app_url]).split('\n') >> + output = subprocess.check_output(["ab", "-l", "-c", >> str(concurrency), "-n", str(count), app_url]).decode('utf-8').split('\n') >> else: >> - output = subprocess.check_output(["ab", "-l", "-k", "-c", >> str(concurrency), "-n", str(count), app_url]).split('\n') >> + output = subprocess.check_output(["ab", "-l", "-k", "-c", >> str(concurrency), "-n", str(count), app_url]).decode('utf-8').split('\n') >> >> failed_requests = 1 >> complete_requests = 0 >> @@ -74,11 +74,11 @@ def run(command, hypervisor_name, host_port, >> guest_port, http_path, expected_htt >> success = False >> >> if failed_requests > 0: >> - print("FAILED ab - encountered failed requests: %d" % >> failed_requests) 
>> + print("FAILED ab - encountered failed requests: %d" % >> failed_requests) >> success = False >> >> if complete_requests < count: >> - print("FAILED ab - too few complete requests : %d ? %d" % >> (complete_requests, count)) >> + print("FAILED ab - too few complete requests : %d ? %d" % >> (complete_requests, count)) >> success = False >> >> if success: >> diff --git a/scripts/tests/testing.py b/scripts/tests/testing.py >> index c5249753..c3aaf218 100644 >> --- a/scripts/tests/testing.py >> +++ b/scripts/tests/testing.py >> @@ -133,7 +133,7 @@ class SupervisedProcess: >> self.cv.release() >> >> line = '' >> - ch_bytes = '' >> + ch_bytes = bytes() >> while True: >> ch_bytes = ch_bytes + self.process.stdout.read(1) >> try: >> @@ -144,7 +144,7 @@ class SupervisedProcess: >> if ch == '\n': >> append_line(line) >> line = '' >> - ch_bytes = '' >> + ch_bytes = bytes() >> except UnicodeError: >> continue >> >> diff --git a/scripts/trace.py b/scripts/trace.py >> index 34cfb2ab..71dc47d5 100755 >> --- a/scripts/trace.py >> +++ b/scripts/trace.py >> @@ -1,4 +1,4 @@ >> -#!/usr/bin/env python2 >> +#!/usr/bin/env python3 >> import sys >> import errno >> import argparse >> @@ -13,6 +13,7 @@ from collections import defaultdict >> from osv import trace, debug, prof >> from osv.client import Client >> import memory_analyzer >> +from functools import reduce >> >> class InvalidArgumentsException(Exception): >> def __init__(self, message): >> @@ -114,7 +115,7 @@ def list_trace(args): >> with get_trace_reader(args) as reader: >> for t in reader.get_traces(): >> if t.time in time_range: >> - print t.format(backtrace_formatter, >> data_formatter=data_formatter) >> + print(t.format(backtrace_formatter, >> data_formatter=data_formatter)) >> >> def mem_analys(args): >> mallocs = {} >> @@ -164,7 +165,7 @@ def add_profile_options(parser): >> add_time_slicing_options(parser) >> group = parser.add_argument_group('profile options') >> group.add_argument("-r", "--caller-oriented", 
action='store_true', >> help="change orientation to caller-based; reverses order of frames") >> - group.add_argument("-g", "--group-by", choices=groupers.keys(), >> default='none', help="group samples by given criteria") >> + group.add_argument("-g", "--group-by", >> choices=list(groupers.keys()), default='none', help="group samples by given >> criteria") >> group.add_argument("--function", action='store', help="use given >> function as tree root") >> group.add_argument("--min-duration", action='store', help="show only >> nodes with resident time not shorter than this, eg: 200ms") >> add_backtrace_options(group) >> @@ -236,7 +237,7 @@ def show_profile(args, sample_producer): >> >> def node_filter(*args): >> for filter in node_filters: >> - if not filter(*args): >> + if not list(filter(*args)): >> return False >> return True >> >> @@ -276,7 +277,7 @@ def extract(args): >> stderr=subprocess.STDOUT) >> _stdout, _ = proc.communicate() >> if proc.returncode or not os.path.exists(args.tracefile): >> - print(_stdout) >> + print(_stdout.decode()) >> sys.exit(1) >> else: >> print("error: %s not found" % (elf_path)) >> @@ -332,8 +333,10 @@ def write_sample_to_pcap(sample, pcap_writer): >> } >> >> pkt = dpkt.ethernet.Ethernet() >> - pkt.data = sample.data[1] >> pkt.type = eth_types[proto] >> + pkt.src = b'' >> + pkt.dst = b'' >> + pkt.data = sample.data[1] >> pcap_writer.writepkt(pkt, ts=ts) >> >> def format_packet_sample(sample): >> @@ -343,7 +346,7 @@ def format_packet_sample(sample): >> pcap = dpkt.pcap.Writer(proc.stdin) >> write_sample_to_pcap(sample, pcap) >> pcap.close() >> - assert(proc.stdout.readline() == "reading from file -, link-type >> EN10MB (Ethernet)\n") >> + assert(proc.stdout.readline().decode() == "reading from file -, >> link-type EN10MB (Ethernet)\n") >> packet_line = proc.stdout.readline().rstrip() >> proc.wait() >> return packet_line >> @@ -361,7 +364,7 @@ def pcap_dump(args, target=None): >> needs_dpkt() >> >> if not target: >> - target = sys.stdout 
>> +        target = sys.stdout.buffer
>>
>>          pcap_file = dpkt.pcap.Writer(target)
>>          try:
>> @@ -439,7 +442,10 @@ def print_summary(args, printer=sys.stdout.write):
>>          else:
>>              min_time = min(min_time, t.time)
>>
>> -        max_time = max(max_time, t.time)
>> +        if not max_time:
>> +            max_time = t.time
>> +        else:
>> +            max_time = max(max_time, t.time)
>>
>>          if args.timed:
>>              timed = timed_producer(t)
>> @@ -450,42 +456,42 @@ def print_summary(args, printer=sys.stdout.write):
>>              timed_samples.extend((timed_producer.finish()))
>>
>>      if count == 0:
>> -        print "No samples"
>> +        print("No samples")
>>          return
>>
>> -    print "Collected %d samples spanning %s" % (count, prof.format_time(max_time - min_time))
>> +    print("Collected %d samples spanning %s" % (count, prof.format_time(max_time - min_time)))
>>
>> -    print "\nTime ranges:\n"
>> -    for cpu, r in sorted(cpu_time_ranges.items(), key=lambda (c, r): r.min):
>> -        print "  CPU 0x%02d: %s - %s = %10s" % (cpu,
>> +    print("\nTime ranges:\n")
>> +    for cpu, r in sorted(list(cpu_time_ranges.items()), key=lambda c_r: c_r[1].min):
>> +        print("  CPU 0x%02d: %s - %s = %10s" % (cpu,
>>              trace.format_time(r.min),
>>              trace.format_time(r.max),
>> -            prof.format_time(r.max - r.min))
>> +            prof.format_time(r.max - r.min)))
>>
>> -    max_name_len = reduce(max, map(lambda tp: len(tp.name), count_per_tp.iterkeys()))
>> +    max_name_len = reduce(max, [len(tp.name) for tp in iter(count_per_tp.keys())])
>>      format = "  %%-%ds %%8s" % (max_name_len)
>> -    print "\nTracepoint statistics:\n"
>> -    print format % ("name", "count")
>> -    print format % ("----", "-----")
>> +    print("\nTracepoint statistics:\n")
>> +    print(format % ("name", "count"))
>> +    print(format % ("----", "-----"))
>>
>> -    for tp, count in sorted(count_per_tp.iteritems(), key=lambda (tp, count): tp.name):
>> -        print format % (tp.name, count)
>> +    for tp, count in sorted(iter(count_per_tp.items()), key=lambda tp_count: tp_count[0].name):
>> +        print(format % (tp.name, count))
>>
>>      if args.timed:
>>          format = "  %-20s %8s %8s %8s %8s %8s %8s %8s %15s"
>> -        print "\nTimed tracepoints [ms]:\n"
>> +        print("\nTimed tracepoints [ms]:\n")
>>
>> -        timed_samples = filter(lambda t: t.time_range.intersection(time_range), timed_samples)
>> +        timed_samples = [t for t in timed_samples if t.time_range.intersection(time_range)]
>>
>>          if not timed_samples:
>> -            print "   None"
>> +            print("   None")
>>          else:
>> -            print format % ("name", "count", "min", "50%", "90%", "99%", "99.9%", "max", "total")
>> -            print format % ("----", "-----", "---", "---", "---", "---", "-----", "---", "-----")
>> +            print(format % ("name", "count", "min", "50%", "90%", "99%", "99.9%", "max", "total"))
>> +            print(format % ("----", "-----", "---", "---", "---", "---", "-----", "---", "-----"))
>>
>> -            for name, traces in get_timed_traces_per_function(timed_samples).iteritems():
>> +            for name, traces in get_timed_traces_per_function(timed_samples).items():
>>                  samples = sorted(list((t.time_range.intersection(time_range).length() for t in traces)))
>> -                print format % (
>> +                print(format % (
>>                      name,
>>                      len(samples),
>>                      format_duration(get_percentile(samples, 0)),
>> @@ -494,9 +500,9 @@ def print_summary(args, printer=sys.stdout.write):
>>                      format_duration(get_percentile(samples, 0.99)),
>>                      format_duration(get_percentile(samples, 0.999)),
>>                      format_duration(get_percentile(samples, 1)),
>> -                    format_duration(sum(samples)))
>> +                    format_duration(sum(samples))))
>>
>> -    print
>> +    print()
>>
>>  def list_cpu_load(args):
>>      load_per_cpu = {}
>> @@ -550,7 +556,7 @@ def list_timed(args):
>>
>>      for timed in timed_traces:
>>          t = timed.trace
>> -        print '0x%016x %-15s %2d %20s %7s %-20s %s%s' % (
>> +        print('0x%016x %-15s %2d %20s %7s %-20s %s%s' % (
>>              t.thread.ptr,
>>              t.thread.name,
>>              t.cpu,
>> @@ -558,7 +564,7 @@ def list_timed(args):
>>              trace.format_duration(timed.duration),
>>              t.name,
>>              trace.Trace.format_data(t),
>> -            bt_formatter(t.backtrace))
>> +            bt_formatter(t.backtrace)))
>>
>>  def list_wakeup_latency(args):
>>      bt_formatter = get_backtrace_formatter(args)
>> @@ -575,9 +581,9 @@ def list_wakeup_latency(args):
>>          return "%4.6f" % (float(nanos) / 1e6)
>>
>>      if not args.no_header:
>> -        print '%-18s %-15s %3s %20s %13s %9s %s' % (
>> +        print('%-18s %-15s %3s %20s %13s %9s %s' % (
>>              "THREAD", "THREAD-NAME", "CPU", "TIMESTAMP[s]", "WAKEUP[ms]", "WAIT[ms]", "BACKTRACE"
>> -            )
>> +            ))
>>
>>      with get_trace_reader(args) as reader:
>>          for t in reader.get_traces():
>> @@ -594,14 +600,14 @@ def list_wakeup_latency(args):
>>              if t.cpu == waiting_thread.wait.cpu:
>>
>> ...
>
> --
> You received this message because you are subscribed to the Google Groups "OSv Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/08b1c328-74fa-43a3-8a22-02cfe051a4e6%40googlegroups.com.
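For readers following along, the 2to3 changes in the diff above boil down to a handful of Python 3 behavior changes. A minimal standalone sketch of each (the dict contents, names, and byte values below are invented for illustration and are not taken from trace.py):

```python
import io
import operator

# 1. dict.keys()/.items() return views in Python 3, not lists.
#    Iterating a view in a for loop needs no list() wrapper; list()
#    is only required when an actual list is needed (indexing,
#    mutating the dict while iterating, etc.).
counts = {"sched_wait": 3, "sched_wake": 5}
for name in counts.keys():          # fine without list()
    assert name in counts

# 2. Lambda tuple-unpacking was removed (PEP 3113): Python 2's
#    key=lambda (k, v): v must become an explicit subscript or,
#    more idiomatically, operator.itemgetter.
by_count = sorted(counts.items(), key=lambda kv: kv[1])
assert by_count == sorted(counts.items(), key=operator.itemgetter(1))
assert by_count[0] == ("sched_wait", 3)

# 3. Ordering None against a number raises TypeError in Python 3
#    (Python 2 silently allowed it), hence the patch's
#    "if not max_time" guard around max(max_time, t.time).
max_time = None
try:
    max_time = max(max_time, 100)
except TypeError:
    max_time = 100                  # the guarded path the patch takes
assert max_time == 100

# 4. sys.stdout is a text stream in Python 3; raw bytes (such as the
#    pcap stream the patch writes) must go through its .buffer
#    attribute. Simulated here with an in-memory stream:
raw = io.BytesIO()
text_stream = io.TextIOWrapper(raw)
text_stream.buffer.write(b"\xd4\xc3\xb2\xa1")   # pcap magic bytes
assert raw.getvalue() == b"\xd4\xc3\xb2\xa1"
```

One small observation on point 3: the patch's `if not max_time:` guard also fires when max_time is 0, not only when it is unset; `if max_time is None:` would express the intent more precisely, though for trace timestamps the difference is unlikely to matter in practice.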
