This might explain the "list"-related changes made by 2to3:
https://stackoverflow.com/questions/27476079/why-does-2to3-change-mydict-keys-to-listmydict-keys
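In short: in Python 2, dict.keys() returned a plain list, while in Python 3
it returns a view object, so 2to3 conservatively wraps such calls in list()
whenever it cannot prove the result is only iterated. A minimal illustration
under Python 3:

    d = {'a': 1, 'b': 2}
    ks = d.keys()          # dict_keys view, not a list
    # ks[0]                # TypeError: 'dict_keys' object is not subscriptable
    list(d.keys())[0]      # 'a' - this is what the list() wrapping restores

The wrapping is safe but often redundant, which is why several of the
instances discussed below can probably be dropped.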

On Tuesday, February 18, 2020 at 8:42:47 AM UTC-5, Waldek Kozaczuk wrote:
>
> Hey,
>
> Thanks for the review.
>
> Let me first explain what my process was. At first, I just tried to submit 
> Matt's changes with a proper signoff. But then I decided to test them a bit 
> and discovered some things were breaking. I do not think it was because of 
> any problems in his original script, but mostly because his changes were 
> intertwined with other scripts, and I had to change those too. Also, 
> modules/openjdk8-from-host/module.py (which did not exist when Matt made 
> his changes) did not work. And finally trace.py somehow came along. So here 
> we go.
>
> As far as mechanics go, I retained most of Matt's patch as is, and I 
> believe he used the "future" module. I ran the 2to3 tool (more 
> specifically 2to3-2.7 in my case) first and then manually fixed any 
> problems I discovered. Most painful to fix were trace.py and all related 
> scripts (scripts/osv/*.py), which I tested quite thoroughly by running 
> trace.py (I learned more about it, which is great, but probably way more 
> than I wanted to know at this point ;-)). I also tested all the other 
> scripts more implicitly by running scripts/build with some variations of 
> image= (java is good as it exercises a lot) and fs=.
>
> But you are right that I could not thoroughly test loader.py (I ran many 
> commands but not all). Shall we exclude it from this patch?
>
> Lastly, I am probably not the right person to do this upgrade exercise. I 
> do not use Python daily, so I am not an expert. The lack of a compiler (aka 
> 'complainer') did not make me very confident, especially with those larger 
> scripts. But I did not want Matt's work to go to waste. So here we go :-)
>
> On Tuesday, February 18, 2020 at 5:34:58 AM UTC-5, Nadav Har'El wrote:
>
> Thanks. I commented with a few concerns below. I'm only really worried 
> about loader.py - the gdb script, which was already supposed to work with 
> both Python 2 and Python 3 (although we probably haven't tested that 
> recently) - and I'm worried these changes are breaking it rather than 
> improving it. It's also difficult to test, because these changes touch all 
> sorts of "osv" commands in gdb which I guess you didn't test individually.
>
> Shall we leave it out? 
>
>
> I ask that you please at least try to run the affected scripts with Python 
> 3 before "fixing" them at all. In particular, some scripts that had 
> "/usr/bin/python" (and not /usr/bin/python2) at the top already worked 
> correctly with both Python 3 and Python 2, because some people already had 
> Python 3 as their default python (although it is quite likely that we 
> haven't tested this in a while, so things may have broken).
>
> In most cases, I first ran 2to3 and then applied manual changes if I found 
> any problems (like the bytes-vs-str type of issue - really painful to find 
> and fix). Indeed, 2to3 is somewhat suspicious, as it sometimes added extra 
> parentheses where there already were some (I manually removed those 
> changes). Not sure about the list-related changes - those were all done by 
> 2to3.
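>
> (A typical instance of the bytes-vs-str issue, sketched from the 
> TraceDumpReaderBase change further below - a file opened in 'rb' mode 
> yields bytes in Python 3, so comparisons against str silently fail:
>
>     with open('tracefile', 'rb') as f:
>         tag = f.read(4)           # b'TVSO' - bytes, not str
>         tag == "TVSO"             # True in Python 2, always False in Python 3
>         tag.decode() == "TVSO"    # True in both
>
> which is exactly why the .decode() calls were added.)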
>
>
> --
> Nadav Har'El
> n...@scylladb.com
>
>
> On Mon, Feb 17, 2020 at 8:22 AM Waldemar Kozaczuk <jwkoz...@gmail.com> 
> wrote:
>
> --- a/scripts/loader.py
> +++ b/scripts/loader.py
>
>
> Please make sure to test this file a bit, because it isn't part of the 
> standard build or run.
> This is the script loaded automatically when you use gdb OSv.
>
> Also, I was always under the impression that this script already worked on 
> both Python 2 and Python 3, because we had no idea which Python gdb runs 
> for us (the "#!" at the top of this script isn't relevant, right?).
>
> Shall we change at least this '#!/usr/bin/python2' to '#!/usr/bin/python' 
> then?
>  
>
> @@ -1,4 +1,4 @@
> -#!/usr/bin/python2
> +#!/usr/bin/python3
>
>  import gdb
>  import re
> @@ -37,8 +37,8 @@ def phys_cast(addr, type):
>
>  def values(_dict):
>      if hasattr(_dict, 'viewvalues'):
> -        return _dict.viewvalues()
> -    return _dict.values()
> +        return _dict.values()
>
>
> This doesn't look right - you check if there is a viewvalues() function 
> and then call values()? 
> Or maybe the whole "hasattr" check for Python 2 isn't needed any more?
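>
> Presumably the intent was either to keep both branches intact, something 
> like (untested sketch):
>
>     def values(_dict):
>         if hasattr(_dict, 'viewvalues'):
>             return _dict.viewvalues()  # Python 2: lazy view
>         return _dict.values()          # Python 3: values() is already a view
>
> or to drop the hasattr() check entirely if Python 2 support is gone.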
>
> +    return list(_dict.values())
>
>
> I wonder why you need to convert it to a list - what's wrong with a 
> generator which isn't officially a list? Did this solve a real problem, or 
> did some automatic converter suggest it?
>
> Changed by 2to3. 
>
>
>
>  def read_vector(v):
>      impl = v['_M_impl']
> @@ -426,19 +426,19 @@ class osv_zfs(gdb.Command):
>
>          print ("\n:: ARC SIZES ::")
>          print ("\tCurrent size:    %d (%d MB)" %
> -               (arc_size, arc_size / 1024 / 1024))
> +               (arc_size, arc_size // 1024 // 1024))
>          print ("\tTarget size:     %d (%d MB)" %
> -               (arc_target_size, arc_target_size / 1024 / 1024))
> +               (arc_target_size, arc_target_size // 1024 // 1024))
>          print ("\tMin target size: %d (%d MB)" %
> -               (arc_min_size, arc_min_size / 1024 / 1024))
> +               (arc_min_size, arc_min_size // 1024 // 1024))
>          print ("\tMax target size: %d (%d MB)" %
> -               (arc_max_size, arc_max_size / 1024 / 1024))
> +               (arc_max_size, arc_max_size // 1024 // 1024))
>
>          print ("\n:: ARC SIZE BREAKDOWN ::")
>          print ("\tMost recently used cache size:   %d (%d MB) (%.2f%%)" %
> -               (arc_mru_size, arc_mru_size / 1024 / 1024, arc_mru_perc))
> +               (arc_mru_size, arc_mru_size // 1024 // 1024, arc_mru_perc))
>          print ("\tMost frequently used cache size: %d (%d MB) (%.2f%%)" %
> -               (arc_mfu_size, arc_mfu_size / 1024 / 1024, arc_mfu_perc))
> +               (arc_mfu_size, arc_mfu_size // 1024 // 1024, arc_mfu_perc))
>
>          # Cache efficiency
>          arc_hits = get_stat_by_name(arc_stats_struct, arc_stats_cast, 
> 'arcstat_hits')
> @@ -618,7 +618,7 @@ class osv_mmap(gdb.Command):
>              end = ulong(vma['_range']['_end'])
>              flags = flagstr(ulong(vma['_flags']))
>              perm = permstr(ulong(vma['_perm']))
> -            size = '{:<16}'.format('[%s kB]' % (ulong(end - start)/1024))
> +            size = '{:<16}'.format('[%s kB]' % (ulong(end - start)//1024))
>
>              if 'F' in flags:
>                  file_vma = 
> vma.cast(gdb.lookup_type('mmu::file_vma').pointer())
> @@ -648,7 +648,7 @@ class osv_vma_find(gdb.Command):
>                  if start <= addr and end > addr:
>                      flags = flagstr(ulong(vma['_flags']))
>                      perm = permstr(ulong(vma['_perm']))
> -                    size = '{:<16}'.format('[%s kB]' % (ulong(end - 
> start)/1024))
> +                    size = '{:<16}'.format('[%s kB]' % (ulong(end - 
> start)//1024))
>                      print('0x%016x -> vma 0x%016x' % (addr, vma_addr))
>                      print('0x%016x 0x%016x %s flags=%s perm=%s' % (start, 
> end, size, flags, perm))
>                      break
> @@ -671,7 +671,7 @@ def ulong(x):
>  def to_int(gdb_value):
>      if hasattr(globals()['__builtins__'], 'long'):
>          # For GDB with python2
> -        return long(gdb_value)
> +        return int(gdb_value)
>
>
> Again, this change is wrong, was it automated?
> The whole point of this code was to support *both* python 2
> and python 3, and now we go breaking the old python 2 code,
> but not even removing the if :-)
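>
> If we still want both, the original shape was fine - roughly (sketch):
>
>     def to_int(gdb_value):
>         if hasattr(globals()['__builtins__'], 'long'):
>             return long(gdb_value)  # GDB embedding Python 2
>         return int(gdb_value)       # Python 3: int is unbounded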
>
>      return int(gdb_value)
>
>  class osv_syms(gdb.Command):
> @@ -751,7 +751,7 @@ def get_base_class_offset(gdb_type, base_class_name):
>      name_pattern = re.escape(base_class_name) + "(<.*>)?$"
>      for field in gdb_type.fields():
>          if field.is_base_class and re.match(name_pattern, field.name):
> -            return field.bitpos / 8
> +            return field.bitpos // 8
>
>  def derived_from(type, base_class):
>      return len([x for x in type.fields()
> @@ -808,11 +808,8 @@ class intrusive_list:
>              yield node_ptr.cast(self.node_type.pointer()).dereference()
>              hook = hook['next_']
>
> -    def __nonzero__(self):
> -        return self.root['next_'] != self.root.address
> -
>
>
> Any reason to remove this? (I don't know why it was needed, but why 
> remove?)
>
> Changed by 2to3. 
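>
> (For what it's worth, the usual dual-support pattern is to keep one 
> implementation and alias the other name, e.g.:
>
>     def __bool__(self):
>         return self.root['next_'] != self.root.address
>     __nonzero__ = __bool__  # Python 2 truth protocol
>
> so nothing is lost on either version.)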
>
>
>      def __bool__(self):
> -        return self.__nonzero__()
> +        return self.root['next_'] != self.root.address
>
>  class vmstate(object):
>      def __init__(self):
> @@ -832,7 +829,7 @@ class vmstate(object):
>          self.cpu_list = cpu_list
>
>      def load_thread_list(self):
> -        threads = map(gdb.Value.dereference, 
> unordered_map(gdb.lookup_global_symbol('sched::thread_map').value()))
> +        threads = list(map(gdb.Value.dereference, 
> unordered_map(gdb.lookup_global_symbol('sched::thread_map').value())))
>
>
> I don't understand why this change is needed... sorted() should work just 
> fine on an iterable; it doesn't need a list as far as I know.
> Did something not work without this change?
>
> Changed by 2to3. 
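>
> (sorted() indeed consumes any iterable, so the extra list() is redundant 
> here - e.g. in Python 3:
>
>     sorted(map(str, [3, 1, 2]))  # ['1', '2', '3'] - no list() needed
>
> It only costs an extra copy of the thread list.)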
>
>
>          self.thread_list = sorted(threads, key=lambda x: int(x["_id"]))
>
>      def cpu_from_thread(self, thread):
> @@ -896,7 +893,7 @@ def show_thread_timers(t):
>          gdb.write('  timers:')
>          for timer in timer_list:
>              expired = '*' if timer['_state'] == timer_state_expired else 
> ''
> -            expiration = int(timer['_time']['__d']['__r']) / 1.0e9
> +            expiration = int(timer['_time']['__d']['__r']) // 1.0e9
>              gdb.write(' %11.9f%s' % (expiration, expired))
>          gdb.write('\n')
>
> @@ -911,7 +908,7 @@ class ResolvedFrame:
>          self.frame = frame
>          self.file_name = file_name
>          self.line = line
> -        self.func_name = func_name
> +        self.__name__ = func_name
>
>
> Why rename this field?? 
>
> Changed by 2to3. 
>
>
>
>  def traverse_resolved_frames(frame):
>      while frame:
> @@ -989,14 +986,14 @@ class osv_info_threads(gdb.Command):
>                  function_whitelist = [sched_thread_join]
>
>                  def is_interesting(resolved_frame):
> -                    is_whitelisted = resolved_frame.func_name in 
> function_whitelist
> +                    is_whitelisted = resolved_frame.__name__ in 
> function_whitelist
>                      is_blacklisted = 
> os.path.basename(resolved_frame.file_name) in file_blacklist
>                      return is_whitelisted or not is_blacklisted
>
>                  fr = find_or_give_last(is_interesting, 
> traverse_resolved_frames(newest_frame))
>
>                  if fr:
> -                    location = '%s at %s:%s' % (fr.func_name, 
> strip_dotdot(fr.file_name), fr.line)
> +                    location = '%s at %s:%s' % (fr.__name__, 
> strip_dotdot(fr.file_name), fr.line)
>                  else:
>                      location = '??'
>
> @@ -1009,7 +1006,7 @@ class osv_info_threads(gdb.Command):
>                             )
>                            )
>
> -                if fr and fr.func_name == sched_thread_join:
> +                if fr and fr.__name__ == sched_thread_join:
>                      gdb.write("\tjoining on %s\n" % 
> fr.frame.read_var("this"))
>
>                  show_thread_timers(t)
> @@ -1176,6 +1173,7 @@ def all_traces():
>          max_trace = ulong(trace_buffer['_size'])
>
>          if not trace_log_base:
> +            print('!!! Could not find any trace data! Make sure "--trace" 
> option matches some tracepoints.')
>              raise StopIteration
>
>          trace_log = inf.read_memory(trace_log_base, max_trace)
> @@ -1214,7 +1212,7 @@ def all_traces():
>              unpacker.align_up(8)
>              yield Trace(tp, Thread(thread, thread_name), time, cpu, data, 
> backtrace=backtrace)
>
> -    iters = map(lambda cpu: one_cpu_trace(cpu), values(state.cpu_list))
> +    iters = [one_cpu_trace(cpu) for cpu in values(state.cpu_list)]
>
>      return heapq.merge(*iters)
>
>  def save_traces_to_file(filename):
> @@ -1281,7 +1279,7 @@ def show_leak():
>      gdb.flush()
>      allocs = []
>      for i in range(size_allocations):
> -        newpercent = '%2d%%' % round(100.0*i/(size_allocations-1))
> +        newpercent = '%2d%%' % round(100.0*i//(size_allocations-1))
>
>
> It seems like *all* divisions in this file were converted to integer 
> divisions. Are you sure this was the intent in all of them?
> Here in particular, it's pretty clear from the 100.0 that floating-point 
> division was intended!
>
> Changed by 2to3. But you are right this does not look correct. 
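>
> (Keeping true division restores the intended percentage math, e.g.:
>
>     newpercent = '%2d%%' % round(100.0 * i / (size_allocations - 1))
>
> with //, the quotient is floored before round() ever sees it.)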
>
>
>  
>
>          if newpercent != percent:
>              percent = newpercent
>              gdb.write('\b\b\b%s' % newpercent)
> @@ -1343,10 +1341,10 @@ def show_leak():
>                     allocations=cur_n,
>                     minsize=cur_min_size,
>                     maxsize=cur_max_size,
> -                   avgsize=cur_total_size/cur_n,
> +                   avgsize=cur_total_size//cur_n,
>
>                     minbirth=cur_first_seq,
>                     maxbirth=cur_last_seq,
> -                   avgbirth=cur_total_seq/cur_n,
> +                   avgbirth=cur_total_seq//cur_n,
>                     callchain=callchain)
>          records.append(r)
>          cur_n = 0
> @@ -1538,7 +1536,7 @@ class osv_percpu(gdb.Command):
>              gdb.write('%s\n'%e)
>              return
>          percpu_addr = percpu.address
> -        for cpu in vmstate().cpu_list.values():
> +        for cpu in list(vmstate().cpu_list.values()):
>              gdb.write("CPU %d:\n" % cpu.id)
>              base = cpu.obj['percpu_base']
>              addr = base+to_int(percpu_addr)
>
>  
>
>
> diff --git a/scripts/module.py b/scripts/module.py
> index 548dd0c7..c253968c 100755
> --- a/scripts/module.py
> +++ b/scripts/module.py
> @@ -1,4 +1,4 @@
> -#!/usr/bin/python
> +#!/usr/bin/python3
>
>  import re
>  import os
> @@ -226,7 +226,7 @@ def build(args):
>          else:
>              print(prefix)
>
> -    for module, run_config_name in modules_to_run.items():
> +    for module, run_config_name in list(modules_to_run.items()):
>
>
> This smells like an automated change - I don't see any reason why this 
> list is needed!
>
> Can you please review all these changes? Better yet, just run the code 
> with python3 and see if anything *really* doesn't work...
>
> Will do. 
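>
> (The main case where 2to3's list() wrapping actually matters is a dict 
> that is mutated inside the loop, which Python 3 turns into an error:
>
>     for k, v in d.items():
>         del d[k]  # RuntimeError: dictionary changed size during iteration
>
>     for k, v in list(d.items()):
>         del d[k]  # fine - iterating over a snapshot
>
> If the loop body doesn't mutate modules_to_run, the list() can go.)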
>
>
>  
>
> diff --git a/scripts/osv/prof.py b/scripts/osv/prof.py
> index 95db15b8..465a223f 100644
> --- a/scripts/osv/prof.py
> +++ b/scripts/osv/prof.py
> @@ -51,7 +51,7 @@ time_units = [
>  ]
>
>  def parse_time_as_nanos(text, default_unit='ns'):
> -    for level, name in sorted(time_units, key=lambda (level, name): 
> -len(name)):
> +    for level, name in sorted(time_units, key=lambda level_name: 
> -len(level_name[1])):
>
>
> I don't understand the changes in this file; they don't look like simple 
> syntax changes... Did you review them?
>
> 2to3. I may not have tested trace.py with the min_duration argument.
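>
> (These come from PEP 3113: Python 3 removed tuple parameters, so Python 
> 2's
>
>     sorted(time_units, key=lambda (level, name): -len(name))
>
> has to become
>
>     sorted(time_units, key=lambda level_name: -len(level_name[1]))
>
> which is exactly what 2to3 generates - mechanical, but 
> behavior-preserving.)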
>
>
>          if text.endswith(name):
>              return float(text.rstrip(name)) * level
>      for level, name in time_units:
> @@ -60,7 +60,7 @@ def parse_time_as_nanos(text, default_unit='ns'):
>      raise Exception('Unknown unit: ' + default_unit)
>
>  def format_time(time, format="%.2f %s"):
> -    for level, name in sorted(time_units, key=lambda (level, name): 
> -level):
> +    for level, name in sorted(time_units, key=lambda level_name1: 
> -level_name1[0]):
>          if time >= level:
>              return format % (float(time) / level, name)
>      return str(time)
> @@ -207,10 +207,16 @@ class timed_trace_producer(object):
>          self.last_time = None
>
>      def __call__(self, sample):
> +        if not sample.time:
> +            return
> +
>          if not sample.cpu in self.earliest_trace_per_cpu:
>              self.earliest_trace_per_cpu[sample.cpu] = sample
>
> -        self.last_time = max(self.last_time, sample.time)
> +        if not self.last_time:
> +            self.last_time = sample.time
> +        else:
> +            self.last_time = max(self.last_time, sample.time)
>
>          matcher = self.matcher_by_name.get(sample.name, None)
>          if not matcher:
> @@ -239,7 +245,7 @@ class timed_trace_producer(object):
>              return trace.TimedTrace(entry_trace, duration)
>
>      def finish(self):
> -        for sample in self.open_samples.itervalues():
> +        for sample in self.open_samples.values():
>              duration = self.last_time - sample.time
>              yield trace.TimedTrace(sample, duration)
>
> @@ -275,7 +281,7 @@ def get_idle_profile(traces):
>
>      def trim_samples(cpu, end_time):
>          if cpu.idle:
> -            for w in cpu.waits.values():
> +            for w in list(cpu.waits.values()):
>                  begin = max(w.time, cpu.idle.time)
>                  yield ProfSample(begin, w.cpu, w.thread, w.backtrace, 
> resident_time=end_time - begin)
>
> @@ -295,7 +301,7 @@ def get_idle_profile(traces):
>
>          last = t
>
> -    for cpu in cpus.values():
> +    for cpu in list(cpus.values()):
>          for s in trim_samples(cpu, t.time):
>              yield s
>
> @@ -402,7 +408,7 @@ def print_profile(samples, symbol_resolver, 
> caller_oriented=False,
>      if not order:
>          order = lambda node: (-node.resident_time, -node.hit_count)
>
> -    for group, tree_root in sorted(groups.iteritems(), key=lambda 
> (thread, node): order(node)):
> +    for group, tree_root in sorted(iter(groups.items()), key=lambda 
> thread_node: order(thread_node[1])):
>          collapse_similar(tree_root)
>
>          if max_levels:
> +++ b/scripts/osv/trace.py
> @@ -65,7 +65,12 @@ class TimeRange(object):
>          return self.end - self.begin
>
>      def intersection(self, other):
> -        begin = max(self.begin, other.begin)
> +        if not self.begin:
> +            begin = other.begin
> +        elif not other.begin:
> +            begin = self.begin
> +        else:
> +            begin = max(self.begin, other.begin)
>
>          if self.end is None:
>              end = other.end
> @@ -143,11 +148,11 @@ class Trace:
>  class TimedTrace:
>      def __init__(self, trace, duration=None):
>          self.trace = trace
> -        self.duration = duration
> +        self.duration_ = duration
>
>      @property
>      def duration(self):
> -        return self.duration
> +        return self.duration_
>
>      @property
>      def time(self):
> @@ -183,6 +188,8 @@ def do_split_format(format_str):
>
>  _split_cache = {}
>  def split_format(format_str):
> +    if not format_str:
> +        return []
>      result = _split_cache.get(format_str, None)
>      if not result:
>          result = list(do_split_format(format_str))
> @@ -190,7 +197,7 @@ def split_format(format_str):
>      return result
>
>  formatters = {
> -    '*': lambda bytes: '{' + ' '.join('%02x' % ord(b) for b in bytes) + 
> '}'
> +    '*': lambda bytes: '{' + ' '.join('%02x' % b for b in bytes) + '}'
>
>
> Hmm, doesn't ord(b) work in Python3 any more?
>
>
> Python 2:
> >>> f = lambda bytes: '{' + ' '.join('%02x' % ord(b) for b in bytes) + '}'
> >>> f(b'test')
> '{74 65 73 74}'
>
> Python 3:
> >>> f = lambda bytes: '{' + ' '.join('%02x' % ord(b) for b in bytes) + '}'
> >>> f(b'test')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 1, in <lambda>
>   File "<stdin>", line 1, in <genexpr>
> TypeError: ord() expected string of length 1, but int found
>
> f2 = lambda bytes: '{' + ' '.join('%02x' % b for b in bytes) + '}'
> >>> f2(b'test')
> '{74 65 73 74}'
>
> ord() itself still exists, but iterating over a bytes object in Python 3 
> yields ints rather than single-character strings, so this lambda breaks.
>
>
>  }
>
>  def get_alignment_of(fmt):
> @@ -238,16 +245,15 @@ class SlidingUnpacker:
>                  size = struct.calcsize(fmt)
>                  val, = struct.unpack_from(fmt, 
> self.buffer[self.offset:self.offset+size])
>                  self.offset += size
> -                values.append(val)
> +                if fmt.startswith('50p'):
> +                   values.append(val.decode('utf-8'))
> +                else:
> +                   values.append(val)
>
>          return tuple(values)
>
> -    def __nonzero__(self):
> -        return self.offset < len(self.buffer)
> -
> -    # Python3
>      def __bool__(self):
> -        return self.__nonzero__()
> +        return self.offset < len(self.buffer)
>
>  class WritingPacker:
>      def __init__(self, writer):
> @@ -270,7 +276,10 @@ class WritingPacker:
>              if fmt == '*':
>                  self.pack_blob(arg)
>              else:
> -                self.writer(struct.pack(fmt, arg))
> +                if fmt == '50p':
> +                    self.writer(struct.pack(fmt, arg.encode('utf-8')))
> +                else:
> +                    self.writer(struct.pack(fmt, arg))
>                  self.offset += struct.calcsize(fmt)
>
>      def pack_blob(self, arg):
> @@ -298,7 +307,7 @@ class TraceDumpReaderBase :
>          self.endian = '<'
>          self.file = open(filename, 'rb')
>          try:
> -            tag = self.file.read(4)
> +            tag = self.file.read(4).decode()
>              if tag == "OSVT":
>                  endian = '>'
>              elif tag != "TVSO":
> @@ -347,7 +356,7 @@ class TraceDumpReaderBase :
>
>      def readString(self):
>          len = self.read('H')
> -        return self.file.read(len)
> +        return self.file.read(len).decode()
>
>  class TraceDumpReader(TraceDumpReaderBase) :
>      def __init__(self, filename):
> @@ -378,7 +387,7 @@ class TraceDumpReader(TraceDumpReaderBase) :
>              sig = ""
>              for j in range(0, n_args):
>                  arg_name = self.readString()
> -                arg_sig = self.file.read(1)
> +                arg_sig = self.file.read(1).decode()
>                  if arg_sig == 'p':
>                      arg_sig = '50p'
>                  sig += arg_sig
> @@ -405,7 +414,7 @@ class TraceDumpReader(TraceDumpReaderBase) :
>
>              backtrace = None
>              if flags & 1:
> -                backtrace = filter(None, unpacker.unpack('Q' * 
> self.backtrace_len))
> +                backtrace = [_f for _f in unpacker.unpack('Q' * 
> self.backtrace_len) if _f]
>
>              data = unpacker.unpack(tp.signature)
>              unpacker.align_up(8)
> @@ -414,7 +423,7 @@ class TraceDumpReader(TraceDumpReaderBase) :
>              yield last_trace
>
>      def traces(self):
> -        iters = map(lambda data: self.oneTrace(data), self.trace_buffers)
> +        iters = [self.oneTrace(data) for data in self.trace_buffers]
>          return heapq.merge(*iters)
>
>
> @@ -523,7 +532,7 @@ def read(buffer_view):
>
>      while unpacker:
>          tp_key, thread_ptr, thread_name, time, cpu = 
> unpacker.unpack('QQ16sQI')
> -        thread_name = thread_name.rstrip('\0')
> +        thread_name = thread_name.rstrip(b'\0').decode('utf-8')
>          tp = tracepoints[tp_key]
>
>          backtrace = []
> @@ -551,7 +560,7 @@ def write(traces, writer):
>                      trace.time, trace.cpu)
>
>          if trace.backtrace:
> -            for frame in filter(None, trace.backtrace):
> +            for frame in [_f for _f in trace.backtrace if _f]:
>                  packer.pack('Q', frame)
>          packer.pack('Q', 0)
>
> diff --git a/scripts/osv/tree.py b/scripts/osv/tree.py
> index 594b00e2..86345157 100644
> --- a/scripts/osv/tree.py
> +++ b/scripts/osv/tree.py
> @@ -18,11 +18,11 @@ class TreeNode(object):
>
>      def squash_child(self):
>          assert self.has_only_one_child()
> -        self.children_by_key = 
> next(self.children_by_key.itervalues()).children_by_key
> +        self.children_by_key = 
> next(iter(self.children_by_key.values())).children_by_key
>
>      @property
>      def children(self):
> -        return self.children_by_key.itervalues()
> +        return iter(self.children_by_key.values())
>
>      def has_only_one_child(self):
>          return len(self.children_by_key) == 1
> diff --git a/scripts/run.py b/scripts/run.py
> index 9ab8d86f..f4452345 100755
> --- a/scripts/run.py
> +++ b/scripts/run.py
> @@ -1,5 +1,5 @@
> -#!/usr/bin/env python
> -from __future__ import print_function
> +#!/usr/bin/env python3
> +
>  import subprocess
>  import sys
>  import argparse
> diff --git a/scripts/setup.py b/scripts/setup.py
> index 2a7f1c18..a58132ab 100755
> --- a/scripts/setup.py
> +++ b/scripts/setup.py
> @@ -1,8 +1,8 @@
> -#!/usr/bin/python2
> +#!/usr/bin/python3
>
>  # set up a development environment for OSv.  Run as root.
>
> -import sys, platform, argparse
> +import sys, distro, argparse
>  import subprocess
>
>  standard_ec2_packages = ['python-pip', 'wget']
> @@ -319,11 +319,11 @@ parser.add_argument("-t", "--test", 
> action="store_true",
>                      help="install packages required by testing tools")
>  cmdargs = parser.parse_args()
>
> -(name, version, id) = platform.linux_distribution()
> +(name, version, id) = distro.linux_distribution()
>
>  for distro in distros:
>      if type(distro.name) == type([]):
> -        dname = filter(lambda n: name.startswith(n), distro.name)
> +        dname = [n for n in distro.name if name.startswith(n)]
>          if len(dname):
>              distro.name = dname[0]
>          else:
> @@ -349,5 +349,5 @@ for distro in distros:
>          print ('Your distribution %s version %s is not supported by this 
> script' % (name, version))
>          sys.exit(1)
>
> -print 'Your distribution is not supported by this script.'
> +print('Your distribution is not supported by this script.')
>  sys.exit(2)
> diff --git a/scripts/test.py b/scripts/test.py
> index 02eb4b55..0f9c07c1 100755
> --- a/scripts/test.py
> +++ b/scripts/test.py
> @@ -1,4 +1,4 @@
> -#!/usr/bin/env python
> +#!/usr/bin/env python3
>  import atexit
>  import subprocess
>  import argparse
> @@ -81,7 +81,7 @@ def is_not_skipped(test):
>      return test.name not in blacklist
>
>  def run_tests_in_single_instance():
> -    run(filter(lambda test: not isinstance(test, TestRunnerTest), tests))
> +    run([test for test in tests if not isinstance(test, TestRunnerTest)])
>
>      blacklist_tests = ' '.join(blacklist)
>      args = run_py_args + ["-s", "-e", "/testrunner.so -b %s" % 
> (blacklist_tests)]
> @@ -103,7 +103,7 @@ def pluralize(word, count):
>
>  def make_export_and_conf():
>      export_dir = tempfile.mkdtemp(prefix='share')
> -    os.chmod(export_dir, 0777)
> +    os.chmod(export_dir, 0o777)
>      (conf_fd, conf_path) = tempfile.mkstemp(prefix='export')
>      conf = os.fdopen(conf_fd, "w")
>      conf.write("%s 127.0.0.1(insecure,rw)\n" % export_dir)
> @@ -155,12 +155,12 @@ def run_tests():
>              "/tst-nfs.so --server 192.168.122.1 --share %s" %
>              export_dir) ]
>
> -        line = proc.stdout.readline()
> +        line = proc.stdout.readline().decode()
>          while line:
>               print(line)
>               if "/tmp" in line:
>                  break
> -             line = proc.stdout.readline()
> +             line = proc.stdout.readline().decode()
>
>
>          run(tests_to_run)
> diff --git a/scripts/tests/test_app.py b/scripts/tests/test_app.py
> index 27112ff7..1738e755 100755
> --- a/scripts/tests/test_app.py
> +++ b/scripts/tests/test_app.py
> @@ -1,7 +1,6 @@
> -#!/usr/bin/python
> +#!/usr/bin/python3
>  from testing import *
>  import argparse
> -import subprocess
>  from time import sleep
>
>  def run(command, hypervisor_name, image_path=None, line=None, 
> guest_port=None, host_port=None, input_lines=[], kill_app=False):
> diff --git a/scripts/tests/test_app_with_test_script.py 
> b/scripts/tests/test_app_with_test_script.py
> index 2ab1b731..8a2c7295 100755
> --- a/scripts/tests/test_app_with_test_script.py
> +++ b/scripts/tests/test_app_with_test_script.py
> @@ -1,4 +1,4 @@
> -#!/usr/bin/python
> +#!/usr/bin/python3
>  from testing import *
>  import argparse
>  import runpy
> diff --git a/scripts/tests/test_http_app_with_curl_and_ab.py 
> b/scripts/tests/test_http_app_with_curl_and_ab.py
> index 067fcc83..9d613d01 100755
> --- a/scripts/tests/test_http_app_with_curl_and_ab.py
> +++ b/scripts/tests/test_http_app_with_curl_and_ab.py
> @@ -1,11 +1,11 @@
> -#!/usr/bin/python
> +#!/usr/bin/python3
>  from testing import *
>  import argparse
>  import subprocess
>  from time import sleep
>
>  def check_with_curl(url, expected_http_line):
> -    output = subprocess.check_output(["curl", "-s", url])
> +    output = subprocess.check_output(["curl", "-s", url]).decode('utf-8')
>      print(output)
>      if expected_http_line not in output:
>         print("FAILED curl: wrong output")
> @@ -39,9 +39,9 @@ def run(command, hypervisor_name, host_port, guest_port, 
> http_path, expected_htt
>          check_with_curl(app_url, expected_http_line)
>
>      if no_keep_alive:
> -        output = subprocess.check_output(["ab", "-l", "-c", 
> str(concurrency), "-n", str(count), app_url]).split('\n')
> +        output = subprocess.check_output(["ab", "-l", "-c", 
> str(concurrency), "-n", str(count), app_url]).decode('utf-8').split('\n')
>      else:
> -        output = subprocess.check_output(["ab", "-l", "-k", "-c", 
> str(concurrency), "-n", str(count), app_url]).split('\n')
> +        output = subprocess.check_output(["ab", "-l", "-k", "-c", 
> str(concurrency), "-n", str(count), app_url]).decode('utf-8').split('\n')
>
>      failed_requests = 1
>      complete_requests = 0
> @@ -74,11 +74,11 @@ def run(command, hypervisor_name, host_port, 
> guest_port, http_path, expected_htt
>              success = False
>
>      if failed_requests > 0:
> -        print("FAILED ab - encountered failed requests: %d" % 
> failed_requests) 
> +        print("FAILED ab - encountered failed requests: %d" % 
> failed_requests)
>          success = False
>
>      if complete_requests < count:
> -        print("FAILED ab - too few complete requests : %d ? %d" % 
> (complete_requests, count)) 
> +        print("FAILED ab - too few complete requests : %d ? %d" % 
> (complete_requests, count))
>          success = False
>
>      if success:
> diff --git a/scripts/tests/testing.py b/scripts/tests/testing.py
> index c5249753..c3aaf218 100644
> --- a/scripts/tests/testing.py
> +++ b/scripts/tests/testing.py
> @@ -133,7 +133,7 @@ class SupervisedProcess:
>              self.cv.release()
>
>          line = ''
> -        ch_bytes = ''
> +        ch_bytes = bytes()
>          while True:
>              ch_bytes = ch_bytes + self.process.stdout.read(1)
>              try:
> @@ -144,7 +144,7 @@ class SupervisedProcess:
>                  if ch == '\n':
>                      append_line(line)
>                      line = ''
> -                ch_bytes = ''
> +                ch_bytes = bytes()
>              except UnicodeError:
>                  continue
>
> diff --git a/scripts/trace.py b/scripts/trace.py
> index 34cfb2ab..71dc47d5 100755
> --- a/scripts/trace.py
> +++ b/scripts/trace.py
> @@ -1,4 +1,4 @@
> -#!/usr/bin/env python2
> +#!/usr/bin/env python3
>  import sys
>  import errno
>  import argparse
> @@ -13,6 +13,7 @@ from collections import defaultdict
>  from osv import trace, debug, prof
>  from osv.client import Client
>  import memory_analyzer
> +from functools import reduce
>
>  class InvalidArgumentsException(Exception):
>      def __init__(self, message):
> @@ -114,7 +115,7 @@ def list_trace(args):
>      with get_trace_reader(args) as reader:
>          for t in reader.get_traces():
>              if t.time in time_range:
> -                print t.format(backtrace_formatter, 
> data_formatter=data_formatter)
> +                print(t.format(backtrace_formatter, 
> data_formatter=data_formatter))
>
>  def mem_analys(args):
>      mallocs = {}
> @@ -164,7 +165,7 @@ def add_profile_options(parser):
>      add_time_slicing_options(parser)
>      group = parser.add_argument_group('profile options')
>      group.add_argument("-r", "--caller-oriented", action='store_true', 
> help="change orientation to caller-based; reverses order of frames")
> -    group.add_argument("-g", "--group-by", choices=groupers.keys(), 
> default='none', help="group samples by given criteria")
> +    group.add_argument("-g", "--group-by", choices=list(groupers.keys()), 
> default='none', help="group samples by given criteria")
>      group.add_argument("--function", action='store', help="use given 
> function as tree root")
>      group.add_argument("--min-duration", action='store', help="show only 
> nodes with resident time not shorter than this, eg: 200ms")
>      add_backtrace_options(group)
> @@ -236,7 +237,7 @@ def show_profile(args, sample_producer):
>
>      def node_filter(*args):
>          for filter in node_filters:
> -            if not filter(*args):
> +            if not list(filter(*args)):
>                  return False
>          return True
>
> @@ -276,7 +277,7 @@ def extract(args):
>              stderr=subprocess.STDOUT)
>          _stdout, _ = proc.communicate()
>          if proc.returncode or not os.path.exists(args.tracefile):
> -            print(_stdout)
> +            print(_stdout.decode())
>              sys.exit(1)
>      else:
>          print("error: %s not found" % (elf_path))
> @@ -332,8 +333,10 @@ def write_sample_to_pcap(sample, pcap_writer):
>          }
>
>          pkt = dpkt.ethernet.Ethernet()
> -        pkt.data = sample.data[1]
>          pkt.type = eth_types[proto]
> +        pkt.src = b''
> +        pkt.dst = b''
> +        pkt.data = sample.data[1]
>          pcap_writer.writepkt(pkt, ts=ts)
>
>  def format_packet_sample(sample):
> @@ -343,7 +346,7 @@ def format_packet_sample(sample):
>      pcap = dpkt.pcap.Writer(proc.stdin)
>      write_sample_to_pcap(sample, pcap)
>      pcap.close()
> -    assert(proc.stdout.readline() == "reading from file -, link-type 
> EN10MB (Ethernet)\n")
> +    assert(proc.stdout.readline().decode() == "reading from file -, 
> link-type EN10MB (Ethernet)\n")
>      packet_line = proc.stdout.readline().rstrip()
>      proc.wait()
>      return packet_line
> @@ -361,7 +364,7 @@ def pcap_dump(args, target=None):
>      needs_dpkt()
>
>      if not target:
> -        target = sys.stdout
> +        target = sys.stdout.buffer
>
>      pcap_file = dpkt.pcap.Writer(target)
>      try:
> @@ -439,7 +442,10 @@ def print_summary(args, printer=sys.stdout.write):
>                  else:
>                      min_time = min(min_time, t.time)
>
> -                max_time = max(max_time, t.time)
> +                if not max_time:
> +                    max_time = t.time
> +                else:
> +                    max_time = max(max_time, t.time)
>
>              if args.timed:
>                  timed = timed_producer(t)
> @@ -450,42 +456,42 @@ def print_summary(args, printer=sys.stdout.write):
>          timed_samples.extend((timed_producer.finish()))
>
>      if count == 0:
> -        print "No samples"
> +        print("No samples")
>          return
>
> -    print "Collected %d samples spanning %s" % (count, 
> prof.format_time(max_time - min_time))
> +    print("Collected %d samples spanning %s" % (count, 
> prof.format_time(max_time - min_time)))
>
> -    print "\nTime ranges:\n"
> -    for cpu, r in sorted(cpu_time_ranges.items(), key=lambda (c, r): 
> r.min):
> -        print "  CPU 0x%02d: %s - %s = %10s" % (cpu,
> +    print("\nTime ranges:\n")
> +    for cpu, r in sorted(list(cpu_time_ranges.items()), key=lambda c_r: 
> c_r[1].min):
> +        print("  CPU 0x%02d: %s - %s = %10s" % (cpu,
>              trace.format_time(r.min),
>              trace.format_time(r.max),
> -            prof.format_time(r.max - r.min))
> +            prof.format_time(r.max - r.min)))
>
> -    max_name_len = reduce(max, map(lambda tp: len(tp.name), 
> count_per_tp.iterkeys()))
> +    max_name_len = reduce(max, [len(tp.name) for tp in 
> iter(count_per_tp.keys())])
>      format = "  %%-%ds %%8s" % (max_name_len)
> -    print "\nTracepoint statistics:\n"
> -    print format % ("name", "count")
> -    print format % ("----", "-----")
> +    print("\nTracepoint statistics:\n")
> +    print(format % ("name", "count"))
> +    print(format % ("----", "-----"))
>
> -    for tp, count in sorted(count_per_tp.iteritems(), key=lambda (tp, 
> count): tp.name):
> -        print format % (tp.name, count)
> +    for tp, count in sorted(iter(count_per_tp.items()), key=lambda 
> tp_count: tp_count[0].name):
> +        print(format % (tp.name, count))
>
>      if args.timed:
>          format = "  %-20s %8s %8s %8s %8s %8s %8s %8s %15s"
> -        print "\nTimed tracepoints [ms]:\n"
> +        print("\nTimed tracepoints [ms]:\n")
>
> -        timed_samples = filter(lambda t: 
> t.time_range.intersection(time_range), timed_samples)
> +        timed_samples = [t for t in timed_samples if 
> t.time_range.intersection(time_range)]
>
>          if not timed_samples:
> -            print "  None"
> +            print("  None")
>          else:
> -            print format % ("name", "count", "min", "50%", "90%", "99%", 
> "99.9%", "max", "total")
> -            print format % ("----", "-----", "---", "---", "---", "---", 
> "-----", "---", "-----")
> +            print(format % ("name", "count", "min", "50%", "90%", "99%", 
> "99.9%", "max", "total"))
> +            print(format % ("----", "-----", "---", "---", "---", "---", 
> "-----", "---", "-----"))
>
> -            for name, traces in 
> get_timed_traces_per_function(timed_samples).iteritems():
> +            for name, traces in 
> get_timed_traces_per_function(timed_samples).items():
>                  samples = 
> sorted(list((t.time_range.intersection(time_range).length() for t in 
> traces)))
> -                print format % (
> +                print(format % (
>                      name,
>                      len(samples),
>                      format_duration(get_percentile(samples, 0)),
> @@ -494,9 +500,9 @@ def print_summary(args, printer=sys.stdout.write):
>                      format_duration(get_percentile(samples, 0.99)),
>                      format_duration(get_percentile(samples, 0.999)),
>                      format_duration(get_percentile(samples, 1)),
> -                    format_duration(sum(samples)))
> +                    format_duration(sum(samples))))
>
> -    print
> +    print()
>
>  def list_cpu_load(args):
>      load_per_cpu = {}
> @@ -550,7 +556,7 @@ def list_timed(args):
>
>          for timed in timed_traces:
>              t = timed.trace
> -            print '0x%016x %-15s %2d %20s %7s %-20s %s%s' % (
> +            print('0x%016x %-15s %2d %20s %7s %-20s %s%s' % (
>                              t.thread.ptr,
>                              t.thread.name,
>                              t.cpu,
> @@ -558,7 +564,7 @@ def list_timed(args):
>                              trace.format_duration(timed.duration),
>                              t.name,
>                              trace.Trace.format_data(t),
> -                            bt_formatter(t.backtrace))
> +                            bt_formatter(t.backtrace)))
>
>  def list_wakeup_latency(args):
>      bt_formatter = get_backtrace_formatter(args)
> @@ -575,9 +581,9 @@ def list_wakeup_latency(args):
>          return "%4.6f" % (float(nanos) / 1e6)
>
>      if not args.no_header:
> -        print '%-18s %-15s %3s %20s %13s %9s %s' % (
> +        print('%-18s %-15s %3s %20s %13s %9s %s' % (
>              "THREAD", "THREAD-NAME", "CPU", "TIMESTAMP[s]", "WAKEUP[ms]", 
> "WAIT[ms]", "BACKTRACE"
> -        )
> +        ))
>
>      with get_trace_reader(args) as reader:
>          for t in reader.get_traces():
> @@ -594,14 +600,14 @@ def list_wakeup_latency(args):
>                      if t.cpu == waiting_thread.wait.cpu:
>
> ...
