On Sun, Feb 09, 2020 at 07:01:05PM -0700, Sean Whitton wrote:
> One key problem with the current workflow is that it makes it very
> difficult to avoid reviewing identical files more than once.  That would
> be a big improvement.

(I was just talking with Michael about this several minutes ago.)

Just leaking a part of my WIP work.

My core data structure looks like this

  {path: [hash, stamp, username, status, annotation]}

The "hash " field is a salted hash, calculated like this

  hash(data=read(path), salt=read(neighbor_license()))
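
To make that concrete, here is a rough Python sketch of what I have in
mind; the helper for locating the nearest license file is a placeholder
argument here, not settled code:

  import hashlib, time

  def salted_hash(path, license_path):
      # Salt the file's contents with its nearest license file, so the
      # same blob under a different license still gets reviewed again.
      h = hashlib.sha256()
      with open(license_path, 'rb') as f:
          h.update(f.read())              # salt
      with open(path, 'rb') as f:
          h.update(f.read())              # data
      return h.hexdigest()

  def make_record(path, license_path, username, status, annotation=''):
      # One entry of {path: [hash, stamp, username, status, annotation]}.
      return {path: [salted_hash(path, license_path), time.time(),
                     username, status, annotation]}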

This data structure is a fine-grained (per-path level) "accept/reject"
record.  Each path is a node. The "status" of a tree can be
automatically computed from its descendant nodes.
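
As an illustration of what I mean by that (not final code), the
aggregation could look roughly like this, with "pending" standing in
for files nobody has reviewed yet:

  def tree_status(records, prefix):
      # records: {path: [hash, stamp, username, status, annotation]}
      # A tree is "reject" if any descendant is rejected, and "accept"
      # only once every descendant has been accepted.
      statuses = [fields[3] for path, fields in records.items()
                  if path.startswith(prefix)]
      if not statuses or 'pending' in statuses:
          return 'pending'
      return 'reject' if 'reject' in statuses else 'accept'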

When a package enters NEW again, files with matching hashes will
automatically reuse the last status assigned by a human reviewer, where
status is either "accept" or "reject".

There are still many other aspects where I can reduce the time spent by
human reviewers and improve efficiency.
