D3994: fastannotate: initial import from Facebook's hg-experimental

2018-08-20 Thread indygreg (Gregory Szorc)
indygreg accepted this revision.
indygreg added a comment.
This revision is now accepted and ready to land.


  I looked this over again. I still have a number of concerns around this 
functionality. It really needs to be cleaned up. Especially the wire protocol 
and UI bits. But I suppose that's what the `EXPERIMENTAL` label is for. As long 
as we're not committed to backwards compatibility, I think it is appropriate to 
land in core then iterate. But let's try to fix as much before the end of 4.8 
as possible to minimize impact for people who deploy this. The wire protocol 
and on-disk format compatibility issues should be top of our list.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D3994

To: durin42, #hg-reviewers, indygreg
Cc: indygreg, yuja, quark, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D3994: fastannotate: initial import from Facebook's hg-experimental

2018-08-09 Thread durin42 (Augie Fackler)
durin42 added a comment.


  Addressed comments as follow-ups, mostly via recording a big TODO.

INLINE COMMENTS

> indygreg wrote in protocol.py:246-250
> The 1st and 3rd setup items here should be in `extsetup()` since they modify 
> global state (I think).
> 
> Ideally we'd have a registrar for these things so unloaded extensions don't 
> have wire protocol modifications affect other repos in-process. But, yeah, 
> not easy.

This actually runs inside a reposetup() method, so I think it's already 
sufficiently paranoid.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D3994

To: durin42, #hg-reviewers
Cc: indygreg, yuja, quark, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D3994: fastannotate: initial import from Facebook's hg-experimental

2018-08-07 Thread indygreg (Gregory Szorc)
indygreg added subscribers: yuja, indygreg.
indygreg added a comment.


  I only looked at some of this code.
  
  Overall, my threshold for adding to core is low. I think we should land then 
iterate.
  
  I would **really** like to see functionality related to modifying the 
behavior of annotate live in core. I think it will be easier to comprehend when 
implemented that way. I envision someday having arguments passed to 
`annotate()` that control whether linelog is used, etc. We may end up inventing 
something like `diffopts` but for annotate operations.
  
  It might help to think of this extension as two features: a cache for linelog 
data and user-facing UI to use it. The former is the most important to move to 
core. Especially since I imagine we don't want an `hg fastannotate` command 
lingering around forever (we would want to consolidate with `hg annotate` I 
think).
  
  The most concerning part of this extension is the wire protocol 
functionality. It is not suitable to ship to end-users because of forward 
compatibility concerns.
  
  The wire protocol functionality is literally transferring files off disk with 
little to no regard for the format of those files. The wire protocol doesn't 
expose the version of the files on disk. And that's obviously bad for future 
compatibility. We need some kind of compatibility checking that the client is 
capable of understanding the data the server will send it. This could be as 
simple as the server capabilities string advertising the on-disk format and 
clients ensuring they can speak the server's format before attempting to fetch 
that data.
  
  Of course, there could be a mix of data versions on disk given the current 
simple storage mechanism. We probably want to use separate directories for each 
version of the cached data so we don't mix streams. Or at the very least the 
server needs to validate that on-disk data it is serving matches the expected 
version from the client. Otherwise e.g. a server could serve up an old file 
version. Or a downgraded server could serve up a new file version.
  
  I think the most robust way forward is to have version-specific capabilities, 
wire protocol commands, and on-disk locations. This leaves the least margin for 
error and yields the greatest flexibility for changing formats and exchange 
semantics later without breaking BC.

INLINE COMMENTS

> __init__.py:27
> +# forward, usually it is "master" or "@".
> +mainbranch = master
> +

`branch` here is probably not the correct terminology.

> __init__.py:41
> +# (default: fastannotate)
> +modes = fastannotate
> +

I would prefer if there were a separate boolean option for each mode. That way, 
individual .hg/hgrc files can override specific behavior without overriding 
global defaults. It is a more flexible config model. We can, of course, have 
some options imply others.

> __init__.py:44
> +# default format when no format flags are used (default: number)
> +defaultformat = changeset, user, date
> +

Should this be a template?

> __init__.py:57
> +# (default: True for remotefilelog repo, False otherwise)
> +client = True
> +

This should be a more description option, like `exchange-cache` or `cache-pull`.

> __init__.py:62-67
> +# share sshpeer with remotefilelog. this would allow fastannotate to peek
> +# into remotefilelog internals, and steal its sshpeer, or in the reversed
> +# direction: donate its sshpeer to remotefilelog. disable this if
> +# fastannotate and remotefilelog should not share a sshpeer when their
> +# endpoints are different and incompatible. (default: True)
> +clientsharepeer = True

Since remotefilelog is not part of core, we probably want to nuke this feature.

> __init__.py:176-177
> +
> +# fastannotate has its own locking, without depending on repo lock
> +localrepo.localrepository._wlockfreeprefix.add('fastannotate/')
> +

I think this wants to be in `extsetup()` since it mutates global module state.

Ideally it would be part of `reposetup()` since it only matters on repos that 
have this extension enabled.

(This comes into play in e.g. hgweb environments, where some repos may not have 
the extension enabled.)

> __init__.py:182
> +if client is None:
> +if util.safehasattr(repo, 'requirements'):
> +client = 'remotefilelog' in repo.requirements

Won't `repo.requirements` always be defined?

> context.py:759-772
> +@contextlib.contextmanager
> +def _lockflock(self):
> +"""the same as 'lock' but use flock instead of lockmod.lock, to avoid
> +creating temporary symlinks."""
> +import fcntl
> +lockpath = self.linelogpath
> +util.makedirs(os.path.dirname(lockpath))

This wants to move to a `utils` module somewhere.

> formatter.py:3
> +#
> +# format: defines the format used to output annotate result
> +#

I'm sure @yuja has opinions about this file.

> protocol.py:43
> +# output:
> +#   FILE := 

D3994: fastannotate: initial import from Facebook's hg-experimental

2018-08-02 Thread durin42 (Augie Fackler)
durin42 added a comment.


  In https://phab.mercurial-scm.org/D3994#62892, @quark wrote:
  
  > I'd also like to see C linelog benchmark data mentioned. The current commit 
message implies diff algorithm is the bottleneck. That's misleading.
  
  
  Please feel encouraged to contribute your own numbers here. I probably can't 
be bothered.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D3994

To: durin42, #hg-reviewers
Cc: quark, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D3994: fastannotate: initial import from Facebook's hg-experimental

2018-08-02 Thread quark (Jun Wu)
quark added a comment.


  I'd also like to see C linelog benchmark data mentioned. The current commit 
message implies diff algorithm is the bottleneck. That's misleading.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D3994

To: durin42, #hg-reviewers
Cc: quark, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D3994: fastannotate: initial import from Facebook's hg-experimental

2018-08-01 Thread quark (Jun Wu)
quark added a comment.


  I would mention in the commit message that building cache is much faster with 
linkrevcache prebuilt.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D3994

To: durin42, #hg-reviewers
Cc: quark, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel