Re: Odd httpheader=1024 required for Phabricator

2020-02-11 Thread Scott Palmer

> On Feb 11, 2020, at 4:49 PM, Emile Snyder  wrote:
> 
> 
> 
>> On Tue, Feb 11, 2020 at 1:34 PM Makarius  wrote:
>> ...
>> This patch is BC, but SSH clients shouldn't be using the removed
>> capabilities so there should be no impact.
>> 
>> 
>> What means "BC"?
> 
> I suspect "backwards compatible"?

Breaking Change?

___
Mercurial mailing list
Mercurial@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial


Re: Odd httpheader=1024 required for Phabricator

2020-02-11 Thread Makarius
On 12/02/2020 07:50, Augie Fackler wrote:
>>
>> A bisection over the hg repository yields the following relevant changeset:
>>
>> changeset:   30563:e118233172fe
>> user:        Gregory Szorc
>> date:        Mon Nov 28 20:46:42 2016 -0800
>> files:   mercurial/wireproto.py tests/test-ssh-bundle1.t tests/test-ssh.t
>> description:
>> wireproto: only advertise HTTP-specific capabilities to HTTP peers (BC)
>>
>> What means "BC"?
> 
> "breaking change" or "behavior change" depending who you ask. :)
> 
>> Maybe Phabricator is a good reason to keep the full information?
> 
> Not especially. They're doing it wrong assuming that the --stdio protocol is 
> compatible with http, and we've told them that more than once and they've 
> responded...negatively. If someone wanted to try and figure out a way to have 
> phabricator do the right thing (invoke hg's WSGI application, probably via 
> CGI?) that would probably help. I'd certainly be open to mentoring someone 
> doing that work, but I can't justify spending time on it myself.
> 
>> Do you think you can refine that for a future release of Mercurial?
> 
> I'm not sure what you're asking for here...

A minimal change to resolve the situation, e.g. by some option or config to
force http over stdio.

Complex software products routinely depend on wrong assumptions.

De-facto we have a situation that Phabricator requires a very old version of
Mercurial, and thus makes Mercurial look bad.


Makarius


Re: Odd httpheader=1024 required for Phabricator

2020-02-11 Thread Augie Fackler


> On Feb 11, 2020, at 16:32, Makarius  wrote:
> 
> On 11/02/2020 03:20, Augie Fackler wrote:
>> I guess I'm not sure what's going on here. 
>> https://www.mercurial-scm.org/repo/hg/rev/5cda0ce05c42 is the revision that 
>> introduced that, but I'm not sure why you need to do anything /to 
>> phabricator/ unless it's trying (poorly) to pretend to be an hg server. Is 
>> it not just blindly proxying the hg protocol from your hg binary?
> 
> Thanks. This hint has helped me to look in the right spots.
> 
> Phabricator essentially starts a command-line process to learn about the
> server capabilities like this:
> 
>  echo capabilities | hg -R .../test-repo serve --stdio
> 
> It uses the result for its own http communication. In the output above, there
> used to be httpheader=1024 until 4.0.2, but it has disappeared in 4.1.
> 
> 
> See also the Phabricator sources:
> 
> https://github.com/phacility/phabricator/blob/master/src/applications/diffusion/controller/DiffusionServeController.php#L781
> 
> https://github.com/phacility/phabricator/blob/master/src/applications/diffusion/controller/DiffusionServeController.php#L838
> 
> https://github.com/phacility/phabricator/blob/master/src/applications/diffusion/protocol/DiffusionMercurialWireProtocol.php#L107
> 
> The latter contains further comments about slightly odd censorship of the
> capabilities. This is also the place where the workaround is inserted, see
> again https://isabelle-dev.sketis.net/T8
> 
> 
> A bisection over the hg repository yields the following relevant changeset:
> 
> changeset:   30563:e118233172fe
> user:        Gregory Szorc
> date:        Mon Nov 28 20:46:42 2016 -0800
> files:   mercurial/wireproto.py tests/test-ssh-bundle1.t tests/test-ssh.t
> description:
> wireproto: only advertise HTTP-specific capabilities to HTTP peers (BC)
> 
> Previously, the capabilities list was protocol agnostic and we
> advertised the same capabilities list to all clients, regardless of
> transport protocol.
> 
> A few capabilities are specific to HTTP. I see no good reason why we
> should advertise them to SSH clients. So this patch limits their
> advertisement to HTTP clients.
> 
> This patch is BC, but SSH clients shouldn't be using the removed
> capabilities so there should be no impact.
> 
> 
> What means "BC"?

"breaking change" or "behavior change" depending who you ask. :)

> Maybe Phabricator is a good reason to keep the full information?

Not especially. They're doing it wrong assuming that the --stdio protocol is 
compatible with http, and we've told them that more than once and they've 
responded...negatively. If someone wanted to try and figure out a way to have 
phabricator do the right thing (invoke hg's WSGI application, probably via 
CGI?) that would probably help. I'd certainly be open to mentoring someone 
doing that work, but I can't justify spending time on it myself.

> Do you think you can refine that for a future release of Mercurial?

I'm not sure what you're asking for here...

> 
> In contrast, it is probably difficult to get a patch accepted by the
> Phabricator project, because they are only using rather old hg 2.8.2 for their
> main installation, and 2.6.2, 3.5.1 in their tests:
> https://github.com/phacility/phabricator/blob/master/src/applications/diffusion/protocol/__tests__/DiffusionMercurialWireProtocolTests.php
> 
> 
>   Makarius
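
For readers following along, the capability lookup described in this thread can be sketched in Python. This is only an illustration under stated assumptions: the capability line is taken to be a space-separated list of `name` or `name=value` tokens, the sample line is hypothetical and shortened, and the helper names are made up here. It is not Phabricator's or Mercurial's actual code.

```python
def parse_capabilities(line):
    """Parse a space-separated capabilities line into a dict.

    Tokens look like ``name`` or ``name=value``; bare names map to None.
    """
    caps = {}
    for token in line.split():
        name, sep, value = token.partition("=")
        caps[name] = value if sep else None
    return caps


def format_capabilities(caps):
    """Inverse of parse_capabilities, preserving insertion order."""
    return " ".join(
        name if value is None else f"{name}={value}"
        for name, value in caps.items()
    )


def ensure_httpheader(line, size=1024):
    """Re-add ``httpheader=<size>`` when the stdio output omits it."""
    caps = parse_capabilities(line)
    caps.setdefault("httpheader", str(size))
    return format_capabilities(caps)


# Hypothetical, shortened capability line (assumption, not real hg output).
sample = "lookup branchmap pushkey known getbundle unbundle=HG10GZ,HG10BZ,HG10UN"
print(ensure_httpheader(sample))
```

A line that already advertises `httpheader` is returned unchanged, so the sketch behaves the same against 4.0.2 and 4.1 style output.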



D6846: packaging: script the building of a MacOS installer using a custom python

2020-02-11 Thread indygreg (Gregory Szorc)
indygreg added inline comments.

INLINE COMMENTS

> downloads.py:29
> +'openssl': {
> +'url': 'https://www.openssl.org/source/openssl-1.0.2t.tar.gz',
> +'size': 5355422,

1.0.2u is out and should be used.

> python.py:54
> +   "darwin64-x86_64-cc",
> +   "enable-ec_nistp_64_gcc_128"],
> +  env=env,

Where did you get this line and `enable-cms` from?

> python.py:120
> +   "--with-threads",
> +   "--enable-optimizations"],
> +  env=env,

I _think_ we also want `--enable-lto` for more speed wins.

> build.py:170
> +fp.writelines(hgscript)
> +fp.truncate()
> +

`fp.truncate()`?
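
For context on the question above, here is a minimal Python sketch of when a trailing `truncate()` matters. It assumes the file is opened in `r+` mode and rewritten through the same handle, which the quoted hunk does not show; the helper name and file contents are hypothetical.

```python
import os
import tempfile


def rewrite_in_place(path, transform):
    """Rewrite a text file through a single r+ handle."""
    with open(path, "r+") as fp:
        lines = fp.readlines()
        fp.seek(0)
        fp.writelines(transform(lines))
        # Drop whatever is left of the old content past the current position.
        fp.truncate()


with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "hg")
    with open(script, "w") as fp:
        fp.write("#!/usr/bin/env python\nimport sys\nprint('hello')\n")

    # Swap in a shorter first line, so the new content is shorter than the old.
    rewrite_in_place(script, lambda lines: ["#!/bin/py\n"] + lines[1:])

    with open(script) as fp:
        result = fp.read()

print(result)
```

If the new content is never shorter than the old, or the file is opened with `w`, the call is redundant.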

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6846/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6846

To: mharbison72, #hg-reviewers
Cc: indygreg, marmoute, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6846: packaging: script the building of a MacOS installer using a custom python

2020-02-11 Thread indygreg (Gregory Szorc)
indygreg added a comment.


  macOS supports a `@loader_path` and related magic tokens in rpath to load 
libraries relative to the current binary. See e.g. 
https://blogs.oracle.com/dipol/dynamic-libraries,-rpath,-and-mac-os and 
https://medium.com/@donblas/fun-with-rpath-otool-and-install-name-tool-e3e41ae86172
 for examples.
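
As a rough sketch of how that could be scripted from Python, the helper below only builds the command lines and does not run them. The paths and function names are hypothetical; `install_name_tool -add_rpath` and `otool -l` are the standard macOS tools for adding and inspecting rpath entries.

```python
import posixpath


def loader_relative_rpath(binary, lib_dir):
    """Express lib_dir relative to the directory containing binary."""
    rel = posixpath.relpath(lib_dir, posixpath.dirname(binary))
    return "@loader_path/" + rel


def rpath_commands(binary, lib_dir):
    """Return argv lists that add the rpath and then dump the load commands."""
    rpath = loader_relative_rpath(binary, lib_dir)
    return [
        ["install_name_tool", "-add_rpath", rpath, binary],
        ["otool", "-l", binary],  # LC_RPATH entries appear in this output
    ]


# Hypothetical layout: dist/bin/python3 loads libraries from dist/lib.
for argv in rpath_commands("dist/bin/python3", "dist/lib"):
    print(" ".join(argv))
```

Each list could be passed to `subprocess.run(argv, check=True)` on a macOS build host.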

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6846/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6846

To: mharbison72, #hg-reviewers
Cc: indygreg, marmoute, mercurial-devel


D7930: rust-status: update rust-cpython bridge to account for the changes in core

2020-02-11 Thread Raphaël Gomès
Alphare updated this revision to Diff 20158.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7930?vs=20049&id=20158

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7930/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7930

AFFECTED FILES
  rust/hg-core/src/dirstate/status.rs
  rust/hg-cpython/src/dirstate.rs
  rust/hg-cpython/src/dirstate/status.rs
  rust/hg-cpython/src/exceptions.rs

CHANGE DETAILS

diff --git a/rust/hg-cpython/src/exceptions.rs 
b/rust/hg-cpython/src/exceptions.rs
--- a/rust/hg-cpython/src/exceptions.rs
+++ b/rust/hg-cpython/src/exceptions.rs
@@ -40,3 +40,5 @@
 }
 
 py_exception!(rustext, HgPathPyError, RuntimeError);
+py_exception!(rustext, FallbackError, RuntimeError);
+py_exception!(shared_ref, AlreadyBorrowed, RuntimeError);
diff --git a/rust/hg-cpython/src/dirstate/status.rs 
b/rust/hg-cpython/src/dirstate/status.rs
--- a/rust/hg-cpython/src/dirstate/status.rs
+++ b/rust/hg-cpython/src/dirstate/status.rs
@@ -9,33 +9,34 @@
 //! `hg-core` crate. From Python, this will be seen as
 //! `rustext.dirstate.status`.
 
-use crate::dirstate::DirstateMap;
-use cpython::exc::ValueError;
+use crate::{dirstate::DirstateMap, exceptions::FallbackError};
 use cpython::{
-ObjectProtocol, PyBytes, PyErr, PyList, PyObject, PyResult, PyTuple,
-Python, PythonObject, ToPyObject,
+exc::ValueError, ObjectProtocol, PyBytes, PyErr, PyList, PyObject,
+PyResult, PyTuple, Python, PythonObject, ToPyObject,
 };
-use hg::utils::hg_path::HgPathBuf;
 use hg::{
-matchers::{AlwaysMatcher, FileMatcher},
-status,
-utils::{files::get_path_from_bytes, hg_path::HgPath},
-DirstateStatus,
+matchers::{AlwaysMatcher, FileMatcher, IncludeMatcher},
+parse_pattern_syntax, status,
+utils::{
+files::{get_bytes_from_path, get_path_from_bytes},
+hg_path::{HgPath, HgPathBuf},
+},
+BadMatch, DirstateStatus, IgnorePattern, PatternFileWarning, StatusError,
+StatusOptions,
 };
-use std::borrow::Borrow;
+use std::borrow::{Borrow, Cow};
 
 /// This will be useless once trait impls for collection are added to `PyBytes`
 /// upstream.
-fn collect_pybytes_list<P: AsRef<HgPath>>(
+fn collect_pybytes_list(
 py: Python,
-collection: &[P],
+collection: &[impl AsRef<HgPath>],
 ) -> PyList {
 let list = PyList::new(py, &[]);
 
-for (i, path) in collection.iter().enumerate() {
-list.insert(
+for path in collection.iter() {
+list.append(
 py,
-i,
 PyBytes::new(py, path.as_ref().as_bytes()).into_object(),
 )
 }
@@ -43,34 +44,97 @@
 list
 }
 
+fn collect_bad_matches(
+py: Python,
+collection: &[(impl AsRef<HgPath>, BadMatch)],
+) -> PyResult<PyList> {
+let list = PyList::new(py, &[]);
+
+let os = py.import("os")?;
+let get_error_message = |code: i32| -> PyResult<_> {
+os.call(
+py,
+"strerror",
+PyTuple::new(py, &[code.to_py_object(py).into_object()]),
+None,
+)
+};
+
+for (path, bad_match) in collection.iter() {
+let message = match bad_match {
+BadMatch::OsError(code) => get_error_message(*code)?,
+BadMatch::BadType(bad_type) => format!(
+"unsupported file type (type is {})",
+bad_type.to_string()
+)
+.to_py_object(py)
+.into_object(),
+};
+list.append(
+py,
+(PyBytes::new(py, path.as_ref().as_bytes()), message)
+.to_py_object(py)
+.into_object(),
+)
+}
+
+Ok(list)
+}
+
+fn handle_fallback(py: Python, err: StatusError) -> PyErr {
+match err {
+StatusError::Pattern(e) => {
+PyErr::new::<FallbackError, _>(py, e.to_string())
+}
+e => PyErr::new::<ValueError, _>(py, e.to_string()),
+}
+}
+
 pub fn status_wrapper(
 py: Python,
 dmap: DirstateMap,
 matcher: PyObject,
 root_dir: PyObject,
-list_clean: bool,
+ignore_files: PyList,
+check_exec: bool,
 last_normal_time: i64,
-check_exec: bool,
-) -> PyResult<(PyList, PyList, PyList, PyList, PyList, PyList, PyList)> {
+list_clean: bool,
+list_ignored: bool,
+list_unknown: bool,
+) -> PyResult {
 let bytes = root_dir.extract::<PyBytes>(py)?;
 let root_dir = get_path_from_bytes(bytes.data(py));
 
 let dmap: DirstateMap = dmap.to_py_object(py);
 let dmap = dmap.get_inner(py);
 
+let ignore_files: PyResult<Vec<_>> = ignore_files
+.iter(py)
+.map(|b| {
+let file = b.extract::<PyBytes>(py)?;
+Ok(get_path_from_bytes(file.data(py)).to_owned())
+})
+.collect();
+let ignore_files = ignore_files?;
+
 match matcher.get_type(py).name(py).borrow() {
 "alwaysmatcher" => {
 let matcher = AlwaysMatcher;
-let (lookup, status_res) = status(
+let ((lookup, status_res), warnings) = status(
  

D7922: rust-matchers: add function to generate a regex matcher function

2020-02-11 Thread Raphaël Gomès
Alphare added inline comments.

INLINE COMMENTS

> martinvonz wrote in matchers.rs:224
> Hmm, I don't like to replicate this into Rust. I argued for a long time with 
> Boris over a year ago that we should see if we can remove it from Python. He 
> said they (Octobus, I think) would look into that if I would just queue the 
> workaround for the time being. Could you see if you can simplify the Python 
> code first instead?
> 
> See https://patchwork.mercurial-scm.org/patch/36755/ for discussion.

I am having a little trouble reading the patchwork thread, but I gather that 
you want to get rid of the preventive splitting of patterns and just let the 
regex engine handle it on its own? Was this measure taken because of a bug in 
Python's `re` or because its exceptions were too coarse/unusable?
I'll look into the behavior of Re2, in that case.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7922/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7922

To: Alphare, #hg-reviewers, pulkit, martinvonz
Cc: martinvonz, durin42, kevincox, mercurial-devel


D7119: rust-dirstatemap: remove additional lookup in dirstate.matches

2020-02-11 Thread Raphaël Gomès
Alphare added a comment.


  In D7119#119792, @marmoute wrote:
  
  > @Alphare so what should we do of this patch ?
  
  IMO it should still be valid, it's harmless at best. I don't remember having 
strong performance numbers. Now that most of the status is done in Rust, it 
should matter less, even though it could still help shave off a few 
milliseconds.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7119/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7119

To: Alphare, #hg-reviewers
Cc: martinvonz, marmoute, mercurial-devel


D7922: rust-matchers: add function to generate a regex matcher function

2020-02-11 Thread Raphaël Gomès
Alphare updated this revision to Diff 20154.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7922?vs=19401&id=20154

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7922/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7922

AFFECTED FILES
  rust/hg-core/src/lib.rs
  rust/hg-core/src/matchers.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/matchers.rs b/rust/hg-core/src/matchers.rs
--- a/rust/hg-core/src/matchers.rs
+++ b/rust/hg-core/src/matchers.rs
@@ -7,7 +7,12 @@
 
 //! Structs and types for matching files and directories.
 
-use crate::{utils::hg_path::HgPath, DirsMultiset, DirstateMapError};
+#[cfg(feature = "with-re2")]
+use crate::re2::Re2;
+use crate::{
+filepatterns::PatternResult, utils::hg_path::HgPath, DirsMultiset,
+DirstateMapError, PatternError,
+};
 use std::collections::HashSet;
 use std::iter::FromIterator;
 use std::ops::Deref;
@@ -215,6 +220,28 @@
 true
 }
 }
+
+const MAX_RE_SIZE: usize = 2;
+
+#[cfg(feature = "with-re2")]
+/// Returns a function that matches an `HgPath` against the given regex
+/// pattern.
+///
+/// This can fail when the pattern is invalid or not supported by the
+/// underlying engine `Re2`, for instance anything with back-references.
+fn re_matcher(
+pattern: &[u8],
+) -> PatternResult<impl Fn(&HgPath) -> bool + Sync> {
+let regex = Re2::new(pattern);
+let regex = regex.map_err(|e| PatternError::UnsupportedSyntax(e))?;
+Ok(move |path: &HgPath| regex.is_match(path.as_bytes()))
+}
+
+#[cfg(not(feature = "with-re2"))]
+fn re_matcher(_: &[u8]) -> PatternResult<Box<dyn Fn(&HgPath) -> bool + Sync>> {
+Err(PatternError::Re2NotInstalled)
+}
+
 #[cfg(test)]
 mod tests {
 use super::*;
diff --git a/rust/hg-core/src/lib.rs b/rust/hg-core/src/lib.rs
--- a/rust/hg-core/src/lib.rs
+++ b/rust/hg-core/src/lib.rs
@@ -126,6 +126,9 @@
 /// Needed a pattern that can be turned into a regex but got one that
 /// can't. This should only happen through programmer error.
 NonRegexPattern(IgnorePattern),
+/// This is temporary, see `re2/mod.rs`.
+/// This will cause a fallback to Python.
+Re2NotInstalled,
 }
 
 impl ToString for PatternError {
@@ -148,6 +151,10 @@
 PatternError::NonRegexPattern(pattern) => {
 format!("'{:?}' cannot be turned into a regex", pattern)
 }
+PatternError::Re2NotInstalled => {
+"Re2 is not installed, cannot use regex functionality."
+.to_string()
+}
 }
 }
 }



To: Alphare, #hg-reviewers, pulkit, martinvonz
Cc: martinvonz, durin42, kevincox, mercurial-devel


D7929: rust-status: add bare `hg status` support in hg-core

2020-02-11 Thread Raphaël Gomès
Alphare updated this revision to Diff 20157.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7929?vs=20048&id=20157

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7929/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7929

AFFECTED FILES
  rust/hg-core/src/dirstate/status.rs
  rust/hg-core/src/lib.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/lib.rs b/rust/hg-core/src/lib.rs
--- a/rust/hg-core/src/lib.rs
+++ b/rust/hg-core/src/lib.rs
@@ -13,7 +13,9 @@
 dirs_multiset::{DirsMultiset, DirsMultisetIter},
 dirstate_map::DirstateMap,
 parsers::{pack_dirstate, parse_dirstate, PARENT_SIZE},
-status::{status, DirstateStatus, StatusOptions},
+status::{
+status, BadMatch, BadType, DirstateStatus, StatusError, StatusOptions,
+},
 CopyMap, CopyMapIter, DirstateEntry, DirstateParents, EntryState,
 StateMap, StateMapIter,
 };
diff --git a/rust/hg-core/src/dirstate/status.rs 
b/rust/hg-core/src/dirstate/status.rs
--- a/rust/hg-core/src/dirstate/status.rs
+++ b/rust/hg-core/src/dirstate/status.rs
@@ -11,22 +11,31 @@
 
 use crate::{
 dirstate::SIZE_FROM_OTHER_PARENT,
-matchers::{Matcher, VisitChildrenSet},
+filepatterns::PatternFileWarning,
+matchers::{get_ignore_function, Matcher, VisitChildrenSet},
 utils::{
-files::HgMetadata,
+files::{find_dirs, HgMetadata},
 hg_path::{
 hg_path_to_path_buf, os_string_to_hg_path_buf, HgPath, HgPathBuf,
+HgPathError,
 },
+path_auditor::PathAuditor,
 },
 CopyMap, DirstateEntry, DirstateMap, EntryState, FastHashMap,
+PatternError,
 };
+use lazy_static::lazy_static;
 use rayon::prelude::*;
-use std::borrow::Cow;
-use std::collections::{HashSet, VecDeque};
-use std::fs::{read_dir, DirEntry};
-use std::io::ErrorKind;
-use std::ops::Deref;
-use std::path::Path;
+use std::collections::VecDeque;
+use std::{
+borrow::Cow,
+collections::HashSet,
+fs::{read_dir, DirEntry},
+io::ErrorKind,
+ops::Deref,
+path::Path,
+sync::mpsc,
+};
 
 /// Wrong type of file from a `BadMatch`
 /// Note: a lot of those don't exist on all platforms.
@@ -50,6 +59,7 @@
 /// Marker enum used to dispatch new status entries into the right collections.
 /// Is similar to `crate::EntryState`, but represents the transient state of
 /// entries during the lifetime of a command.
+#[derive(Debug)]
 enum Dispatch {
 Unsure,
 Modified,
@@ -150,7 +160,7 @@
 } else if options.list_clean {
 Dispatch::Clean
 } else {
-Dispatch::Unknown
+Dispatch::None
 }
 }
 EntryState::Merged => Dispatch::Modified,
@@ -174,57 +184,95 @@
 }
 }
 
+lazy_static! {
+static ref DEFAULT_WORK: HashSet<&'static HgPath> = {
+let mut h = HashSet::new();
+h.insert(HgPath::new(b""));
+h
+};
+}
+
 /// Get stat data about the files explicitly specified by match.
 /// TODO subrepos
 fn walk_explicit<'a>(
-files: &'a HashSet<&HgPath>,
+files: Option<&'a HashSet<&HgPath>>,
 dmap: &'a DirstateMap,
 root_dir: impl AsRef<Path> + Sync + Send,
+work: mpsc::Sender<&'a HgPath>,
 options: StatusOptions,
-) -> impl ParallelIterator> {
-files.par_iter().filter_map(move |filename| {
-// TODO normalization
-let normalized = filename.as_ref();
+) -> impl ParallelIterator, Dispatch)>> {
+files
+.unwrap_or(&DEFAULT_WORK)
+.par_iter()
+.map_with(work, move |work, filename| {
+// TODO normalization
+let normalized = filename.as_ref();
 
-let buf = match hg_path_to_path_buf(normalized) {
-Ok(x) => x,
-Err(e) => return Some(Err(e.into())),
-};
-let target = root_dir.as_ref().join(buf);
-let st = target.symlink_metadata();
-match st {
-Ok(meta) => {
-let file_type = meta.file_type();
-if file_type.is_file() || file_type.is_symlink() {
-if let Some(entry) = dmap.get(normalized) {
+let buf = match hg_path_to_path_buf(normalized) {
+Ok(x) => x,
+Err(e) => return Some(Err(e.into())),
+};
+let target = root_dir.as_ref().join(buf);
+let st = target.symlink_metadata();
+let in_dmap = dmap.get(normalized);
+match st {
+Ok(meta) => {
+let file_type = meta.file_type();
+if file_type.is_file() || file_type.is_symlink() {
+if let Some(entry) = in_dmap {
+return Some(Ok((
+Cow::Borrowed(normalized),
+dispatch_found(
+,
+*entry,
+

D7922: rust-matchers: add function to generate a regex matcher function

2020-02-11 Thread Raphaël Gomès
Alphare added inline comments.

INLINE COMMENTS

> Alphare wrote in matchers.rs:224
> I am having a little trouble reading the patchwork thread, but I gather that 
> you want to get rid of the preventive splitting of patterns and just let the 
> regex engine handle it on its own? Was this measure taken because of a bug in 
> Python's `re` or because its exceptions were too coarse/unusable?
> I'll look into the behavior of Re2, in that case.

I've looked into the behavior of Re2. It will return an error if the DFA runs 
out of memory, which seems perfectly reasonable. 
I will simplify the Rust code, however I feel like you're better suited than I 
am to fix the Python side of things, since I don't really understand the 
ins-and-outs of the problem.
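
To make the proposal concrete, here is a small Python sketch of "compile once and let the engine report failure" instead of splitting patterns up front. It uses the standard `re` module as a stand-in for Re2, and the function name and `(matcher, error)` return shape are illustrative assumptions, not the code in `match.py` or `matchers.rs`.

```python
import re


def build_matcher(patterns):
    """Combine byte patterns into one regex.

    Returns ``(matcher, None)`` on success, or ``(None, message)`` when the
    engine rejects the combined pattern, so the caller can fall back.
    """
    combined = b"|".join(b"(?:" + p + b")" for p in patterns)
    try:
        regex = re.compile(combined)
    except (re.error, OverflowError) as exc:
        # No preventive splitting: the engine's own error is surfaced as-is.
        return None, str(exc)
    return (lambda path: regex.match(path) is not None), None


matcher, error = build_matcher([rb"foo/", rb".*\.pyc$"])
print(matcher(b"foo/bar"), matcher(b"README"), error)

broken, reason = build_matcher([rb"(unclosed"])
print(broken, reason)
```

The second call shows the failure path: no matcher is returned and the engine's message is handed back to the caller.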

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7922/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7922

To: Alphare, #hg-reviewers, pulkit, martinvonz
Cc: martinvonz, durin42, kevincox, mercurial-devel


D7922: rust-matchers: add function to generate a regex matcher function

2020-02-11 Thread Raphaël Gomès
Alphare updated this revision to Diff 20160.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7922?vs=20154&id=20160

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7922/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7922

AFFECTED FILES
  rust/hg-core/src/lib.rs
  rust/hg-core/src/matchers.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/matchers.rs b/rust/hg-core/src/matchers.rs
--- a/rust/hg-core/src/matchers.rs
+++ b/rust/hg-core/src/matchers.rs
@@ -7,7 +7,12 @@
 
 //! Structs and types for matching files and directories.
 
-use crate::{utils::hg_path::HgPath, DirsMultiset, DirstateMapError};
+#[cfg(feature = "with-re2")]
+use crate::re2::Re2;
+use crate::{
+filepatterns::PatternResult, utils::hg_path::HgPath, DirsMultiset,
+DirstateMapError, PatternError,
+};
 use std::collections::HashSet;
 use std::iter::FromIterator;
 use std::ops::Deref;
@@ -215,6 +220,26 @@
 true
 }
 }
+
+#[cfg(feature = "with-re2")]
+/// Returns a function that matches an `HgPath` against the given regex
+/// pattern.
+///
+/// This can fail when the pattern is invalid or not supported by the
+/// underlying engine `Re2`, for instance anything with back-references.
+fn re_matcher(
+pattern: &[u8],
+) -> PatternResult<impl Fn(&HgPath) -> bool + Sync> {
+let regex = Re2::new(pattern);
+let regex = regex.map_err(|e| PatternError::UnsupportedSyntax(e))?;
+Ok(move |path: &HgPath| regex.is_match(path.as_bytes()))
+}
+
+#[cfg(not(feature = "with-re2"))]
+fn re_matcher(_: &[u8]) -> PatternResult<Box<dyn Fn(&HgPath) -> bool + Sync>> {
+Err(PatternError::Re2NotInstalled)
+}
+
 #[cfg(test)]
 mod tests {
 use super::*;
diff --git a/rust/hg-core/src/lib.rs b/rust/hg-core/src/lib.rs
--- a/rust/hg-core/src/lib.rs
+++ b/rust/hg-core/src/lib.rs
@@ -126,6 +126,9 @@
 /// Needed a pattern that can be turned into a regex but got one that
 /// can't. This should only happen through programmer error.
 NonRegexPattern(IgnorePattern),
+/// This is temporary, see `re2/mod.rs`.
+/// This will cause a fallback to Python.
+Re2NotInstalled,
 }
 
 impl ToString for PatternError {
@@ -148,6 +151,10 @@
 PatternError::NonRegexPattern(pattern) => {
 format!("'{:?}' cannot be turned into a regex", pattern)
 }
+PatternError::Re2NotInstalled => {
+"Re2 is not installed, cannot use regex functionality."
+.to_string()
+}
 }
 }
 }



To: Alphare, #hg-reviewers, pulkit, martinvonz
Cc: martinvonz, durin42, kevincox, mercurial-devel


D8086: rust-status: refactor options into a `StatusOptions` struct

2020-02-11 Thread Raphaël Gomès
Alphare updated this revision to Diff 20155.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D8086?vs=20044&id=20155

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D8086/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D8086

AFFECTED FILES
  rust/hg-core/src/dirstate/status.rs
  rust/hg-core/src/lib.rs
  rust/hg-core/src/matchers.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/matchers.rs b/rust/hg-core/src/matchers.rs
--- a/rust/hg-core/src/matchers.rs
+++ b/rust/hg-core/src/matchers.rs
@@ -704,7 +704,11 @@
 
 assert_eq!(
 roots_dirs_and_parents().unwrap(),
-RootsDirsAndParents {roots, dirs, parents}
+RootsDirsAndParents {
+roots,
+dirs,
+parents
+}
 );
 }
 
diff --git a/rust/hg-core/src/lib.rs b/rust/hg-core/src/lib.rs
--- a/rust/hg-core/src/lib.rs
+++ b/rust/hg-core/src/lib.rs
@@ -13,7 +13,7 @@
 dirs_multiset::{DirsMultiset, DirsMultisetIter},
 dirstate_map::DirstateMap,
 parsers::{pack_dirstate, parse_dirstate, PARENT_SIZE},
-status::{status, StatusResult},
+status::{status, StatusOptions, StatusResult},
 CopyMap, CopyMapIter, DirstateEntry, DirstateParents, EntryState,
 StateMap, StateMapIter,
 };
diff --git a/rust/hg-core/src/dirstate/status.rs 
b/rust/hg-core/src/dirstate/status.rs
--- a/rust/hg-core/src/dirstate/status.rs
+++ b/rust/hg-core/src/dirstate/status.rs
@@ -83,9 +83,7 @@
 entry: DirstateEntry,
 metadata: HgMetadata,
 copy_map: &CopyMap,
-check_exec: bool,
-list_clean: bool,
-last_normal_time: i64,
+options: StatusOptions,
 ) -> Dispatch {
 let DirstateEntry {
 state,
@@ -105,7 +103,7 @@
 EntryState::Normal => {
 let size_changed = mod_compare(size, st_size as i32);
 let mode_changed =
-(mode ^ st_mode as i32) & 0o100 != 0o000 && check_exec;
+(mode ^ st_mode as i32) & 0o100 != 0o000 && options.check_exec;
 let metadata_changed = size >= 0 && (size_changed || mode_changed);
 let other_parent = size == SIZE_FROM_OTHER_PARENT;
 if metadata_changed
@@ -115,14 +113,14 @@
 Dispatch::Modified
 } else if mod_compare(mtime, st_mtime as i32) {
 Dispatch::Unsure
-} else if st_mtime == last_normal_time {
+} else if st_mtime == options.last_normal_time {
 // the file may have just been marked as normal and
 // it may have changed in the same second without
 // changing its size. This can happen if we quickly
 // do multiple commits. Force lookup, so we don't
 // miss such a racy file change.
 Dispatch::Unsure
-} else if list_clean {
+} else if options.list_clean {
 Dispatch::Clean
 } else {
 Dispatch::Unknown
@@ -155,9 +153,7 @@
 files: &'a HashSet<&HgPath>,
 dmap: &'a DirstateMap,
 root_dir: impl AsRef<Path> + Sync + Send,
-check_exec: bool,
-list_clean: bool,
-last_normal_time: i64,
+options: StatusOptions,
 ) -> impl ParallelIterator> {
 files.par_iter().filter_map(move |filename| {
 // TODO normalization
@@ -181,9 +177,7 @@
 *entry,
 HgMetadata::from_metadata(meta),
 _map,
-check_exec,
-list_clean,
-last_normal_time,
+options,
 ),
 )));
 }
@@ -206,14 +200,23 @@
 })
 }
 
+#[derive(Debug, Copy, Clone)]
+pub struct StatusOptions {
+/// Remember the most recent modification timeslot for status, to make
+/// sure we won't miss future size-preserving file content modifications
+/// that happen within the same timeslot.
+pub last_normal_time: i64,
+/// Whether we are on a filesystem with UNIX-like exec flags
+pub check_exec: bool,
+pub list_clean: bool,
+}
+
 /// Stat all entries in the `DirstateMap` and mark them for dispatch into
 /// the relevant collections.
 fn stat_dmap_entries(
 dmap: &DirstateMap,
 root_dir: impl AsRef<Path> + Sync + Send,
-check_exec: bool,
-list_clean: bool,
-last_normal_time: i64,
+options: StatusOptions,
 ) -> impl ParallelIterator> {
 dmap.par_iter().map(move |(filename, entry)| {
 let filename: &HgPath = filename;
@@ -234,9 +237,7 @@
 *entry,
 HgMetadata::from_metadata(m),
 _map,
-check_exec,
-list_clean,
-last_normal_time,
+options,
 ),
 )),
 Err(ref e)
@@ -303,31 
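
As an aside for readers more at home in Python, the refactoring in this (truncated) diff can be sketched as one immutable options object passed around in place of three loose arguments. The class mirrors the fields of the Rust `StatusOptions` struct above; the helper functions and sample values are hypothetical and only cover the exec-bit and same-timeslot checks visible in the hunks.

```python
from dataclasses import dataclass

EXEC_BIT = 0o100  # owner-executable bit, as tested in the diff


@dataclass(frozen=True)
class StatusOptions:
    last_normal_time: int  # most recent modification timeslot seen by status
    check_exec: bool       # filesystem has UNIX-like exec flags
    list_clean: bool


def mode_changed(recorded_mode, stat_mode, options):
    """True when the exec bit differs and the filesystem supports it."""
    return bool((recorded_mode ^ stat_mode) & EXEC_BIT) and options.check_exec


def needs_lookup(st_mtime, options):
    """True when the file was touched in the same timeslot as the last status."""
    return st_mtime == options.last_normal_time


opts = StatusOptions(last_normal_time=100, check_exec=True, list_clean=False)
print(mode_changed(0o644, 0o755, opts), needs_lookup(100, opts))
```

Adding a flag such as `list_ignored` later then touches the options type and the code that reads it, not every intermediate signature.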

D7931: rust-status: use bare hg status fastpath from Python

2020-02-11 Thread Raphaël Gomès
Alphare updated this revision to Diff 20159.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7931?vs=20041&id=20159

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7931/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7931

AFFECTED FILES
  mercurial/dirstate.py
  mercurial/match.py
  tests/test-subrepo-deep-nested-change.t

CHANGE DETAILS

diff --git a/tests/test-subrepo-deep-nested-change.t 
b/tests/test-subrepo-deep-nested-change.t
--- a/tests/test-subrepo-deep-nested-change.t
+++ b/tests/test-subrepo-deep-nested-change.t
@@ -355,6 +355,11 @@
   R sub1/sub2/folder/test.txt
   ! sub1/.hgsub
   ? sub1/x.hgsub
+  $ hg status -R sub1
+  warning: subrepo spec file 'sub1/.hgsub' not found
+  R .hgsubstate
+  ! .hgsub
+  ? x.hgsub
   $ mv sub1/x.hgsub sub1/.hgsub
   $ hg update -Cq
   $ touch sub1/foo
diff --git a/mercurial/match.py b/mercurial/match.py
--- a/mercurial/match.py
+++ b/mercurial/match.py
@@ -45,6 +45,7 @@
 
 propertycache = util.propertycache
 
+rustmod = policy.importrust('dirstate')
 
 def _rematcher(regex):
 '''compile the regexp with the best available regexp engine and return a
@@ -666,7 +667,10 @@
 class includematcher(basematcher):
 def __init__(self, root, kindpats, badfn=None):
 super(includematcher, self).__init__(badfn)
-
+if rustmod is not None:
+# We need to pass the patterns to Rust because they can contain
+# patterns from the user interface
+self._kindpats = kindpats
 self._pats, self.matchfn = _buildmatch(kindpats, b'(?:/|$)', root)
 self._prefix = _prefix(kindpats)
 roots, dirs, parents = _rootsdirsandparents(kindpats)
diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py
--- a/mercurial/dirstate.py
+++ b/mercurial/dirstate.py
@@ -27,6 +27,7 @@
 policy,
 pycompat,
 scmutil,
+sparse,
 txnutil,
 util,
 )
@@ -1083,7 +1084,7 @@
 results[next(iv)] = st
 return results
 
-def _rust_status(self, matcher, list_clean):
+def _rust_status(self, matcher, list_clean, list_ignored, list_unknown):
 # Force Rayon (Rust parallelism library) to respect the number of
 # workers. This is a temporary workaround until Rust code knows
 # how to read the config file.
@@ -1101,16 +1102,45 @@
 added,
 removed,
 deleted,
+clean,
+ignored,
 unknown,
-clean,
+warnings,
+bad,
 ) = rustmod.status(
 self._map._rustmap,
 matcher,
 self._rootdir,
-bool(list_clean),
+self._ignorefiles(),
+self._checkexec,
 self._lastnormaltime,
-self._checkexec,
+bool(list_clean),
+bool(list_ignored),
+bool(list_unknown),
 )
+if self._ui.warn:
+for item in warnings:
+if isinstance(item, tuple):
+file_path, syntax = item
+msg = _(b"%s: ignoring invalid syntax '%s'\n") % (
+file_path,
+syntax,
+)
+self._ui.warn(msg)
+else:
+msg = _(b"skipping unreadable pattern file '%s': %s\n")
+self._ui.warn(
+msg
+% (
+pathutil.canonpath(
+self._rootdir, self._rootdir, item
+),
+b"No such file or directory",
+)
+)
+
+for (fn, message) in bad:
+matcher.bad(fn, encoding.strtolocal(message))
 
 status = scmutil.status(
 modified=modified,
@@ -1118,7 +1148,7 @@
 removed=removed,
 deleted=deleted,
 unknown=unknown,
-ignored=[],
+ignored=ignored,
 clean=clean,
 )
 return (lookup, status)
@@ -1148,26 +1178,35 @@
 
 use_rust = True
 
-allowed_matchers = (matchmod.alwaysmatcher, matchmod.exactmatcher)
+allowed_matchers = (
+matchmod.alwaysmatcher,
+matchmod.exactmatcher,
+matchmod.includematcher,
+)
 
 if rustmod is None:
 use_rust = False
+elif self._checkcase:
+# Case-insensitive filesystems are not handled yet
+use_rust = False
 elif subrepos:
 use_rust = False
-elif bool(listunknown):
-# Pathauditor does not exist yet in Rust, unknown files
-# can't be trusted.
+elif sparse.enabled:
 use_rust = False
-elif self._ignorefiles() and listignored:
-# Rust has no ignore mechanism yet, so don't use Rust for

D7924: rust-matchers: add `build_regex_match` function

2020-02-11 Thread Raphaël Gomès
Alphare updated this revision to Diff 20162.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7924?vs=20042&id=20162

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7924/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7924

AFFECTED FILES
  rust/hg-core/src/matchers.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/matchers.rs b/rust/hg-core/src/matchers.rs
--- a/rust/hg-core/src/matchers.rs
+++ b/rust/hg-core/src/matchers.rs
@@ -10,7 +10,7 @@
 #[cfg(feature = "with-re2")]
 use crate::re2::Re2;
 use crate::{
-filepatterns::PatternResult,
+filepatterns::{build_single_regex, PatternResult},
 utils::hg_path::{HgPath, HgPathBuf},
 DirsMultiset, DirstateMapError, IgnorePattern, PatternError,
 PatternSyntax,
@@ -242,6 +242,24 @@
 Err(PatternError::Re2NotInstalled)
 }
 
+/// Returns the regex pattern and a function that matches an `HgPath` against
+/// said regex formed by the given ignore patterns.
+fn build_regex_match<'a>(
+ignore_patterns: &'a [&'a IgnorePattern],
+) -> PatternResult<(Vec<u8>, Box<dyn Fn(&HgPath) -> bool + Sync>)> {
+let regexps: Result<Vec<_>, PatternError> = ignore_patterns
+.into_iter()
+.map(|k| build_single_regex(*k))
+.collect();
+let regexps = regexps?;
+let full_regex = regexps.join(&b'|');
+
+let matcher = re_matcher(&full_regex)?;
+let func = Box::new(move |filename: &HgPath| matcher(filename));
+
+Ok((full_regex, func))
+}
+
 /// Returns roots and directories corresponding to each pattern.
 ///
 /// This calculates the roots and directories exactly matching the patterns and



To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7923: rust-matchers: add functions to get roots, dirs and parents from patterns

2020-02-11 Thread Raphaël Gomès
Alphare updated this revision to Diff 20161.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7923?vs=19939&id=20161

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7923/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7923

AFFECTED FILES
  rust/hg-core/src/matchers.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/matchers.rs b/rust/hg-core/src/matchers.rs
--- a/rust/hg-core/src/matchers.rs
+++ b/rust/hg-core/src/matchers.rs
@@ -10,8 +10,10 @@
 #[cfg(feature = "with-re2")]
 use crate::re2::Re2;
 use crate::{
-filepatterns::PatternResult, utils::hg_path::HgPath, DirsMultiset,
-DirstateMapError, PatternError,
+filepatterns::PatternResult,
+utils::hg_path::{HgPath, HgPathBuf},
+DirsMultiset, DirstateMapError, IgnorePattern, PatternError,
+PatternSyntax,
 };
 use std::collections::HashSet;
 use std::iter::FromIterator;
@@ -240,10 +242,156 @@
 Err(PatternError::Re2NotInstalled)
 }
 
+/// Returns roots and directories corresponding to each pattern.
+///
+/// This calculates the roots and directories exactly matching the patterns and
+/// returns a tuple of (roots, dirs). It does not return other directories
+/// which may also need to be considered, like the parent directories.
+fn roots_and_dirs(
+ignore_patterns: &[IgnorePattern],
+) -> (Vec<HgPathBuf>, Vec<HgPathBuf>) {
+let mut roots = Vec::new();
+let mut dirs = Vec::new();
+
+for ignore_pattern in ignore_patterns {
+let IgnorePattern {
+syntax, pattern, ..
+} = ignore_pattern;
+match syntax {
+PatternSyntax::RootGlob | PatternSyntax::Glob => {
+let mut root = vec![];
+
+for p in pattern.split(|c| *c == b'/') {
+if p.iter().any(|c| match *c {
+b'[' | b'{' | b'*' | b'?' => true,
+_ => false,
+}) {
+break;
+}
+root.push(HgPathBuf::from_bytes(p));
+}
+let buf =
+root.iter().fold(HgPathBuf::new(), |acc, r| acc.join(r));
+roots.push(buf);
+}
+PatternSyntax::Path | PatternSyntax::RelPath => {
+let pat = HgPath::new(if pattern == b"." {
+&[] as &[u8]
+} else {
+pattern
+});
+roots.push(pat.to_owned());
+}
+PatternSyntax::RootFiles => {
+let pat = if pattern == b"." {
+&[] as &[u8]
+} else {
+pattern
+};
+dirs.push(HgPathBuf::from_bytes(pat));
+}
+_ => {
+roots.push(HgPathBuf::new());
+}
+}
+}
+(roots, dirs)
+}
+
+/// Paths extracted from patterns
+#[derive(Debug, PartialEq)]
+struct RootsDirsAndParents {
+/// Directories to match recursively
+pub roots: HashSet<HgPathBuf>,
+/// Directories to match non-recursively
+pub dirs: HashSet<HgPathBuf>,
+/// Implicitly required directories to go to items in either roots or dirs
+pub parents: HashSet<HgPathBuf>,
+}
+
+/// Extract roots, dirs and parents from patterns.
+fn roots_dirs_and_parents(
+ignore_patterns: &[IgnorePattern],
+) -> PatternResult<RootsDirsAndParents> {
+let (roots, dirs) = roots_and_dirs(ignore_patterns);
+
+let mut parents = HashSet::new();
+
+parents.extend(
+DirsMultiset::from_manifest()
+.map_err(|e| match e {
+DirstateMapError::InvalidPath(e) => e,
+_ => unreachable!(),
+})?
+.iter()
+.map(|k| k.to_owned()),
+);
+parents.extend(
+DirsMultiset::from_manifest()
+.map_err(|e| match e {
+DirstateMapError::InvalidPath(e) => e,
+_ => unreachable!(),
+})?
+.iter()
+.map(|k| k.to_owned()),
+);
+
+Ok(RootsDirsAndParents {
+roots: HashSet::from_iter(roots),
+dirs: HashSet::from_iter(dirs),
+parents,
+})
+}
+
 #[cfg(test)]
 mod tests {
 use super::*;
 use pretty_assertions::assert_eq;
+use std::path::Path;
+
+#[test]
+fn test_roots_and_dirs() {
+let pats = vec![
+IgnorePattern::new(PatternSyntax::Glob, b"g/h/*", Path::new("")),
+IgnorePattern::new(PatternSyntax::Glob, b"g/h", Path::new("")),
+IgnorePattern::new(PatternSyntax::Glob, b"g*", Path::new("")),
+];
+let (roots, dirs) = roots_and_dirs(&pats);
+
+assert_eq!(
+roots,
+vec!(
+HgPathBuf::from_bytes(b"g/h"),
+HgPathBuf::from_bytes(b"g/h"),
+HgPathBuf::new()
+),
+);
+assert_eq!(dirs, vec!());
+}
+
+#[test]
+fn test_roots_dirs_and_parents() {
+

D8087: rust-status: rename `StatusResult` to `DirstateStatus`

2020-02-11 Thread Raphaël Gomès
Alphare updated this revision to Diff 20156.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D8087?vs=20045&id=20156

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D8087/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D8087

AFFECTED FILES
  rust/hg-core/src/dirstate/status.rs
  rust/hg-core/src/lib.rs
  rust/hg-cpython/src/dirstate/status.rs

CHANGE DETAILS

diff --git a/rust/hg-cpython/src/dirstate/status.rs b/rust/hg-cpython/src/dirstate/status.rs
--- a/rust/hg-cpython/src/dirstate/status.rs
+++ b/rust/hg-cpython/src/dirstate/status.rs
@@ -20,7 +20,7 @@
 matchers::{AlwaysMatcher, FileMatcher},
 status,
 utils::{files::get_path_from_bytes, hg_path::HgPath},
-StatusResult,
+DirstateStatus,
 };
 use std::borrow::Borrow;
 
@@ -114,7 +114,7 @@
 
 fn build_response(
 lookup: Vec<>,
-status_res: StatusResult,
+status_res: DirstateStatus,
 py: Python,
 ) -> PyResult<(PyList, PyList, PyList, PyList, PyList, PyList, PyList)> {
 let modified = collect_pybytes_list(py, status_res.modified.as_ref());
diff --git a/rust/hg-core/src/lib.rs b/rust/hg-core/src/lib.rs
--- a/rust/hg-core/src/lib.rs
+++ b/rust/hg-core/src/lib.rs
@@ -13,7 +13,7 @@
 dirs_multiset::{DirsMultiset, DirsMultisetIter},
 dirstate_map::DirstateMap,
 parsers::{pack_dirstate, parse_dirstate, PARENT_SIZE},
-status::{status, StatusOptions, StatusResult},
+status::{status, DirstateStatus, StatusOptions},
 CopyMap, CopyMapIter, DirstateEntry, DirstateParents, EntryState,
 StateMap, StateMapIter,
 };
diff --git a/rust/hg-core/src/dirstate/status.rs b/rust/hg-core/src/dirstate/status.rs
--- a/rust/hg-core/src/dirstate/status.rs
+++ b/rust/hg-core/src/dirstate/status.rs
@@ -255,7 +255,7 @@
 })
 }
 
-pub struct StatusResult<'a> {
+pub struct DirstateStatus<'a> {
 pub modified: Vec<&'a HgPath>,
 pub added: Vec<&'a HgPath>,
 pub removed: Vec<&'a HgPath>,
@@ -267,7 +267,7 @@
 
 fn build_response<'a>(
 results: impl IntoIterator>,
-) -> IoResult<(Vec<&'a HgPath>, StatusResult<'a>)> {
+) -> IoResult<(Vec<&'a HgPath>, DirstateStatus<'a>)> {
 let mut lookup = vec![];
 let mut modified = vec![];
 let mut added = vec![];
@@ -290,7 +290,7 @@
 
 Ok((
 lookup,
-StatusResult {
+DirstateStatus {
 modified,
 added,
 removed,
@@ -305,7 +305,7 @@
 matcher: &'b impl Matcher,
 root_dir: impl AsRef + Sync + Send + Copy,
 options: StatusOptions,
-) -> IoResult<(Vec<&'c HgPath>, StatusResult<'c>)> {
+) -> IoResult<(Vec<&'c HgPath>, DirstateStatus<'c>)> {
 let files = matcher.file_set();
 let mut results = vec![];
 if let Some(files) = files {



To: Alphare, #hg-reviewers, kevincox
Cc: kevincox, mercurial-devel


D7797: rust-nodemap: pure Rust example

2020-02-11 Thread marmoute (Pierre-Yves David)
marmoute updated this revision to Diff 20166.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7797?vs=20026&id=20166

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7797/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7797

AFFECTED FILES
  rust/Cargo.lock
  rust/hg-core/Cargo.toml
  rust/hg-core/examples/nodemap/index.rs
  rust/hg-core/examples/nodemap/main.rs

CHANGE DETAILS

diff --git a/rust/hg-core/examples/nodemap/main.rs b/rust/hg-core/examples/nodemap/main.rs
new file mode 100644
--- /dev/null
+++ b/rust/hg-core/examples/nodemap/main.rs
@@ -0,0 +1,150 @@
+// Copyright 2019-2020 Georges Racinet 
+//
+// This software may be used and distributed according to the terms of the
+// GNU General Public License version 2 or any later version.
+
+extern crate clap;
+extern crate hg;
+extern crate memmap;
+
+use clap::*;
+use hg::revlog::node::*;
+use hg::revlog::nodemap::*;
+use hg::revlog::*;
+use memmap::MmapOptions;
+use rand::Rng;
+use std::fs::File;
+use std::io;
+use std::io::Write;
+use std::path::{Path, PathBuf};
+use std::str::FromStr;
+use std::time::Instant;
+
+mod index;
+use index::Index;
+
+fn mmap_index(repo_path: &Path) -> Index {
+let mut path = PathBuf::from(repo_path);
+path.extend([".hg", "store", "00changelog.i"].iter());
+Index::load_mmap(path)
+}
+
+fn mmap_nodemap(path: &Path) -> NodeTree {
+let file = File::open(path).unwrap();
+let mmap = unsafe { MmapOptions::new().map(&file).unwrap() };
+let len = mmap.len();
+NodeTree::load_bytes(Box::new(mmap), len)
+}
+
+/// Scan the whole index and create the corresponding nodemap file at `path`
+fn create(index: &Index, path: &Path) -> io::Result<()> {
+let mut file = File::create(path)?;
+let start = Instant::now();
+let mut nm = NodeTree::default();
+for rev in 0..index.len() {
+let rev = rev as Revision;
+nm.insert(index, index.node(rev).unwrap(), rev).unwrap();
+}
+eprintln!("Nodemap constructed in RAM in {:?}", start.elapsed());
+file.write(&nm.into_readonly_and_added_bytes().1)?;
+eprintln!("Nodemap written to disk");
+Ok(())
+}
+
+fn query(index: &Index, nm: &NodeTree, prefix: &str) {
+let start = Instant::now();
+let res = nm.find_hex(index, prefix);
+println!("Result found in {:?}: {:?}", start.elapsed(), res);
+}
+
+fn bench(index: &Index, nm: &NodeTree, queries: usize) {
+let len = index.len() as u32;
+let mut rng = rand::thread_rng();
+let nodes: Vec<Node> = (0..queries)
+.map(|_| {
+index
+.node((rng.gen::<u32>() % len) as Revision)
+.unwrap()
+.clone()
+})
+.collect();
+if queries < 10 {
+let nodes_hex: Vec<String> =
+nodes.iter().map(|n| n.encode_hex()).collect();
+println!("Nodes: {:?}", nodes_hex);
+}
+let mut last: Option<Revision> = None;
+let start = Instant::now();
+for node in nodes.iter() {
+last = nm.find_bin(index, node.into()).unwrap();
+}
+let elapsed = start.elapsed();
+println!(
+"Did {} queries in {:?} (mean {:?}), last was {:?} with result {:?}",
+queries,
+elapsed,
+elapsed / (queries as u32),
+nodes.last().unwrap().encode_hex(),
+last
+);
+}
+
+fn main() {
+let matches = App::new("Nodemap pure Rust example")
+.arg(
+Arg::with_name("REPOSITORY")
+.help("Path to the repository, always necessary for its index")
+.required(true),
+)
+.arg(
+Arg::with_name("NODEMAP_FILE")
+.help("Path to the nodemap file, independent of REPOSITORY")
+.required(true),
+)
+.subcommand(
+SubCommand::with_name("create")
+.about("Create NODEMAP_FILE by scanning repository index"),
+)
+.subcommand(
+SubCommand::with_name("query")
+.about("Query NODEMAP_FILE for PREFIX")
+.arg(Arg::with_name("PREFIX").required(true)),
+)
+.subcommand(
+SubCommand::with_name("bench")
+.about(
+"Perform #QUERIES random successful queries on NODEMAP_FILE")
+.arg(Arg::with_name("QUERIES").required(true)),
+)
+.get_matches();
+
+let repo = matches.value_of("REPOSITORY").unwrap();
+let nm_path = matches.value_of("NODEMAP_FILE").unwrap();
+
+let index = mmap_index(&Path::new(repo));
+
+if let Some(_) = matches.subcommand_matches("create") {
+println!("Creating nodemap file {} for repository {}", nm_path, repo);
+create(&index, &Path::new(nm_path)).unwrap();
+return;
+}
+
+let nm = mmap_nodemap(&Path::new(nm_path));
+if let Some(matches) = matches.subcommand_matches("query") {
+let prefix = matches.value_of("PREFIX").unwrap();
+println!(
+"Querying {} in nodemap file {} of repository {}",
+prefix, nm_path, repo
+   

Mercurial API push bookmark

2020-02-11 Thread Pierre Augier

Hi,

I am trying to write a small Mercurial extension. One of its commands would
have to do something like


hg pull
hg up default
hg bookmark master
hg sum
hg push git+ssh://g...@github.com/fluiddyn/fluiddyn
hg bookmark master -d

These commands work fine (with hg-git). However, when I try to write 
this with the Mercurial API, i.e.


@command(b"fluiddyn-push-github", [])
def fluiddyn_push_github(ui, repo, **opts):
    commands.pull(ui, repo)
    commands.update(ui, repo, "default")
    commands.bookmark(ui, repo, "master")
    commands.summary(ui, repo)
    default = dict(repo.ui.configitems(b"paths", untrusted=False))["default"]
    if "foss.heptapod.net" not in default:
        ui.write("default points to a wrong path")
        return
    package_name = os.path.split(default)[1]
    path_github = os.path.join(github_base, package_name)
    commands.push(ui, repo, dest=path_github, bookmark="master")
    commands.bookmark(ui, repo, "master", delete=True)

I get

pulling from ssh://h...@foss.heptapod.net/fluiddyn/fluiddyn
searching for changes
no changes found
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
(leaving bookmark master)
parent: 551:177f8acf6d42 tip
 Try to understand the problem with SKIP_SHTNS
branch: default
bookmarks: *master
commit: (clean)
update: (current)
pushing to git+ssh://g...@github.com/fluiddyn/fluiddyn
searching for changes
abort: revision  cannot be pushed since it doesn't have a bookmark

I don't understand why Mercurial tells me something about revision 
 as if it was trying to push this revision when


commands.push(ui, repo, dest=path_github, bookmark="master")

is called.

Help would be greatly appreciated.

--
Pierre Augier - CR CNRS http://www.legi.grenoble-inp.fr
LEGI (UMR 5519) Laboratoire des Ecoulements Geophysiques et Industriels
BP53, 38041 Grenoble Cedex, France    tel:+33.4.56.52.86.16



D7894: nodemap: introduce an option to use mmap to read the nodemap mapping

2020-02-11 Thread marmoute (Pierre-Yves David)
Herald added a reviewer: indygreg.
marmoute updated this revision to Diff 20165.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7894?vs=20018&id=20165

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7894/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7894

AFFECTED FILES
  mercurial/configitems.py
  mercurial/debugcommands.py
  mercurial/localrepo.py
  mercurial/revlog.py
  mercurial/revlogutils/nodemap.py
  tests/test-persistent-nodemap.t

CHANGE DETAILS

diff --git a/tests/test-persistent-nodemap.t b/tests/test-persistent-nodemap.t
--- a/tests/test-persistent-nodemap.t
+++ b/tests/test-persistent-nodemap.t
@@ -84,3 +84,37 @@
   $ hg debugnodemap --check
   revision in index:   5002
   revision in nodemap: 5002
+
+Test code path without mmap
+---
+
+  $ echo bar > bar
+  $ hg add bar
+  $ hg ci -m 'bar' --config experimental.exp-persistent-nodemap.mmap=no
+
+  $ hg debugnodemap --check --config experimental.exp-persistent-nodemap.mmap=yes
+  revision in index:   5003
+  revision in nodemap: 5003
+  $ hg debugnodemap --check --config experimental.exp-persistent-nodemap.mmap=no
+  revision in index:   5003
+  revision in nodemap: 5003
+
+
+#if pure
+  $ hg debugnodemap --metadata
+  uid:  (glob)
+  tip-rev: 5002
+  data-length: 123328
+  data-unused: 384
+  $ f --sha256 .hg/store/00changelog-*.nd --size
+  .hg/store/00changelog-.nd: size=123328, sha256=10d26e9776b6596af0f89143a54eba8cc581e929c38242a02a7b0760698c6c70 (glob)
+
+#else
+  $ hg debugnodemap --metadata
+  uid:  (glob)
+  tip-rev: 5002
+  data-length: 122944
+  data-unused: 0
+  $ f --sha256 .hg/store/00changelog-*.nd --size
+  .hg/store/00changelog-.nd: size=122944, sha256=755976b22b64ab680401b45395953504e64e7fa8c31ac570f58dee21e15f9bc0 (glob)
+#endif
diff --git a/mercurial/revlogutils/nodemap.py b/mercurial/revlogutils/nodemap.py
--- a/mercurial/revlogutils/nodemap.py
+++ b/mercurial/revlogutils/nodemap.py
@@ -8,6 +8,7 @@
 
 from __future__ import absolute_import
 
+import errno
 import os
 import re
 import struct
@@ -45,11 +46,18 @@
 docket.data_unused = data_unused
 
 filename = _rawdata_filepath(revlog, docket)
-data = revlog.opener.tryread(filename)
+use_mmap = revlog.opener.options.get("exp-persistent-nodemap.mmap")
+try:
+with revlog.opener(filename) as fd:
+if use_mmap:
+data = util.buffer(util.mmapread(fd, data_length))
+else:
+data = fd.read(data_length)
+except OSError as e:
+if e.errno != errno.ENOENT:
+raise
 if len(data) < data_length:
 return None
-elif len(data) > data_length:
-data = data[:data_length]
 return docket, data
 
 
@@ -81,6 +89,8 @@
 
 can_incremental = util.safehasattr(revlog.index, "nodemap_data_incremental")
 ondisk_docket = revlog._nodemap_docket
+feed_data = util.safehasattr(revlog.index, "update_nodemap_data")
+use_mmap = revlog.opener.options.get("exp-persistent-nodemap.mmap")
 
 data = None
 # first attemp an incremental update of the data
@@ -97,12 +107,18 @@
 datafile = _rawdata_filepath(revlog, target_docket)
 # EXP-TODO: if this is a cache, this should use a cache vfs, not a
 # store vfs
+new_length = target_docket.data_length + len(data)
 with revlog.opener(datafile, b'r+') as fd:
 fd.seek(target_docket.data_length)
 fd.write(data)
-fd.seek(0)
-new_data = fd.read(target_docket.data_length + len(data))
-target_docket.data_length += len(data)
+if feed_data:
+if use_mmap:
+fd.seek(0)
+new_data = fd.read(new_length)
+else:
+fd.flush()
+new_data = util.buffer(util.mmapread(fd, new_length))
+target_docket.data_length = new_length
 target_docket.data_unused += data_changed_count
 
 if data is None:
@@ -115,9 +131,14 @@
 data = persistent_data(revlog.index)
 # EXP-TODO: if this is a cache, this should use a cache vfs, not a
 # store vfs
-new_data = data
-with revlog.opener(datafile, b'w') as fd:
+with revlog.opener(datafile, b'w+') as fd:
 fd.write(data)
+if feed_data:
+if use_mmap:
+new_data = data
+else:
+fd.flush()
+new_data = util.buffer(util.mmapread(fd, len(data)))
 target_docket.data_length = len(data)
 target_docket.tip_rev = revlog.tiprev()
 # EXP-TODO: if this is a cache, this should use a cache vfs, not a
@@ -125,7 +146,7 @@
 with revlog.opener(revlog.nodemap_file, b'w', atomictemp=True) as fp:

[Bug 6268] New: hgweb with py3 error when enabling demandimport

2020-02-11 Thread mercurial-bugs
https://bz.mercurial-scm.org/show_bug.cgi?id=6268

Bug ID: 6268
   Summary: hgweb with py3 error when enabling demandimport
   Product: Mercurial
   Version: 5.3
  Hardware: PC
OS: Linux
Status: UNCONFIRMED
  Severity: bug
  Priority: wish
 Component: hgweb
  Assignee: bugzi...@mercurial-scm.org
  Reporter: philippe.pep...@logilab.fr
CC: mercurial-devel@mercurial-scm.org
Python Version: ---

Hi,

We have an issue when upgrading our mercurial hgweb server from 5.2.2 to 5.3

Here is how to reproduce the issue with gunicorn wsgi server:

# cat << EOF > hgweb.config
[paths]
/ = *
EOF
# cat << EOF > hgweb.py
from mercurial import demandimport
demandimport.enable()
from mercurial.hgweb import hgweb
application = hgweb(b"/tmp/hgweb.config")
EOF
# gunicorn3 hgweb:application
[2020-02-11 15:03:17 +0100] [23266] [INFO] Starting gunicorn 19.9.0
[2020-02-11 15:03:17 +0100] [23266] [INFO] Listening at: http://127.0.0.1:8000 (23266)
[2020-02-11 15:03:17 +0100] [23266] [INFO] Using worker: sync
[2020-02-11 15:03:17 +0100] [23269] [INFO] Booting worker with pid: 23269
[2020-02-11 15:03:17 +0100] [23269] [ERROR] Exception in worker process
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/gunicorn/arbiter.py", line 583, in spawn_worker
    worker.init_process()
  File "/usr/lib/python3/dist-packages/gunicorn/workers/base.py", line 129, in init_process
    self.load_wsgi()
  File "/usr/lib/python3/dist-packages/gunicorn/workers/base.py", line 138, in load_wsgi
    self.wsgi = self.app.wsgi()
  File "/usr/lib/python3/dist-packages/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
  File "/usr/lib/python3/dist-packages/gunicorn/app/wsgiapp.py", line 52, in load
    return self.load_wsgiapp()
  File "/usr/lib/python3/dist-packages/gunicorn/app/wsgiapp.py", line 41, in load_wsgiapp
    return util.import_app(self.app_uri)
  File "/usr/lib/python3/dist-packages/gunicorn/util.py", line 375, in import_app
    __import__(module)
  File "/tmp/hgweb.py", line 4, in <module>
    application = hgweb(b"/tmp/hgweb.config")
  File "/usr/lib/python3/dist-packages/mercurial/hgweb/__init__.py", line 50, in hgweb
    return hgwebdir_mod.hgwebdir(config, baseui=baseui)
  File "/usr/lib/python3.7/importlib/util.py", line 245, in __getattribute__
    self.__spec__.loader.exec_module(self)
  File "/usr/lib/python3/dist-packages/mercurial/hgweb/hgwebdir_mod.py", line 17, in <module>
    from .common import (
ImportError: cannot import name 'ErrorResponse' from 'mercurial.hgweb.common' (/usr/lib/python3/dist-packages/mercurial/hgweb/common.py)
[2020-02-11 15:03:17 +0100] [23269] [INFO] Worker exiting (pid: 23269)
[2020-02-11 15:03:17 +0100] [23266] [INFO] Shutting down: Master
[2020-02-11 15:03:17 +0100] [23266] [INFO] Reason: Worker failed to boot.
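
The traceback above passes through `__getattribute__` in `importlib/util.py`, which is where the standard library's lazy loader runs a deferred module body on first attribute access. That would explain why the ImportError only surfaces at `hgwebdir_mod.hgwebdir(...)` and not at the `from mercurial.hgweb import hgweb` line. The sketch below illustrates that deferral with the standard library only. It is not Mercurial's demandimport code, `lazy_import` is a hypothetical helper name, and `colorsys` is an arbitrary stand-in module.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return module `name` with execution of its body deferred.

    Hypothetical helper for illustration only; it follows the lazy-import
    recipe built on importlib.util.LazyLoader.
    """
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # This only arms the lazy module; the real module body has not run yet.
    loader.exec_module(module)
    return module


mod = lazy_import("colorsys")
before = type(mod).__name__  # still the lazy placeholder class

# The first attribute access triggers the real import; any ImportError
# raised by the module body would be raised here, not at lazy_import().
mod.rgb_to_hsv(1.0, 0.0, 0.0)
after = type(mod).__name__  # now a regular module

print(before, after)
```

With deferral like this, an error inside the imported module is reported at the first point of use, which matches the frame order in the traceback.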



D7925: rust-matchers: add `IgnoreMatcher`

2020-02-11 Thread Raphaël Gomès
Alphare updated this revision to Diff 20163.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7925?vs=20043&id=20163

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7925/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7925

AFFECTED FILES
  rust/hg-core/src/matchers.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/matchers.rs b/rust/hg-core/src/matchers.rs
--- a/rust/hg-core/src/matchers.rs
+++ b/rust/hg-core/src/matchers.rs
@@ -10,14 +10,25 @@
 #[cfg(feature = "with-re2")]
 use crate::re2::Re2;
 use crate::{
-filepatterns::{build_single_regex, PatternResult},
-utils::hg_path::{HgPath, HgPathBuf},
-DirsMultiset, DirstateMapError, IgnorePattern, PatternError,
+dirstate::dirs_multiset::DirsChildrenMultiset,
+filepatterns::{
+build_single_regex, filter_subincludes, get_patterns_from_file,
+PatternFileWarning, PatternResult, SubInclude,
+},
+utils::{
+files::find_dirs,
+hg_path::{HgPath, HgPathBuf},
+Escaped,
+},
+DirsMultiset, DirstateMapError, FastHashMap, IgnorePattern, PatternError,
 PatternSyntax,
 };
+
 use std::collections::HashSet;
+use std::fmt::{Display, Error, Formatter};
 use std::iter::FromIterator;
 use std::ops::Deref;
+use std::path::Path;
 
 #[derive(Debug, PartialEq)]
 pub enum VisitChildrenSet<'a> {
@@ -223,6 +234,88 @@
 }
 }
 
+/// Matches files that are included in the ignore rules.
+///
+#[cfg_attr(
+feature = "with-re2",
+doc = r##"
+```
+use hg::{
+matchers::{IncludeMatcher, Matcher},
+IgnorePattern,
+PatternSyntax,
+utils::hg_path::HgPath
+};
+use std::path::Path;
+///
+let ignore_patterns =
+vec![IgnorePattern::new(PatternSyntax::RootGlob, b"this*", Path::new(""))];
+let (matcher, _) = IncludeMatcher::new(ignore_patterns, "").unwrap();
+///
+assert_eq!(matcher.matches(HgPath::new(b"testing")), false);
+assert_eq!(matcher.matches(HgPath::new(b"this should work")), true);
+assert_eq!(matcher.matches(HgPath::new(b"this also")), true);
+assert_eq!(matcher.matches(HgPath::new(b"but not this")), false);
+```
+"##
+)]
+pub struct IncludeMatcher<'a> {
+patterns: Vec<u8>,
+match_fn: Box<dyn for<'r> Fn(&'r HgPath) -> bool + 'a + Sync>,
+/// Whether all the patterns match a prefix (i.e. recursively)
+prefix: bool,
+roots: HashSet<HgPathBuf>,
+dirs: HashSet<HgPathBuf>,
+parents: HashSet<HgPathBuf>,
+}
+
+impl<'a> Matcher for IncludeMatcher<'a> {
+fn file_set(&self) -> Option<&HashSet<&HgPath>> {
+None
+}
+
+fn exact_match(&self, _filename: impl AsRef<HgPath>) -> bool {
+false
+}
+
+fn matches(&self, filename: impl AsRef<HgPath>) -> bool {
+(self.match_fn)(filename.as_ref())
+}
+
+fn visit_children_set(
+&self,
+directory: impl AsRef<HgPath>,
+) -> VisitChildrenSet {
+let dir = directory.as_ref();
+if self.prefix && self.roots.contains(dir) {
+return VisitChildrenSet::Recursive;
+}
+if self.roots.contains(HgPath::new(b""))
+|| self.roots.contains(dir)
+|| self.dirs.contains(dir)
+|| find_dirs(dir).any(|parent_dir| self.roots.contains(parent_dir))
+{
+return VisitChildrenSet::This;
+}
+
+if self.parents.contains(directory.as_ref()) {
+let multiset = self.get_all_parents_children();
+if let Some(children) = multiset.get(dir) {
+return VisitChildrenSet::Set(children.to_owned());
+}
+}
+VisitChildrenSet::Empty
+}
+
+fn matches_everything(&self) -> bool {
+false
+}
+
+fn is_exact(&self) -> bool {
+false
+}
+}
+
 #[cfg(feature = "with-re2")]
 /// Returns a function that matches an `HgPath` against the given regex
 /// pattern.
@@ -361,6 +454,175 @@
 })
 }
 
+/// Returns a function that checks whether a given file (in the general sense)
+/// should be matched.
+fn build_match<'a, 'b>(
+ignore_patterns: &'a [IgnorePattern],
+root_dir: impl AsRef<Path>,
+) -> PatternResult<(
+Vec<u8>,
+Box<dyn Fn(&HgPath) -> bool + 'b + Sync>,
+Vec<PatternFileWarning>,
+)> {
+let mut match_funcs: Vec<Box<dyn Fn(&HgPath) -> bool + Sync>> = vec![];
+// For debugging and printing
+let mut patterns = vec![];
+let mut all_warnings = vec![];
+
+let (subincludes, ignore_patterns) =
+filter_subincludes(ignore_patterns, root_dir)?;
+
+if !subincludes.is_empty() {
+// Build prefix-based matcher functions for subincludes
+let mut submatchers = FastHashMap::default();
+let mut prefixes = vec![];
+
+for SubInclude { prefix, root, path } in subincludes.into_iter() {
+let (match_fn, warnings) = get_ignore_function(&[path], root)?;
+all_warnings.extend(warnings);
+prefixes.push(prefix.to_owned());
+submatchers.insert(prefix.to_owned(), match_fn);
+}
+
+let match_subinclude = move |filename: | {
+for prefix in prefixes.iter() {
+if let Some(rel) = 

D8086: rust-status: refactor options into a `StatusOptions` struct

2020-02-11 Thread Raphaël Gomès
Alphare updated this revision to Diff 20164.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D8086?vs=20155&id=20164

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D8086/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D8086

AFFECTED FILES
  rust/hg-core/src/dirstate/status.rs
  rust/hg-core/src/lib.rs
  rust/hg-core/src/matchers.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/matchers.rs b/rust/hg-core/src/matchers.rs
--- a/rust/hg-core/src/matchers.rs
+++ b/rust/hg-core/src/matchers.rs
@@ -669,7 +669,11 @@
 
 assert_eq!(
 roots_dirs_and_parents().unwrap(),
-RootsDirsAndParents {roots, dirs, parents}
+RootsDirsAndParents {
+roots,
+dirs,
+parents
+}
 );
 }
 
diff --git a/rust/hg-core/src/lib.rs b/rust/hg-core/src/lib.rs
--- a/rust/hg-core/src/lib.rs
+++ b/rust/hg-core/src/lib.rs
@@ -13,7 +13,7 @@
 dirs_multiset::{DirsMultiset, DirsMultisetIter},
 dirstate_map::DirstateMap,
 parsers::{pack_dirstate, parse_dirstate, PARENT_SIZE},
-status::{status, StatusResult},
+status::{status, StatusOptions, StatusResult},
 CopyMap, CopyMapIter, DirstateEntry, DirstateParents, EntryState,
 StateMap, StateMapIter,
 };
diff --git a/rust/hg-core/src/dirstate/status.rs b/rust/hg-core/src/dirstate/status.rs
--- a/rust/hg-core/src/dirstate/status.rs
+++ b/rust/hg-core/src/dirstate/status.rs
@@ -83,9 +83,7 @@
 entry: DirstateEntry,
 metadata: HgMetadata,
 copy_map: ,
-check_exec: bool,
-list_clean: bool,
-last_normal_time: i64,
+options: StatusOptions,
 ) -> Dispatch {
 let DirstateEntry {
 state,
@@ -105,7 +103,7 @@
 EntryState::Normal => {
 let size_changed = mod_compare(size, st_size as i32);
 let mode_changed =
-(mode ^ st_mode as i32) & 0o100 != 0o000 && check_exec;
+(mode ^ st_mode as i32) & 0o100 != 0o000 && options.check_exec;
 let metadata_changed = size >= 0 && (size_changed || mode_changed);
 let other_parent = size == SIZE_FROM_OTHER_PARENT;
 if metadata_changed
@@ -115,14 +113,14 @@
 Dispatch::Modified
 } else if mod_compare(mtime, st_mtime as i32) {
 Dispatch::Unsure
-} else if st_mtime == last_normal_time {
+} else if st_mtime == options.last_normal_time {
 // the file may have just been marked as normal and
 // it may have changed in the same second without
 // changing its size. This can happen if we quickly
 // do multiple commits. Force lookup, so we don't
 // miss such a racy file change.
 Dispatch::Unsure
-} else if list_clean {
+} else if options.list_clean {
 Dispatch::Clean
 } else {
 Dispatch::Unknown
@@ -155,9 +153,7 @@
 files: &'a HashSet<>,
 dmap: &'a DirstateMap,
 root_dir: impl AsRef + Sync + Send,
-check_exec: bool,
-list_clean: bool,
-last_normal_time: i64,
+options: StatusOptions,
 ) -> impl ParallelIterator> {
 files.par_iter().filter_map(move |filename| {
 // TODO normalization
@@ -181,9 +177,7 @@
 *entry,
 HgMetadata::from_metadata(meta),
 _map,
-check_exec,
-list_clean,
-last_normal_time,
+options,
 ),
 )));
 }
@@ -206,14 +200,23 @@
 })
 }
 
+#[derive(Debug, Copy, Clone)]
+pub struct StatusOptions {
+/// Remember the most recent modification timeslot for status, to make
+/// sure we won't miss future size-preserving file content modifications
+/// that happen within the same timeslot.
+pub last_normal_time: i64,
+/// Whether we are on a filesystem with UNIX-like exec flags
+pub check_exec: bool,
+pub list_clean: bool,
+}
+
 /// Stat all entries in the `DirstateMap` and mark them for dispatch into
 /// the relevant collections.
 fn stat_dmap_entries(
 dmap: ,
 root_dir: impl AsRef + Sync + Send,
-check_exec: bool,
-list_clean: bool,
-last_normal_time: i64,
+options: StatusOptions,
 ) -> impl ParallelIterator> {
 dmap.par_iter().map(move |(filename, entry)| {
 let filename:  = filename;
@@ -234,9 +237,7 @@
 *entry,
 HgMetadata::from_metadata(m),
 _map,
-check_exec,
-list_clean,
-last_normal_time,
+options,
 ),
 )),
 Err(ref e)
@@ -303,31 

[PATCH STABLE] chgserver: spawn new process if schemes change

2020-02-11 Thread Yuya Nishihara
# HG changeset patch
# User Yuya Nishihara 
# Date 1581418436 -32400
#  Tue Feb 11 19:53:56 2020 +0900
# Branch stable
# Node ID 3dcf73d0733d87e7a231809d23da4883f4ab3711
# Parent  84d98fa814a8bab0dd94a133187dc9d42a5e433f
chgserver: spawn new process if schemes change

The schemes extension updates hg.schemes table. It's technically possible
for hg.repository() to look for e.g. ui.schemes instead of depending on
module-local table, but I don't think the change would make much sense
since [schemes] is usually specified in ~/.hgrc and thus it can be considered
static data.
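
To illustrate why listing `schemes` matters: chg decides whether a running server can be reused by hashing selected config sections. The sketch below is a simplified model of that idea, not the actual `chgserver.py` code. The names `HASHED_SECTIONS`, `config_hash` and `needs_new_server` are hypothetical, and the configuration is modeled as a plain dict.

```python
import hashlib

# Illustrative subset of the sections visible in the diff below; the patch
# adds 'schemes' to the real list.
HASHED_SECTIONS = ["eol", "extdiff", "extensions", "schemes"]


def config_hash(config, sections=HASHED_SECTIONS):
    """Return a short digest of the selected sections of a config mapping."""
    h = hashlib.sha1()
    for section in sections:
        # Sort items so the digest does not depend on insertion order.
        items = sorted(config.get(section, {}).items())
        h.update(repr((section, items)).encode("utf-8"))
    return h.hexdigest()[:12]


def needs_new_server(old_config, new_config, sections=HASHED_SECTIONS):
    """True when a hashed section differs, so a fresh server is required."""
    return config_hash(old_config, sections) != config_hash(new_config, sections)
```

With `schemes` left out of the section list, editing `[schemes]` leaves the digest unchanged, so a server holding the stale table would be reused; that is the situation the new test exercises.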

diff --git a/mercurial/chgserver.py b/mercurial/chgserver.py
--- a/mercurial/chgserver.py
+++ b/mercurial/chgserver.py
@@ -83,6 +83,7 @@ def _hashlist(items):
 b'eol',  # uses setconfig('eol', ...)
 b'extdiff',  # uisetup will register new commands
 b'extensions',
+b'schemes',  # extsetup will update global hg.schemes
 ]
 
 _configsectionitems = [
diff --git a/tests/test-chg.t b/tests/test-chg.t
--- a/tests/test-chg.t
+++ b/tests/test-chg.t
@@ -245,6 +245,54 @@ is different when using py3):
   /MM/DD HH:MM:SS (PID)> worker process exited (pid=...)
   /MM/DD HH:MM:SS (PID)> $TESTTMP/extreload/chgsock/server-... is not owned, exiting.
 
+global data mutated by schemes
+------------------------------
+
+  $ hg init schemes
+  $ cd schemes
+
+initial state
+
+  $ cat > .hg/hgrc <<'EOF'
+  > [extensions]
+  > schemes =
+  > [schemes]
+  > foo = https://foo.example.org/
+  > EOF
+  $ hg debugexpandscheme foo://expanded
+  https://foo.example.org/expanded
+  $ hg debugexpandscheme bar://unexpanded
+  bar://unexpanded
+
+add bar
+
+  $ cat > .hg/hgrc <<'EOF'
+  > [extensions]
+  > schemes =
+  > [schemes]
+  > foo = https://foo.example.org/
+  > bar = https://bar.example.org/
+  > EOF
+  $ hg debugexpandscheme foo://expanded
+  https://foo.example.org/expanded
+  $ hg debugexpandscheme bar://expanded
+  https://bar.example.org/expanded
+
+remove foo
+
+  $ cat > .hg/hgrc <<'EOF'
+  > [extensions]
+  > schemes =
+  > [schemes]
+  > bar = https://bar.example.org/
+  > EOF
+  $ hg debugexpandscheme foo://unexpanded
+  foo://unexpanded
+  $ hg debugexpandscheme bar://expanded
+  https://bar.example.org/expanded
+
+  $ cd ..
+
 repository cache
 
 
@@ -317,6 +365,8 @@ shut down servers and restore environmen
 check server log:
 
   $ cat log/server.log | filterlog
+  /MM/DD HH:MM:SS (PID)> worker process exited (pid=...)
+  /MM/DD HH:MM:SS (PID)> worker process exited (pid=...)
   /MM/DD HH:MM:SS (PID)> init cached
   /MM/DD HH:MM:SS (PID)> id -R cached
   /MM/DD HH:MM:SS (PID)> loaded repo into cache: $TESTTMP/cached (in  ...s)
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7931: rust-status: use bare hg status fastpath from Python

2020-02-11 Thread Raphaël Gomès
Alphare updated this revision to Diff 20167.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7931?vs=20159&id=20167

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7931/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7931

AFFECTED FILES
  mercurial/dirstate.py
  mercurial/match.py
  tests/test-subrepo-deep-nested-change.t

CHANGE DETAILS

diff --git a/tests/test-subrepo-deep-nested-change.t b/tests/test-subrepo-deep-nested-change.t
--- a/tests/test-subrepo-deep-nested-change.t
+++ b/tests/test-subrepo-deep-nested-change.t
@@ -355,6 +355,11 @@
   R sub1/sub2/folder/test.txt
   ! sub1/.hgsub
   ? sub1/x.hgsub
+  $ hg status -R sub1
+  warning: subrepo spec file 'sub1/.hgsub' not found
+  R .hgsubstate
+  ! .hgsub
+  ? x.hgsub
   $ mv sub1/x.hgsub sub1/.hgsub
   $ hg update -Cq
   $ touch sub1/foo
diff --git a/mercurial/match.py b/mercurial/match.py
--- a/mercurial/match.py
+++ b/mercurial/match.py
@@ -666,7 +666,10 @@
 class includematcher(basematcher):
 def __init__(self, root, kindpats, badfn=None):
 super(includematcher, self).__init__(badfn)
-
+if rustmod is not None:
+# We need to pass the patterns to Rust because they can contain
+# patterns from the user interface
+self._kindpats = kindpats
 self._pats, self.matchfn = _buildmatch(kindpats, b'(?:/|$)', root)
 self._prefix = _prefix(kindpats)
 roots, dirs, parents = _rootsdirsandparents(kindpats)
diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py
--- a/mercurial/dirstate.py
+++ b/mercurial/dirstate.py
@@ -27,6 +27,7 @@
 policy,
 pycompat,
 scmutil,
+sparse,
 txnutil,
 util,
 )
@@ -1083,7 +1084,7 @@
 results[next(iv)] = st
 return results
 
-def _rust_status(self, matcher, list_clean):
+def _rust_status(self, matcher, list_clean, list_ignored, list_unknown):
 # Force Rayon (Rust parallelism library) to respect the number of
 # workers. This is a temporary workaround until Rust code knows
 # how to read the config file.
@@ -1101,16 +1102,45 @@
 added,
 removed,
 deleted,
+clean,
+ignored,
 unknown,
-clean,
+warnings,
+bad,
 ) = rustmod.status(
 self._map._rustmap,
 matcher,
 self._rootdir,
-bool(list_clean),
+self._ignorefiles(),
+self._checkexec,
 self._lastnormaltime,
-self._checkexec,
+bool(list_clean),
+bool(list_ignored),
+bool(list_unknown),
 )
+if self._ui.warn:
+for item in warnings:
+if isinstance(item, tuple):
+file_path, syntax = item
+msg = _(b"%s: ignoring invalid syntax '%s'\n") % (
+file_path,
+syntax,
+)
+self._ui.warn(msg)
+else:
+msg = _(b"skipping unreadable pattern file '%s': %s\n")
+self._ui.warn(
+msg
+% (
+pathutil.canonpath(
+self._rootdir, self._rootdir, item
+),
+b"No such file or directory",
+)
+)
+
+for (fn, message) in bad:
+matcher.bad(fn, encoding.strtolocal(message))
 
 status = scmutil.status(
 modified=modified,
@@ -1118,7 +1148,7 @@
 removed=removed,
 deleted=deleted,
 unknown=unknown,
-ignored=[],
+ignored=ignored,
 clean=clean,
 )
 return (lookup, status)
@@ -1148,26 +1178,35 @@
 
 use_rust = True
 
-allowed_matchers = (matchmod.alwaysmatcher, matchmod.exactmatcher)
+allowed_matchers = (
+matchmod.alwaysmatcher,
+matchmod.exactmatcher,
+matchmod.includematcher,
+)
 
 if rustmod is None:
 use_rust = False
+elif self._checkcase:
+# Case-insensitive filesystems are not handled yet
+use_rust = False
 elif subrepos:
 use_rust = False
-elif bool(listunknown):
-# Pathauditor does not exist yet in Rust, unknown files
-# can't be trusted.
+elif sparse.enabled:
 use_rust = False
-elif self._ignorefiles() and listignored:
-# Rust has no ignore mechanism yet, so don't use Rust for
-# commands that need ignore.
+elif match.traversedir is not None:
 use_rust = False
 elif not isinstance(match, allowed_matchers):
 # Matchers have 

D7972: recover: don't verify by default

2020-02-11 Thread valentin.gatienbaron (Valentin Gatien-Baron)
valentin.gatienbaron added a comment.


  > I _think_ it's just paranoia. As long as the bundle wasn't woefully
  > corrupt, it shouldn't be a problem. I _think_ if we set some of the
  > [server]-section bundle validation options (which should be cheap enough)
  > we could ditch this completely safely.
  > As it stands, I'm fine with this patch if someone else has the confidence
  > to push it.

  How does the validity of an input bundle affect recover? I would have
  thought it's only the validity of the journal that matters, and that's
  created entirely based on local data (file lengths or contents before
  writes).

  Now I suppose the journal itself may well be truncated or not written at
  all when running out of disk space or in other error situations where the
  OS does the writes out of order.
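
The point about the journal depending only on local data can be sketched as follows. This is a deliberately simplified model, assuming an append-only store and an in-memory journal; the function names are hypothetical and Mercurial's real transaction code is not reproduced here.

```python
import os
import tempfile


def journaled_append(journal, path, data):
    """Record the file's current length, then append data to it."""
    offset = os.path.getsize(path) if os.path.exists(path) else 0
    # The journal entry uses only local state: the length before the write.
    journal.append((path, offset))
    with open(path, "ab") as f:
        f.write(data)


def recover(journal):
    """Undo appends by truncating each file back to its recorded length."""
    for path, offset in reversed(journal):
        with open(path, "r+b") as f:
            f.truncate(offset)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "data.i")
    journal = []
    journaled_append(journal, path, b"incoming data, possibly from a bad bundle")
    recover(journal)
    print(os.path.getsize(path))
```

In this model the content that was appended never influences what `recover` does; only a missing or truncated journal entry could make the rollback incomplete.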

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7972/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7972

To: valentin.gatienbaron, #hg-reviewers, marmoute, durin42
Cc: durin42, marmoute, pulkit, mjpieters, mercurial-devel


D7631: absorb: allowing committed changes to be absorbed into their ancestors

2020-02-11 Thread martinvonz (Martin von Zweigbergk)
This revision now requires changes to proceed.
martinvonz added a comment.
martinvonz requested changes to this revision.


  Sorry about the delay in responding. Please remember to rebase the series to 
the latest @ on hg-committed (the release notes are otherwise likely to 
conflict).

INLINE COMMENTS

> rdamazio wrote in absorb.py:993
> See the child commit (D7630), which
> adds the "evolve" operation.
> 
> Because of the invariant about parent phases, checking that the revision 
> being absorbed is not public also ensures that everything it's absorbing into 
> is not public. Is that what you were looking for? If the commit A being 
> absorbed is a draft and its parent is public, then absorb just won't find 
> anywhere to absorb the lines and will leave everything in A.
> 
> About setting obsmarkers from the absorbed commit into the targets, while 
> that's technically correct, I suspect it'll become a hard-to-navigate mess 
> which adds very little. Do you want me to add that?

I think we should at least have a TODO about adding them.

By the way, without the next patch's auto-evolve feature, I'm not sure we 
should add such markers. I think they would trick `hg evolve` into moving any 
descendant commits onto the topmost commit that was absorbed into, but that's 
probably not what the user wants (they probably want child commits to be moved 
onto the absorbed commit's parent).

> 5.3:4-7
> + * The `absorb` extension can now absorb existing changesets, in addition to
> +   the working directory changes, which continues to be the default unless
> +   `--source`/`-s` is specified.
>  

please revert (sorry about this annoying effect of copy detection)

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7631/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7631

To: rdamazio, #hg-reviewers, martinvonz
Cc: mharbison72, martinvonz, pulkit, quark, mercurial-devel


D7630: absorb: make the absorbed changeset be automatically "evolved"

2020-02-11 Thread martinvonz (Martin von Zweigbergk)
martinvonz added a comment.


  In D7630#118997, @rdamazio wrote:
  
  >> In D7630#117270, @marmoute wrote:
  >>
  >>> In D7630#115320, @pulkit wrote:
  >>>
  >> This results in an empty commit which is not similar to what rebase or 
evolve will generally result in after `D7631` unless `ui.allowemptycommit=True` 
is set. I think good behavior is to obsolete the absorbed changeset in favour 
of either its parent or one of the revs in which it was absorbed.
  >
  > I made a related comment on the parent patch before I realized that 
this patch adds obsmarker handling. My suggestion there was to mark all the 
commits that got absorbed into as successors, and if there's anything left in 
the absorbed commit, that would be yet another successor. Would that work?
  
   Yep, that sounds good.
  >>
  >> I'm fine with doing this, but is there an efficient way to detect that it 
became empty?
  >
  > And by "this" I meant I'm fine with making it disappear if allowemptycommit 
is False. I don't fully understand how markers help accomplish that.
  
  When you try to create an empty commit, you'll get a `None` back for the 
nodeid (from `repo.commitctx()`, IIRC), so check for that.
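
A minimal sketch of that check, combined with the successor idea discussed above. `commit_leftover` is a hypothetical callable standing in for whatever creates the rewritten changeset; the only behavior assumed from the discussion is that it returns `None` when nothing is left to commit. The real `repo.commitctx()` signature is not reproduced here.

```python
def successors_after_absorb(targets, commit_leftover):
    """Return the successor nodes to record for an absorbed changeset.

    targets: nodes of the commits that received absorbed lines.
    commit_leftover: hypothetical callable that commits whatever remains and
    returns the new node, or None when the result would be empty.
    """
    successors = list(targets)
    leftover = commit_leftover()
    if leftover is not None:
        # Something remained in the absorbed commit: it is one more successor.
        successors.append(leftover)
    return successors
```

When the leftover commit is empty, the absorbed changeset ends up with only the absorb targets as successors, which is one way to make it disappear without creating an empty commit.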

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7630/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7630

To: rdamazio, #hg-reviewers
Cc: marmoute, pulkit, martinvonz, mercurial-devel


Re: Odd httpheader=1024 required for Phabricator

2020-02-11 Thread Makarius
On 11/02/2020 03:20, Augie Fackler wrote:
> I guess I'm not sure what's going on here. 
> https://www.mercurial-scm.org/repo/hg/rev/5cda0ce05c42 is the revision that 
> introduced that, but I'm not sure why you need to do anything /to 
> phabricator/ unless it's trying (poorly) to pretend to be an hg server. Is it 
> not just blindly proxying the hg protocol from your hg binary?

Thanks. This hint has helped me to look in the right spots.

Phabricator essentially starts a command-line process to learn about the
server capabilities like this:

  echo capabilities | hg -R .../test-repo serve --stdio

It uses the result for its own http communication. In the output above, there
used to be httpheader=1024 until 4.0.2, but it has disappeared in 4.1.
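
To make the difference concrete, here is a small sketch of how a client might parse such a capabilities line and look for `httpheader`. The two sample strings are illustrative only, not captured server output, and `parse_capabilities` is a hypothetical helper; any quoting of capability values in real output is not handled.

```python
def parse_capabilities(line):
    """Split a space-separated capabilities string into a name -> value dict."""
    caps = {}
    for token in line.split():
        name, sep, value = token.partition("=")
        # Capabilities without '=' carry no value.
        caps[name] = value if sep else None
    return caps


# Illustrative strings only (assumed shape, abbreviated).
UP_TO_4_0_2 = "lookup branchmap pushkey known getbundle httpheader=1024"
SINCE_4_1 = "lookup branchmap pushkey known getbundle"

if __name__ == "__main__":
    for label, line in (("<= 4.0.2", UP_TO_4_0_2), (">= 4.1", SINCE_4_1)):
        caps = parse_capabilities(line)
        print(label, "httpheader" in caps, caps.get("httpheader"))
```

A proxy that forwards the stdio capabilities to HTTP clients would therefore stop advertising `httpheader` after 4.1 unless it adds the value back itself.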


See also the Phabricator sources:

https://github.com/phacility/phabricator/blob/master/src/applications/diffusion/controller/DiffusionServeController.php#L781

https://github.com/phacility/phabricator/blob/master/src/applications/diffusion/controller/DiffusionServeController.php#L838

https://github.com/phacility/phabricator/blob/master/src/applications/diffusion/protocol/DiffusionMercurialWireProtocol.php#L107

The latter contains further comments about slightly odd censorship of the
capabilities. This is also the place where the workaround is inserted, see
again https://isabelle-dev.sketis.net/T8


A bisection over the hg repository yields the following relevant changeset:

changeset:   30563:e118233172fe
user:        Gregory Szorc
date:        Mon Nov 28 20:46:42 2016 -0800
files:       mercurial/wireproto.py tests/test-ssh-bundle1.t tests/test-ssh.t
description:
wireproto: only advertise HTTP-specific capabilities to HTTP peers (BC)

Previously, the capabilities list was protocol agnostic and we
advertised the same capabilities list to all clients, regardless of
transport protocol.

A few capabilities are specific to HTTP. I see no good reason why we
should advertise them to SSH clients. So this patch limits their
advertisement to HTTP clients.

This patch is BC, but SSH clients shouldn't be using the removed
capabilities so there should be no impact.


What means "BC"?

Maybe Phabricator is a good reason to keep the full information?

Do you think you can refine that for a future release of Mercurial?

In contrast, it is probably difficult to get a patch accepted by the
Phabricator project, because they are only using rather old hg 2.8.2 for their
main installation, and 2.6.2, 3.5.1 in their tests:
https://github.com/phacility/phabricator/blob/master/src/applications/diffusion/protocol/__tests__/DiffusionMercurialWireProtocolTests.php


Makarius
___
Mercurial mailing list
Mercurial@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial


Re: Odd httpheader=1024 required for Phabricator

2020-02-11 Thread Emile Snyder
On Tue, Feb 11, 2020 at 1:34 PM Makarius wrote:

> ...
> This patch is BC, but SSH clients shouldn't be using the removed
> capabilities so there should be no impact.
>
>
> What means "BC"?
>

I suspect "backwards compatible"?


D7630: absorb: make the absorbed changeset be automatically "evolved"

2020-02-11 Thread marmoute (Pierre-Yves David)
marmoute added a comment.


  It looks like this series is introducing a UI change of the same kind as the
one @martinvonz is looking into for `hg copy`. I'll try to have a look at both
of them tomorrow.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7630/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7630

To: rdamazio, #hg-reviewers
Cc: marmoute, pulkit, martinvonz, mercurial-devel