On Thu, 16 Mar 2017, David Bremner wrote:
> Daniel Kahn Gillmor writes:
>
>> On Wed 2017-03-15 21:57:28 -0400, David Bremner wrote:
>>> The corresponding xapian document just gets more terms added to it,
>>> but this doesn't seem to break anything.
>>
We can't very well call it uuencode if it is going to filter other
things as well.
---
lib/index.cc | 92 +++-
1 file changed, 48 insertions(+), 44 deletions(-)
diff --git a/lib/index.cc b/lib/index.cc
index 02b35b81..3bb1ac1c 100644
---
To match things more complicated than fixed strings, we need states
with multiple out arrows.
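
For illustration only, here is a minimal sketch of what "states with
multiple out arrows" can look like in a table-driven scanner. It is not
the code from lib/index.cc: the names Transition, html_table, step and
strip_tags are hypothetical, and the table is an assumed, simplified
HTML tag stripper in which the TAG state has three outgoing arrows (end
of tag, double quote, single quote).

```cpp
#include <cassert>
#include <string>
#include <vector>

// One "out arrow": from `state`, on a character in [lo, hi], go to `next`.
// (Hypothetical layout, assumed for this sketch.)
struct Transition {
    int state;
    unsigned char lo;
    unsigned char hi;
    int next;
};

enum { TEXT = 0, TAG = 1, DQUOTE = 2, SQUOTE = 3 };

// A state may appear in several rows; each row is one out arrow.
// Rows are tried in order and the first match wins.
static const std::vector<Transition> html_table = {
    {TEXT,   '<',  '<',  TAG},
    {TAG,    '>',  '>',  TEXT},
    {TAG,    '"',  '"',  DQUOTE},
    {TAG,    '\'', '\'', SQUOTE},
    {DQUOTE, '"',  '"',  TAG},
    {SQUOTE, '\'', '\'', TAG},
};

// Follow the first matching arrow; with no match, stay in the same state.
static int step(const std::vector<Transition> &table, int state, unsigned char c)
{
    for (const Transition &t : table)
        if (t.state == state && t.lo <= c && c <= t.hi)
            return t.next;
    return state;
}

// Copy only characters that are read in TEXT and leave the scanner in TEXT.
std::string strip_tags(const std::string &in)
{
    std::string out;
    int state = TEXT;
    for (unsigned char c : in) {
        int next = step(html_table, state, c);
        if (state == TEXT && next == TEXT)
            out.push_back(static_cast<char>(c));
        state = next;
    }
    return out;
}

// Small usage check: a '>' inside a quoted attribute does not end the tag.
inline void check_strip_tags()
{
    assert(strip_tags("a<b>c") == "ac");
    assert(strip_tags("<a title=\"x>y\">link</a>") == "link");
}
```

With a single arrow per state the scanner could only follow one fixed
string; listing several rows for the same state is what lets it branch
on quotes inside a tag.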
---
lib/index.cc | 22 --
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/lib/index.cc b/lib/index.cc
index 3bb1ac1c..fd66762c 100644
--- a/lib/index.cc
+++
Just drop all tags
---
lib/index.cc | 21 -
test/T680-html-indexing.sh | 5 -
2 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/lib/index.cc b/lib/index.cc
index fd66762c..324e6e79 100644
--- a/lib/index.cc
+++ b/lib/index.cc
@@ -206,6 +206,22
The idea is to support more general types of filtering, based on
content type.
---
lib/index.cc | 13 -
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/lib/index.cc b/lib/index.cc
index 8c145540..1c04cc3d 100644
--- a/lib/index.cc
+++ b/lib/index.cc
@@ -56,6 +56,7 @@
We could add a second gmime filter subclass, but prefer to avoid
duplicating the boilerplate.
---
lib/index.cc | 14 --
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/lib/index.cc b/lib/index.cc
index 1c04cc3d..74a750b9 100644
--- a/lib/index.cc
+++ b/lib/index.cc
@@
Steven Allen pointed out [2] that the previous scanner [1] was a
little too simplistic. This version handles (or claims to) quoted
strings in attributes, which can apparently contain '>' and '<'
characters. This required generalizing the state machine runner a bit
[3] to handle states with
'quite' on IRC reported that notmuch new was grinding to a halt during
initial indexing, and we eventually narrowed the problem down to some
html parts with large embedded images. These cause the number of terms
added to the Xapian database to explode (the first 400 messages
generated 4.6M unique
We want to reuse the scanner definition with a different table
---
lib/index.cc | 81 +++-
1 file changed, 47 insertions(+), 34 deletions(-)
diff --git a/lib/index.cc b/lib/index.cc
index 74a750b9..02b35b81 100644
--- a/lib/index.cc
+++
David Bremner writes:
> We plan a sequence of ABI breaking changes. Put the SONAME change in a
> separate commit to make reordering easier.
I have pushed this series to master. I don't plan on bumping the SONAME
for every breakage before the next release, so if you are
David Bremner writes:
> Since this is an ABI breaking change, bump the SONAME.
pushed, although the SONAME bump was already there from the previous
series.
d
___
notmuch mailing list
notmuch@notmuchmail.org
David Bremner writes:
> From: David Bremner
> Subject: Re: RFC: drop html tags
> To: Steven Allen
> Date: Tue, 21 Mar 2017 14:03:10 -0300
>
> Steven Allen writes:
>
>> In the JavaScript regex format, I believe
This patch is good. notmuch now gets through my whole archive of 175k mails,
memory usage peaking at 430M.