On 18/10/2021 00.25, sebb wrote:
On Sun, 17 Oct 2021 at 23:19, <[email protected]> wrote:

This is an automated email from the ASF dual-hosted git repository.

humbedooh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-ponymail-foal.git


The following commit(s) were added to refs/heads/master by this push:
      new d843003  +1 will suffice
d843003 is described below

commit d8430036d92e8a89c693277a7f5c5c4c262f352c
Author: Daniel Gruno <[email protected]>
AuthorDate: Mon Oct 18 00:19:40 2021 +0200

     +1 will suffice
---
  tools/archiver.py | 2 +-
  tools/migrate.py  | 3 ++-
  2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/archiver.py b/tools/archiver.py
index f14d7f6..93a29a8 100755
--- a/tools/archiver.py
+++ b/tools/archiver.py
@@ -588,7 +588,7 @@ class Archiver(object):  # N.B. Also used by import-mbox.py

              notes.append(["ARCHIVE: Email archived as %s at %u" % 
(document_id, time.time())])
              body_unflowed = body.unflow() if body else ""
-            body_shortened = body_unflowed[:SHORT_BODY_MAX_LEN+10]  # +10 so 
that we can tell if larger than std short body.
+            body_shortened = body_unflowed[:SHORT_BODY_MAX_LEN+1]  # +1 so 
that we can tell if larger than std short body.

              output_json = {
                  "from_raw": msg_metadata["from"],
diff --git a/tools/migrate.py b/tools/migrate.py
index 2493465..c46b8ef 100644
--- a/tools/migrate.py
+++ b/tools/migrate.py
@@ -201,7 +201,8 @@ def process_document(old_es, doc, old_dbname, 
dbname_source, dbname_mbox, do_dki
      doc["_source"]["dbid"] = hashlib.sha3_256(source_text).hexdigest()

      # Add in shortened body for search aggs
-    doc["_source"]["body_short"] = 
doc["_source"]["body"][:archiver.SHORT_BODY_MAX_LEN+10]
+    # We add +1 to know whether to use ellipsis in reports.
+    doc["_source"]["body_short"] = 
doc["_source"]["body"][:archiver.SHORT_BODY_MAX_LEN+1]

Why +1 here?

Cosmetic reasons, we need to know whether to add '...' to the body when potentially shortening it. If we cap at 200, we won't know if it's >200, so we cap at 200+1. Alternatively, we would add a new field that had either a bool (shortened or not) or add a body_length field that we could work with, but that seems a tad overkill. HTH


      # Add in gravatar
      header_from = doc["_source"]["from"]

Reply via email to