Re: Thanks for notmuch-lore

2022-03-22 Thread Carl Worth
On Tue, Mar 22 2022, Kyle Meyer wrote:
> I may be missing something (I didn't know about notmuch-lore before
> seeing it mentioned here), but it looks like the initialization step of
> notmuch-lore's pre-new handles that already.  You just need to set
> `since` far enough back:

Hmm... I did see the "since" parameter and cranked it back.

It didn't seem to do what I wanted, but it's possible the bug only
shows up with multi-epoch archives (I was trying to bring in LKML).

From poking at it, it looked like it did perform a "deepening" operation
using the "since" parameter after the initial clone, but then didn't use
anything older than the most-recent upstream commit for the range of
commits from which to get messages out.

But my examination of the code and behavior was very cursory, I admit.

> Also, just to list some other options in this space, l2md and impibe are
> mentioned at <https://public-inbox.org/clients.html> as tools for
> converting public-inbox archives into maildir format.  (I haven't used
> either myself.)

Thanks! I clearly didn't look quite hard enough. I appreciate the
pointers.

-Carl


___
notmuch mailing list -- notmuch@notmuchmail.org
To unsubscribe send an email to notmuch-le...@notmuchmail.org


Re: Thanks for notmuch-lore

2022-03-21 Thread Kyle Meyer
Carl Worth writes:

> On Tue, Feb 01 2022, Tobias Waldekranz wrote:
>> I actually gave up on getting my mailing lists from my email provider,
>> now I just download it directly from lore. I hacked together a script
>> that will scrape a public-inbox repo and convert it to a Maildir:
>>
>> https://github.com/wkz/notmuch-lore
>
> Thanks for sharing this, Tobias. I needed exactly this today, and was
> happy to have found this.
>
> It looks like you've coded something to efficiently do the work that's
> needed periodically, (fetch new emails from the public-inbox git
> repository, convert them to maildir files, and prune away git state
> other than a pointer to what's been converted already).
>
> What I'm missing is the piece to convert over the entire archive from
> the past.

I may be missing something (I didn't know about notmuch-lore before
seeing it mentioned here), but it looks like the initialization step of
notmuch-lore's pre-new handles that already.  You just need to set
`since` far enough back:

--8<---cut here---start->8---
tmphome=$(mktemp -d "${TMPDIR:-/tmp}"/nm-lore-XXX)
cd "$tmphome"

HOME="$tmphome"
export HOME

mkdir mail
notmuch setup
notmuch new

mkdir -p mail/.notmuch/.lore  mail/.notmuch/hooks

cat >mail/.notmuch/.lore/sources <<'EOF'
[gwl]
url=https://yhetil.org/gwl/git
since=50 years ago
EOF

curl -fSsL \
 https://raw.githubusercontent.com/wkz/notmuch-lore/3e2a13b32b178a4d3296cee6f69ee3491eebdb9f/pre-new \
 >mail/.notmuch/hooks/pre-new
chmod +x mail/.notmuch/hooks/pre-new
./mail/.notmuch/hooks/pre-new
--8<---cut here---end--->8---

That returns the number of messages I expect for that (small) archive:

  $ find mail/gwl -type f | wc -l
  288

Also, just to list some other options in this space, l2md and impibe are
mentioned at <https://public-inbox.org/clients.html> as tools for
converting public-inbox archives into maildir format.  (I haven't used
either myself.)

Tobias, just a note of something I saw when looking over the script:

$git rev-list $3 | while read sha; do
  $git show $sha:m >$db/$1/new/$sha
done

This would error if it encounters a deleted message in the archive,
because the tree of that commit contains a "d" file instead of an
"m".  See <https://public-inbox.org/public-inbox-v2-format.html>.
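
One possible way to guard the loop is to check for the "m" blob before
showing it. This is only a sketch, not a patch against notmuch-lore:
export_messages and its three arguments are names made up here, and it
assumes each commit's tree holds either "m" or "d" as the v2 format
document describes.

```shell
#!/bin/sh
# export_messages GIT_DIR REV_RANGE MAILDIR
# Hypothetical helper (not part of notmuch-lore): write the "m" blob of
# every commit in REV_RANGE to MAILDIR/new, skipping deletion commits.
export_messages() {
    git_dir=$1 range=$2 maildir=$3
    mkdir -p "$maildir/new" "$maildir/cur" "$maildir/tmp"
    git --git-dir="$git_dir" rev-list "$range" |
    while read -r sha; do
        # cat-file -e exits non-zero when "$sha:m" does not exist,
        # which is the case for a commit that only carries "d".
        if git --git-dir="$git_dir" cat-file -e "$sha:m" 2>/dev/null; then
            git --git-dir="$git_dir" show "$sha:m" >"$maildir/new/$sha"
        fi
    done
}
```

Note that this only avoids the error. A message that was written out
from its original commit is not removed again when the later deletion
commit is skipped.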


Thanks for notmuch-lore

2022-03-21 Thread Carl Worth
On Tue, Feb 01 2022, Tobias Waldekranz wrote:
> I actually gave up on getting my mailing lists from my email provider,
> now I just download it directly from lore. I hacked together a script
> that will scrape a public-inbox repo and convert it to a Maildir:
>
> https://github.com/wkz/notmuch-lore

Thanks for sharing this, Tobias. I needed exactly this today, and was
happy to have found this.

It looks like you've coded something to efficiently do the work that's
needed periodically, (fetch new emails from the public-inbox git
repository, convert them to maildir files, and prune away git state
other than a pointer to what's been converted already).

What I'm missing is the piece to convert over the entire archive from
the past.

I can fetch it all easily enough with public-inbox-clone. Maybe what I
want could be captured in a tool named something like:

public-inbox-export --output=maildir

After which I'd be all bootstrapped and ready to use your notmuch-lore
pre-new hook.

> As you can tell from the name, it is tailored for plugging into notmuch,
> but the guts are pretty generic.

Indeed. And it looks like all the code I would need for the export I
described above is right there in your script. It's as simple as:

git rev-list HEAD | while read -r sha; do
  git show "$sha:m" > "$maildir/new/$sha"
done

So, next I should go put together a patch against public-inbox to add
that.
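
For a multi-epoch archive the same loop would have to run once per
epoch. A rough sketch of that follows. It assumes the v2 layout in which
the epochs sit in git/0.git, git/1.git, and so on under the inbox
directory, and that HEAD in each epoch points at the branch holding the
messages. export_epochs is a made-up name, not an existing
public-inbox command.

```shell
#!/bin/sh
# export_epochs INBOX_DIR MAILDIR
# Hypothetical helper: walk every epoch repository under INBOX_DIR/git
# and write each commit's "m" blob into MAILDIR/new.
export_epochs() {
    inbox=$1 maildir=$2
    mkdir -p "$maildir/new" "$maildir/cur" "$maildir/tmp"
    for epoch in "$inbox"/git/*.git; do
        # If the glob matched nothing it stays literal; skip it.
        [ -d "$epoch" ] || continue
        git --git-dir="$epoch" rev-list HEAD |
        while read -r sha; do
            out=$maildir/new/$sha
            # Commits without an "m" blob (deletions) make git show
            # fail; remove the empty file left by the redirection.
            git --git-dir="$epoch" show "$sha:m" >"$out" 2>/dev/null ||
                rm -f "$out"
        done
    done
}
```

Something like `export_epochs ~/lkml ~/mail/lkml` would then fill the
maildir before the pre-new hook takes over; both paths are placeholders.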

Thanks again,

-Carl

PS. I debated whether to CC lkml, where the message I was replying to
was originally posted. I decided against it and almost just emailed
Tobias alone, but I really do want discussion like this to be archived
in public. So I CCed the notmuch mailing list at least.

