Hi Janos,

Thanks for the quick reply. Yes they're all different... but very oddly
formatted.

I can send you a copy of the output directly if you'd like, as I don't want
to expose a bunch of emails. But the gist is that many have periods,
dashes, or numbers preceding the actual email address, in some cases as odd
as "------@------.com", "----@hotmail.com", "-------@hotmail.com",
"-@..immature", "-@..khairy", "-@d.i.s.c.o", "......@www.hip",
"......redactedhotmailaddr...@hotmail.com..........me", "._.pitch@n._", etc.

I did a pilerget on the actual email and it contains one TO address (a
correct address, the only odd thing I noticed is that the TO address is in
all caps, and from looking at the DB/sphinx I know in some places you were
using "X" as a placeholder for periods.

I just migrated from 1.1.0 (since 2015) and set up a new server on 1.3.8 -
so the issue would have been (at least with) 1.1.0.  My upgrade method was
a bit of a mix after reading all your documentation on upgrading and server
migrations (since I was doing both), as well as Bitbucket issues. I know
it's been a while, but I really wanted to take the time to understand Piler
before running an upgrade and potentially screw something up.

I looked for a changelog with a fix like this, but I didn't see a formal
one, and looking through all the commits since 2015 was going to take a
LONG time so I figured I'd ask here. I can't see if I still get new rcpt
rows for that id, as I tossed my backups of prior DB versions on the old
server already, but I can certainly pay attention on the new server.

select id, COUNT(*) from rcpt GROUP BY id HAVING COUNT(*) > 50 ORDER BY
COUNT(*) ASC; shows that there are definitely a couple other id's with this
issue, and likely many more, but there is a steep dropoff from 3 million to
30k to 200-500 range per ID. Maybe you (or other users) can use that query
to check the recipient count for each stored email for similar issues.

I just wanted to make sure that my assumption on the rcpt table was correct
(e.g. it should only have the "to" recipients for each email matching on
metadata.id), so that I can delete all those rows and either manually
insert the address found from pilerget or delete the matching metadata row
and just re-import the email to fix the issue. Thanks in advance.


On Sat, May 2, 2020 at 8:29 AM <s...@acts.hu> wrote:

>
> Hello Ryan,
>
> this must be definitely a bug. What piler version do you use?
> Try selecting the first 1000 rows for id=37, and check if the
> recipients are actually all different.
>
> Do you still get new rcpt rows for id=37?
> I suspect that piler tries to keep processing the very same email.
>
> Janos
>
>
> On 2020-05-02 09:41, Ryan Blenis wrote:
> > Hello,
> >
> > Just a question about the table. I took a look at the data, and I've
> > got 23 million rows in there, however I kept seeing id 37 pop up. I
> > queried for "SELECT COUNT(*) FROM rcpt WHERE id = 37" and got a count
> > of 3,116,977.
> >
> > I was under the impression (perhaps mistakenly) that the table was
> > used to store recipients of the email linked on metadata.id [1] - but
> > certainly no email had 3 million recipients. Can you let me know what
> > I'm missing here? Thank you.
> >
>

Reply via email to