On 2020-05-03 00:15, Ryan Blenis wrote:
Thanks for the quick reply. Yes they're all different... but very
I can send you a copy of the output directly if you'd like, as I don't
want to expose a bunch of emails. But the gist is that many have
periods, dashes, or numbers preceding the actual email address, in
some cases as odd as "------@------.com", "----@hotmail.com ",
"-------@hotmail.com ", "-@..immature", "-@..khairy",
indeed, they are clearly bogus addresses.
I did a pilerget on the actual email and it contains one TO address (a
correct address, the only odd thing I noticed is that the TO address
is in all caps, and from looking at the DB/sphinx I know in some
places you were using "X" as a placeholder for periods.
I just migrated from 1.1.0 (since 2015) and set up a new server on
1.3.8 - so the issue would have been (at least with) 1.1.0. My
upgrade method was a bit of a mix after reading all your documentation
on upgrading and server migrations (since I was doing both), as well
as Bitbucket issues. I know it's been a while, but I really wanted to
take the time to understand Piler before running an upgrade and
potentially screw something up.
I looked for a changelog with a fix like this, but I didn't see a
formal one, and looking through all the commits since 2015 was going
to take a LONG time so I figured I'd ask here. I can't see if I still
get new rcpt rows for that id, as I tossed my backups of prior DB
versions on the old server already, but I can certainly pay attention
on the new server.
select id, COUNT(*) from rcpt GROUP BY id HAVING COUNT(*) > 50 ORDER
BY COUNT(*) ASC; shows that there are definitely a couple other id's
with this issue, and likely many more, but there is a steep dropoff
from 3 million to 30k to 200-500 range per ID. Maybe you (or other
users) can use that query to check the recipient count for each stored
email for similar issues.
I just did, and found 10 such emails. Some of them had even ~150
After checking those 10 emails I found that they are all legitimate in
a sense that the email really consisted of that many recipient, and the
rcpt table entries were valid email addresses.
So fortunately the issue is not present in my personal archive.
I just wanted to make sure that my assumption on the rcpt table was
correct (e.g. it should only have the "to" recipients for each email
matching on metadata.id ), so that I can delete all those rows and
yes, you are right.
either manually insert the address found from pilerget or delete the
matching metadata row and just re-import the email to fix the issue.
In the latter case you'll get two hits for the same message from sphinx,
and one of them would be a ghost, because you deleted its corresponding
On Sat, May 2, 2020 at 8:29 AM <s...@acts.hu> wrote:
this must be definitely a bug. What piler version do you use?
Try selecting the first 1000 rows for id=37, and check if the
recipients are actually all different.
Do you still get new rcpt rows for id=37?
I suspect that piler tries to keep processing the very same email.
On 2020-05-02 09:41, Ryan Blenis wrote:
Just a question about the table. I took a look at the data, and
got 23 million rows in there, however I kept seeing id 37 pop up.
queried for "SELECT COUNT(*) FROM rcpt WHERE id = 37" and got a
I was under the impression (perhaps mistakenly) that the table was
used to store recipients of the email linked on metadata.id 
 - but
certainly no email had 3 million recipients. Can you let me know
I'm missing here? Thank you.