Sorry - I've been following along via Nabble whilst Gmail sorts out my
email, and it looks like the full discussion wasn't showing up there, so
I'm only seeing these replies now.


tabris wrote:
> 
>> You would be slowing down the main use case for single-instance storage.
>> Basically it would increase the amount of network IO even further for
>> the case where a given blob is already in the database. Your approach
>> will only speed up case where a given blob hasn't been seen before.
> 
>     True and well understood when I made the suggestion, but I don't
> think the increase in bandwidth will be too high, and on the average
> will decrease the bandwidth across the wire (you won't be, by the most
> common case, sending the blob across the wire twice).
> 

This is what made your solution seem like a win-win to me, because of the
assumption that 99% of the time you wouldn't need to send the blob across
to the database. If, 99% of the time, we avoid sending a 1MB blob across
the network twice and making the database load it once unnecessarily, then
I think the performance hit of a potential double query should be fine?
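
To put rough numbers on that trade-off, here is a back-of-the-envelope
sketch. It is only a model built on my reading of the two approaches, not
DBMail code: the function names, the 64-byte hash size and the miss rates
are all assumptions for illustration, and only the 1MB blob size comes from
the example above. It counts bytes sent to the database per blob and
ignores query overhead and round-trip latency.

```python
def bytes_blob_in_select(blob_size, hash_size, miss_rate):
    """Assumed model: the lookup sends hash + blob; on a miss the
    insert sends hash + blob a second time."""
    lookup = hash_size + blob_size
    insert = hash_size + blob_size
    return lookup + miss_rate * insert


def bytes_hash_first(blob_size, hash_size, miss_rate):
    """Assumed model: the lookup sends only the hash; the blob then
    goes across once, either to verify a hit or to insert on a miss."""
    lookup = hash_size
    verify = hash_size + blob_size  # second query when the hash is found
    insert = hash_size + blob_size  # insert when the hash is not found
    return lookup + (1 - miss_rate) * verify + miss_rate * insert


BLOB = 1_000_000  # the 1MB blob from the example above
HASH = 64         # hypothetical hash length in bytes

# miss_rate = fraction of blobs not already in the database (hypothetical)
for miss_rate in (0.99, 0.50, 0.01):
    a = bytes_blob_in_select(BLOB, HASH, miss_rate)
    b = bytes_hash_first(BLOB, HASH, miss_rate)
    print(f"miss rate {miss_rate:.2f}: blob-in-select {a:,.0f} bytes, "
          f"hash-first {b:,.0f} bytes")
```

Under these assumptions the hash-first approach only costs one extra hash
(plus a round trip) when the blob already exists, and it sends fewer bytes
overall whenever the miss rate is above hash_size / (hash_size + blob_size).
Whether the extra round trip matters would need measuring on real data.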

Again, as I said in my master/slave post, my analysis/work on DBMail at the
moment is mostly theoretical, backed only by my local test setup here, which
is nothing like a production environment. I don't have a big database of
legit mail against which I can run queries to draw conclusions about
mimepart space savings, collisions, etc.
-- 
View this message in context: 
http://old.nabble.com/blob_exists---selects-based-on-blob-as-well-as-hash--tp31470216p31474812.html
Sent from the dbmail dev mailing list archive at Nabble.com.

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
