Sorry - I've been following along via Nabble whilst Gmail sort out my email, and it looks like the full discussion wasn't showing up there, so I'm only seeing these replies now.

tabris wrote:
>
>> You would be slowing down the main use case for single-instance storage.
>> Basically it would increase the amount of network IO even further for
>> the case where a given blob is already in the database. Your approach
>> will only speed up case where a given blob hasn't been seen before.
>
> True and well understood when I made the suggestion, but I don't
> think the increase in bandwidth will be too high, and on the average
> will decrease the bandwidth across the wire (you won't be, by the most
> common case, sending the blob across the wire twice).
>

This is what made your solution seem like a win-win to me, because of the assumption that 99% of the time you wouldn't need to send the blob across to the database. If 99% of the time we avoid sending a 1MB blob across the network twice and making the database load it unnecessarily, I think the performance hit of a potential double query should be fine?

Again, like I said in my master/slave post, my analysis/work on DBMail at the moment is mostly theoretical, backed by my local test setup here, which is nothing like a production environment. I don't have a big database of legit mail where I can run queries to draw conclusions about mimepart space savings, collisions, etc.
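
To make the trade-off concrete, here is a rough sketch of one reading of the hash-first idea discussed above. It is not DBMail code: the `parts` table, the `store_part` and `open_demo_db` names, SHA-256 and SQLite are all assumptions picked so the example runs on its own. The second return value counts how many statements carried the blob itself.

```python
import hashlib
import sqlite3


def open_demo_db():
    """Create an in-memory database with a hypothetical single-instance table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE parts (id INTEGER PRIMARY KEY, hash TEXT, data BLOB)"
    )
    conn.execute("CREATE INDEX parts_hash ON parts (hash)")
    return conn


def store_part(conn, data):
    """Store `data` once; return (row id, statements that carried the blob)."""
    digest = hashlib.sha256(data).hexdigest()

    # Query 1: look up by hash only, so the blob does not cross the wire.
    row = conn.execute(
        "SELECT id FROM parts WHERE hash = ?", (digest,)
    ).fetchone()
    if row is None:
        # Unseen hash: the blob is sent exactly once, for the insert.
        cur = conn.execute(
            "INSERT INTO parts (hash, data) VALUES (?, ?)", (digest, data)
        )
        return cur.lastrowid, 1

    # Known hash: query 2 sends the blob so the database can confirm the
    # bytes really match (this guards against a hash collision).
    row = conn.execute(
        "SELECT id FROM parts WHERE hash = ? AND data = ?", (digest, data)
    ).fetchone()
    if row is not None:
        return row[0], 1

    # Same hash, different bytes: store the new blob as its own row.
    cur = conn.execute(
        "INSERT INTO parts (hash, data) VALUES (?, ?)", (digest, data)
    )
    return cur.lastrowid, 2


conn = open_demo_db()
first_id, _ = store_part(conn, b"example mime part")
second_id, _ = store_part(conn, b"example mime part")
print(first_id == second_id)  # the duplicate reuses the stored row
```

In this sketch the blob goes over the wire twice only when the hash matches but the bytes differ. Whether the extra hash-only query pays for itself depends on how often each branch is taken against real mail, which is the part I can't measure on a test setup.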