[
https://issues.apache.org/jira/browse/CASSANDRA-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462097#comment-13462097
]
Sylvain Lebresne commented on CASSANDRA-4449:
---------------------------------------------
bq. Actually I don't see a reason to use something as heavyweight as MD5.
The advantage of using a hash of the query string as the ID is that you only
ever store one prepared statement for a given query. That saves memory in
practice, because a node will have many clients connected to it and they will
usually all prepare the same set of queries. It also gives you some protection
against buggy/crappy clients that re-prepare the same query again and again,
though that's a more minor point. As for MD5 being heavyweight, I don't think
this matters in the case of prepared statements.
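
To make the deduplication argument concrete, here is a minimal sketch of a node-wide cache keyed by the MD5 of the query string. The class and method names (PreparedStatementCache, prepare, md5Hex) are hypothetical and this is not the code from the attached patches; the "prepared statement" is stood in for by the raw query string.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch, not Cassandra's actual classes.
class PreparedStatementCache {
    // One map per node, shared by all connections: statement ID -> statement.
    // The statement is represented by its query string for illustration only.
    private final ConcurrentMap<String, String> statements = new ConcurrentHashMap<>();

    // Hex-encoded MD5 of the query string, used as the statement ID.
    public static String md5Hex(String query) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(query.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    // Preparing the same query twice, from any connection, stores it only once.
    public String prepare(String query) {
        String id = md5Hex(query);
        statements.putIfAbsent(id, query);
        return id;
    }

    public int size() {
        return statements.size();
    }
}
```

With such a cache, a client that prepares the same query on every connection of its pool, or re-prepares it in a loop, gets the same ID back each time and adds nothing to the map.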
> Make prepared statement global rather than connection based
> -----------------------------------------------------------
>
> Key: CASSANDRA-4449
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4449
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Assignee: Sylvain Lebresne
> Labels: binary_protocol
> Fix For: 1.2.0 beta 2
>
> Attachments: 4449.txt, 4449-v2.txt
>
>
> Currently, prepared statements are connection based. A client can only use a
> prepared statement on the connection it prepared it on, and if you prepare
> the same statement on multiple connections, we'll keep multiple copies of
> it. This is potentially inefficient, but it can also be fairly painful for
> client libraries with a pool of connections (a.k.a. all reasonable client
> libraries ever), as it means you need to make sure you prepare each statement
> on every connection of the pool, including connections that don't exist yet
> but might be created later.
> This ticket suggests making prepared statements global (at least for CQL3),
> i.e. moving them out of ClientState. This will likely reduce the number of
> stored statements on a given node quite a bit, since it's very likely that
> all clients of a given node will prepare the same statements (and potentially
> on all of their connections to the node). And given that prepared statement
> identifiers are the hashCode() of the query string, this should be fairly
> trivial.
> I will note that while I think using a hash of the string as the identifier
> is a very good idea, I don't know if the default Java hashCode() is good
> enough. If that's a concern, maybe we should use a safer (but longer) hash
> like MD5 or SHA-1. But we'd better do that now.
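
As an illustration of the hashCode() concern raised in the description, the sketch below shows two different query strings that share the same 32-bit String.hashCode(). The class name and the table names Aa and BB are made up for the example; the collision follows from "Aa" and "BB" having equal hash codes and the two strings sharing a prefix.

```java
// Hypothetical demo class, for illustration only.
class HashCodeCollision {
    // True when two distinct strings map to the same String.hashCode().
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        String q1 = "SELECT * FROM Aa";
        String q2 = "SELECT * FROM BB";
        // With hashCode() as the global statement ID, these two queries
        // would be assigned the same identifier.
        System.out.println(collide(q1, q2));
    }
}
```

Once statements are global, such a collision would make two unrelated queries share one identifier, which is the case a longer hash like MD5 or SHA-1 makes far less likely.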