Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9
On Wed, Jul 23, 2014 at 1:53 AM, Robert Coli rc...@eventbrite.com wrote:
> On Tue, Jul 22, 2014 at 1:53 AM, Ben Hood 0x6e6...@gmail.com wrote:
>
> In this particular case, the answer to why not involves the idea that one needs to be able to test with a driver in order to expose it, and currently (as I understand it) only distributed tests use a driver. I believe that operators expect there to be a robust representative test schema that can be created on version X.Y.Z and be accessed on version X+1.y.0 which would exercise this core code and increase confidence that tables created in major version X will always be usable without exception in X+1.

With gocql we currently run our integration test suite on Travis against 1.2.18 and 2.0.9 (*), but in each case, we install the server from a clean slate. Theoretically we could do a migration, but the clean slate makes things easier for us. One could argue however that verifying server migration is beyond the scope of the integration test suite for a client driver.

(*) We've looked at including 2.1rc3, but there is an acknowledged server-side bug that causes one of our tests to fail, so we do not have mainline coverage for 2.1-rcx yet.
Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9
On Tue, Jul 22, 2014 at 1:26 AM, Robert Coli rc...@eventbrite.com wrote:
> I'm pretty sure reversed comparator timestamps are a common type of schema, given that there are blog posts recommending their use, so I struggle to understand how this was not detected by unit tests.

As Karl has suggested, client driver maintainers have opted to work around the issue. At gocql, when we ran into this issue, we began a discussion thread to see if this was likely to be a client-side or a server-side bug. Because we didn't get a response from the discussion, we thought that the most pragmatic thing to do was to implement a workaround in the client. Potentially other driver maintainers have taken a similar course of action.

As for the unit tests, I think this issue was only reproducible when upgrading a schema to 2.0.x - are you suggesting that there was/is test coverage for this scenario in the server?
Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9
On Tue, Jul 22, 2014 at 1:53 AM, Ben Hood 0x6e6...@gmail.com wrote:
> As Karl has suggested, client driver maintainers have opted to work around the issue.

Indeed, reading up on the issue (and discussing it with folks) there are a number of mitigating factors, most significantly driver workarounds and the use of TimeUUIDs, which made this issue less common than reversed-comparator use cases are. I still consider it a serious issue due to the nature of the regression, but it is fair to say not as serious as my initial reaction.

> As for the unit tests, I think this issue was only reproducible when upgrading a schema to 2.0.x - are you suggesting that there was/is test coverage for this scenario in the server?

No, I was wondering why such a test, which tests for regression in very basic table access and appears to require no distribution, does not currently exist.

In this particular case, the answer to why not involves the idea that one needs to be able to test with a driver in order to expose it, and currently (as I understand it) only distributed tests use a driver. I believe that operators expect there to be a robust representative test schema that can be created on version X.Y.Z and be accessed on version X+1.y.0 which would exercise this core code and increase confidence that tables created in major version X will always be usable without exception in X+1.

=Rob
Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9
On Sat, Jul 19, 2014 at 7:35 PM, Karl Rieb karl.r...@gmail.com wrote:
> Can now be followed at: https://issues.apache.org/jira/browse/CASSANDRA-7576.

Nice work! Finally we have a proper solution to this issue, so well done to you.
Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9
On Mon, Jul 21, 2014 at 1:58 AM, Ben Hood 0x6e6...@gmail.com wrote:
> On Sat, Jul 19, 2014 at 7:35 PM, Karl Rieb karl.r...@gmail.com wrote:
>> Can now be followed at: https://issues.apache.org/jira/browse/CASSANDRA-7576.
>
> Nice work! Finally we have a proper solution to this issue, so well done to you.

For reference, I consider this issue of sufficient severity to recommend against upgrading to any version of 2.0 before 2.0.10, unless you are certain you have no such schema.

I'm pretty sure reversed comparator timestamps are a common type of schema, given that there are blog posts recommending their use, so I struggle to understand how this was not detected by unit tests.

Does your fix add unit tests which would catch this case on upgrade?

=Rob
Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9
I did not include unit tests in my patch.

I think many people did not run into this issue because many Cassandra clients handle the DateType when found as a CUSTOM type.

-Karl

On Jul 21, 2014, at 8:26 PM, Robert Coli rc...@eventbrite.com wrote:
> On Mon, Jul 21, 2014 at 1:58 AM, Ben Hood 0x6e6...@gmail.com wrote:
>> On Sat, Jul 19, 2014 at 7:35 PM, Karl Rieb karl.r...@gmail.com wrote:
>>> Can now be followed at: https://issues.apache.org/jira/browse/CASSANDRA-7576.
>>
>> Nice work! Finally we have a proper solution to this issue, so well done to you.
>
> For reference, I consider this issue of sufficient severity to recommend against upgrading to any version of 2.0 before 2.0.10, unless you are certain you have no such schema.
>
> I'm pretty sure reversed comparator timestamps are a common type of schema, given that there are blog posts recommending their use, so I struggle to understand how this was not detected by unit tests.
>
> Does your fix add unit tests which would catch this case on upgrade?
>
> =Rob
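A minimal sketch of the kind of client-side handling Karl mentions, assuming the server reports the column as CUSTOM with the Java class name of the old date type. The class and method names here are hypothetical and this is not the actual gocql or DataStax driver code; the two type IDs are the native protocol values for custom (0x0000) and timestamp (0x000B).

```java
// Hypothetical illustration only: not taken from any real driver.
class CustomDateTypeWorkaround {
    // Native protocol type IDs for "custom" and "timestamp".
    static final int CUSTOM = 0x0000;
    static final int TIMESTAMP = 0x000B;

    // Class name the server sends as the CUSTOM payload for the old date type.
    static final String DATE_TYPE_CLASS = "org.apache.cassandra.db.marshal.DateType";

    // Returns the type ID the client should decode with.
    // A CUSTOM column whose class name is DateType is treated as a timestamp;
    // everything else is passed through unchanged.
    static int effectiveTypeId(int protocolId, String customClassName) {
        if (protocolId == CUSTOM && DATE_TYPE_CLASS.equals(customClassName)) {
            return TIMESTAMP;
        }
        return protocolId;
    }

    public static void main(String[] args) {
        int id = effectiveTypeId(CUSTOM, DATE_TYPE_CLASS);
        System.out.println("decode as timestamp: " + (id == TIMESTAMP));
    }
}
```

A client that applies a mapping like this decodes the affected columns as timestamps regardless of which ID the server sends, which would explain why users of such clients never saw the error.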
Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9
Ben! I think I have an idea of exactly where the bug is!

I did some more searching and discovered the difference that causes some tables to produce the wrong type and others to be okay: *the tables with the wrong type reverse the ordering of the timestamp column*.

The bug is in org.apache.cassandra.transport.DataType:fromType(AbstractType):

    public static Pair<DataType, Object> fromType(AbstractType type) {
        // For CQL3 clients, ReversedType is an implementation detail and they
        // shouldn't have to care about it.
        if (type instanceof ReversedType)
            type = ((ReversedType)type).baseType;
        // For compatibility sake, we still return DateType as the timestamp type in resultSet metadata (#5723)
        else if (type instanceof DateType)
            type = TimestampType.instance;

        DataType dt = dataTypeMap.get(type);
        if (dt == null) {
            if (type.isCollection()) {
                if (type instanceof ListType) {
                    return Pair.<DataType, Object>create(LIST, ((ListType)type).elements);
                } else if (type instanceof MapType) {
                    MapType mt = (MapType)type;
                    return Pair.<DataType, Object>create(MAP, Arrays.asList(mt.keys, mt.values));
                } else {
                    assert type instanceof SetType;
                    return Pair.<DataType, Object>create(SET, ((SetType)type).elements);
                }
            }
            return Pair.<DataType, Object>create(CUSTOM, type.toString());
        } else {
            return Pair.create(dt, null);
        }
    }

The issue is the *else if*, which does not check the base type of the reversed column:

    if (type instanceof ReversedType)
        type = ((ReversedType)type).baseType;
    // For compatibility sake, we still return DateType as the timestamp type in resultSet metadata (#5723)
    else if (type instanceof DateType)
        type = TimestampType.instance;

The else should be removed to make it just:

    if (type instanceof ReversedType)
        type = ((ReversedType)type).baseType;
    // For compatibility sake, we still return DateType as the timestamp type in resultSet metadata (#5723)
    if (type instanceof DateType)
        type = TimestampType.instance;

This way we do a check for DateType on the base type of reversed columns! I applied the fix to my 2.0.9 cassandra node and the errors go away!

Could you guys please make this single-word fix?

-Karl

On Fri, Jul 18, 2014 at 1:30 PM, Ben Hood 0x6e6...@gmail.com wrote:
> On Fri, Jul 18, 2014 at 3:03 PM, Karl Rieb karl.r...@gmail.com wrote:
>> Why is the protocol ID correct for some tables but not others?
>
> I have no idea.
>
>> Why does it work when I do a clean install on a new 2.0.x cluster?
>
> I still have no idea.
>
>> The bug seems to be on the Cassandra side and the clients seem to just be providing patches to these issues.
>
> It was reported to the Cassandra list, but there was no answer, potentially because the query was sent to the wrong list, but I don't really know. Maybe it should have gone into Jira, but it's unclear as to whether this is a client or a server issue. In any case, it didn't look like the server behavior was going to change any time soon, so we just took the pragmatic approach in gocql and worked around the issue.
>
>> I will post to the Datastax java driver mailing list and see if they are willing to add a patch.
>
> That sounds like a good idea, seeing as the workaround has been tested before. Sorry to be of little help to you.
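The effect of the stray else can be reproduced in isolation. The sketch below is only an illustration under stated assumptions: the nested classes are hypothetical stand-ins for Cassandra's AbstractType, DateType, TimestampType and ReversedType (not the real classes), and the lookup map holds only the timestamp entry. It shows that a reversed DateType falls through to CUSTOM with the else-if, and resolves to TIMESTAMP once the second check is an independent if.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the server types; not the real Cassandra classes.
class ReversedDateTypeDemo {
    interface AbstractType {}

    static final class TimestampType implements AbstractType {
        static final TimestampType instance = new TimestampType();
        @Override public String toString() { return "TimestampType"; }
    }

    static final class DateType implements AbstractType {
        static final DateType instance = new DateType();
        @Override public String toString() { return "DateType"; }
    }

    static final class ReversedType implements AbstractType {
        final AbstractType baseType;
        ReversedType(AbstractType baseType) { this.baseType = baseType; }
    }

    // Only TimestampType has a protocol mapping, mirroring the situation in the thread.
    static final Map<AbstractType, String> dataTypeMap = new HashMap<>();
    static { dataTypeMap.put(TimestampType.instance, "TIMESTAMP"); }

    // Original logic: once the ReversedType branch is taken, the DateType check is skipped.
    static String fromTypeBuggy(AbstractType type) {
        if (type instanceof ReversedType)
            type = ((ReversedType) type).baseType;
        else if (type instanceof DateType)
            type = TimestampType.instance;
        return lookup(type);
    }

    // Proposed logic: the DateType check also runs on the unwrapped base type.
    static String fromTypeFixed(AbstractType type) {
        if (type instanceof ReversedType)
            type = ((ReversedType) type).baseType;
        if (type instanceof DateType)
            type = TimestampType.instance;
        return lookup(type);
    }

    // Unmapped types are reported as CUSTOM with their string form.
    static String lookup(AbstractType type) {
        String dt = dataTypeMap.get(type);
        return dt != null ? dt : "CUSTOM(" + type + ")";
    }

    public static void main(String[] args) {
        AbstractType reversedDate = new ReversedType(DateType.instance);
        System.out.println("buggy: " + fromTypeBuggy(reversedDate));
        System.out.println("fixed: " + fromTypeFixed(reversedDate));
    }
}
```

Under these assumptions, a reversed TimestampType resolves correctly in both versions, which is consistent with the report that only tables carried over from 1.2.x (where the column is still a DateType) were affected while clean 2.0.x installs were not.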
Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9
Can you submit a ticket in C* JIRA at issues.apache.org?

-- 
Sent from my iPhone

On 19.07.2014 at 16:45, Karl Rieb karl.r...@gmail.com wrote:
> Ben! I think I have an idea of exactly where the bug is!
> [...]
> Could you guys please make this single-word fix?
Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9
Will do!

On Jul 19, 2014, at 11:22 AM, Robert Stupp sn...@snazy.de wrote:
> Can you submit a ticket in C* JIRA at issues.apache.org?
Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9
Can now be followed at: https://issues.apache.org/jira/browse/CASSANDRA-7576.

On Sat, Jul 19, 2014 at 1:03 PM, Karl Rieb karl.r...@gmail.com wrote:
> Will do!
>
> On Jul 19, 2014, at 11:22 AM, Robert Stupp sn...@snazy.de wrote:
>> Can you submit a ticket in C* JIRA at issues.apache.org?
Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9
On Fri, Jul 18, 2014 at 3:38 AM, Karl Rieb karl.r...@gmail.com wrote:
> Any suggestions on what is going on or how to fix it?

I'm not sure how much this will help, but one of the gocql users reported similar symptoms when upgrading to 2.0.6.

We ended up applying a client-side patch to address the issue; the details are here:

https://github.com/gocql/gocql/pull/154

That pull request also references the original bug report:

https://github.com/gocql/gocql/issues/151

Not sure how helpful this will be though.
Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9
Thanks Ben,

I found that thread, but my concern is the inconsistency on the Cassandra side. Why is the protocol ID correct for some tables but not others? Why does it work when I do a clean install on a new 2.0.x cluster? The bug seems to be on the Cassandra side and the clients seem to just be providing patches to these issues.

I will post to the Datastax java driver mailing list and see if they are willing to add a patch.

-Karl

On Jul 18, 2014, at 3:59 AM, Ben Hood 0x6e6...@gmail.com wrote:
> On Fri, Jul 18, 2014 at 3:38 AM, Karl Rieb karl.r...@gmail.com wrote:
>> Any suggestions on what is going on or how to fix it?
>
> I'm not sure how much this will help, but one of the gocql users reported similar symptoms when upgrading to 2.0.6.
>
> We ended up applying a client side patch to address the issue, the details are here:
>
> https://github.com/gocql/gocql/pull/154
>
> That pull request also references the original bug report:
>
> https://github.com/gocql/gocql/issues/151
>
> Not sure how helpful this will be though.
Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9
On Fri, Jul 18, 2014 at 3:03 PM, Karl Rieb karl.r...@gmail.com wrote:
> Why is the protocol ID correct for some tables but not others?

I have no idea.

> Why does it work when I do a clean install on a new 2.0.x cluster?

I still have no idea.

> The bug seems to be on the Cassandra side and the clients seem to just be providing patches to these issues.

It was reported to the Cassandra list, but there was no answer, potentially because the query was sent to the wrong list, but I don't really know. Maybe it should have gone into Jira, but it's unclear as to whether this is a client or a server issue. In any case, it didn't look like the server behavior was going to change any time soon, so we just took the pragmatic approach in gocql and worked around the issue.

> I will post to the Datastax java driver mailing list and see if they are willing to add a patch.

That sounds like a good idea, seeing as the workaround has been tested before. Sorry to be of little help to you.