[ https://issues.apache.org/jira/browse/CASSANDRA-11656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259049#comment-15259049 ]
Wei Deng commented on CASSANDRA-11656:
--------------------------------------
I tested out the patch [~cnlwsu] provided in CASSANDRA-11655. However, I still
see some discrepancies like the following:
{noformat}
~/cassandra-trunk/tools/bin/sstabledump ma-15-big-Data.db
[
{
"partition" : {
"key" : [ "1" ],
"position" : 0
},
"rows" : [
{
"type" : "row",
"position" : 18,
"clustering" : [ "c1" ],
"liveness_info" : { "tstamp" : 1461646542601774 },
"cells" : [
{ "name" : "val0_int", "deletion_info" : { "tstamp" : 1461649343 },
"tstamp" : 1461649343000508
},
{ "name" : "val1_set_of_int", "deletion_info" : { "deletion_time" :
1461647295880443, "tstamp" : 1461647295 } },
{ "name" : "val1_set_of_int", "path" : [ "1" ], "deletion_info" : {
"tstamp" : 1461647320 },
"tstamp" : 1461647320160261
},
{ "name" : "val1_set_of_int", "path" : [ "10" ], "value" : "",
"tstamp" : 1461647295880444 },
{ "name" : "val1_set_of_int", "path" : [ "11" ], "value" : "",
"tstamp" : 1461647295880444 },
{ "name" : "val1_set_of_int", "path" : [ "12" ], "value" : "",
"tstamp" : 1461647295880444 }
]
},
{
"type" : "row",
"position" : 86,
"clustering" : [ "c2" ],
"deletion_info" : { "deletion_time" : 1461647588089843, "tstamp" :
1461647588 },
"cells" : [ ]
},
{
"type" : "row",
"position" : 101,
"clustering" : [ "c4" ],
"liveness_info" : { "tstamp" : 1461649635932899 },
"cells" : [ ]
},
{
"type" : "row",
"position" : 114,
"clustering" : [ "c5" ],
"liveness_info" : { "tstamp" : 1461650266651050, "ttl" : 60,
"expires_at" : 1461650326, "expired" : true },
"cells" : [
{ "name" : "val0_int", "value" : "500", "tstamp" : 1461650241403672 },
{ "name" : "val1_set_of_int", "deletion_info" : { "deletion_time" :
1461650241403671, "tstamp" : 1461650241 } },
{ "name" : "val1_set_of_int", "path" : [ "111" ], "value" : "",
"tstamp" : 1461650241403672 },
{ "name" : "val1_set_of_int", "path" : [ "222" ], "value" : "",
"tstamp" : 1461650241403672 },
{ "name" : "val1_set_of_int", "path" : [ "333" ], "value" : "",
"tstamp" : 1461650241403672 }
]
},
{
"type" : "row",
"position" : 180,
"clustering" : [ "c6" ],
"deletion_info" : { "deletion_time" : 1461708091029189, "tstamp" :
1461708091 },
"cells" : [ ]
}
]
}
]
{noformat}
IMHO, if we decide to use "tstamp" to represent the timestamp of a write
(whether it's a delete or a regular mutation), then it should always be in
microseconds since epoch (16 digits), and it should be consistent across
regular cells and tombstones. In the output above, the "tstamp" inside
"deletion_info" has only 10 digits, while the row-level "deletion_time" has 16.
In my view, "deletion_time" is a good short name for localDeletionTime (which
only guides compaction in GC'ing tombstones). As long as we are consistent
across the board and always use it to represent localDeletionTime, which has
only 10 digits (seconds since epoch), that works for me too.
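The convention proposed above can be checked mechanically. The following is a minimal sketch, not part of sstabledump: the helper names (`looks_like_micros`, `looks_like_seconds`, `check_deletion_info`) are hypothetical, and the digit-count heuristic is an assumption that holds only for present-day epoch values. The sample dictionaries are copied from the "deletion_info" entries in the output above.

```python
def looks_like_micros(value):
    """Heuristic: a 16-digit value is plausibly microseconds since epoch."""
    return len(str(value)) == 16


def looks_like_seconds(value):
    """Heuristic: a 10-digit value is plausibly seconds since epoch."""
    return len(str(value)) == 10


def check_deletion_info(info):
    """Return a list of problems for one "deletion_info" object.

    Assumed convention (from the comment above, not from sstabledump itself):
      "tstamp"        -> markedForDeleteAt, microseconds since epoch
      "deletion_time" -> localDeletionTime, seconds since epoch
    """
    problems = []
    tstamp = info.get("tstamp")
    deletion_time = info.get("deletion_time")
    if tstamp is not None and not looks_like_micros(tstamp):
        problems.append("tstamp %d is not in microseconds" % tstamp)
    if deletion_time is not None and not looks_like_seconds(deletion_time):
        problems.append("deletion_time %d is not in seconds" % deletion_time)
    return problems


# "deletion_info" of row c2 as printed above: the two units are swapped.
row_c2 = {"deletion_time": 1461647588089843, "tstamp": 1461647588}
# "deletion_info" of cell val0_int in row c1 as printed above.
cell_val0 = {"tstamp": 1461649343}
# What row c2 would look like under the proposed convention.
row_c2_expected = {"tstamp": 1461647588089843, "deletion_time": 1461647588}

for sample in (row_c2, cell_val0, row_c2_expected):
    print(sample, "->", check_deletion_info(sample) or "consistent")
```

Under this sketch, both samples taken from the dump are flagged and only the swapped-back version of row c2 passes.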
> sstabledump has inconsistency in deletion_time printout
> -------------------------------------------------------
>
> Key: CASSANDRA-11656
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11656
> Project: Cassandra
> Issue Type: Bug
> Components: Tools
> Reporter: Wei Deng
> Labels: Tools
>
> See the following output (note the deletion info under the second row):
> {noformat}
> [
> {
> "partition" : {
> "key" : [ "1" ],
> "position" : 0
> },
> "rows" : [
> {
> "type" : "row",
> "position" : 18,
> "clustering" : [ "c1" ],
> "liveness_info" : { "tstamp" : 1461646542601774 },
> "cells" : [
> { "name" : "val0_int", "deletion_time" : 1461647421, "tstamp" :
> 1461647421344759 },
> { "name" : "val1_set_of_int", "path" : [ "1" ], "deletion_time" :
> 1461647320, "tstamp" : 1461647320160261 },
> { "name" : "val1_set_of_int", "path" : [ "10" ], "value" : "",
> "tstamp" : 1461647295880444 },
> { "name" : "val1_set_of_int", "path" : [ "11" ], "value" : "",
> "tstamp" : 1461647295880444 },
> { "name" : "val1_set_of_int", "path" : [ "12" ], "value" : "",
> "tstamp" : 1461647295880444 }
> ]
> },
> {
> "type" : "row",
> "position" : 85,
> "clustering" : [ "c2" ],
> "deletion_info" : { "deletion_time" : 1461647588089843, "tstamp" :
> 1461647588 },
> "cells" : [ ]
> }
> ]
> }
> ]
> {noformat}
> To avoid confusion, we need to be consistent in printing out the
> DeletionTime object. By definition, markedForDeleteAt is in microseconds
> since epoch and marks the time when the "delete" mutation happened;
> localDeletionTime is in seconds since epoch and allows GC to collect the
> tombstone if the current epoch second is greater than localDeletionTime +
> gc_grace_seconds. I'm OK with using "tstamp" to represent markedForDeleteAt
> because markedForDeleteAt does represent this delete mutation's timestamp,
> but we need to be consistent everywhere.
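The two DeletionTime fields described in the issue text can be illustrated with a small sketch. It encodes only the rule as stated there (a tombstone becomes collectable once the current epoch second exceeds localDeletionTime + gc_grace_seconds); the conditions Cassandra actually applies during compaction may be broader. The function names are hypothetical, and 864000 is used as the gc_grace_seconds value because it is the usual table default (10 days).

```python
MICROS_PER_SECOND = 1000000


def micros_to_seconds(marked_for_delete_at):
    """Truncate a microsecond timestamp (markedForDeleteAt) to whole seconds."""
    return marked_for_delete_at // MICROS_PER_SECOND


def tombstone_is_gcable(local_deletion_time, gc_grace_seconds, now_seconds):
    """Rule from the issue text: collectable once now > localDeletionTime + grace."""
    return now_seconds > local_deletion_time + gc_grace_seconds


# Values taken from row c2 in the output quoted in the issue.
marked_for_delete_at = 1461647588089843   # microseconds since epoch
local_deletion_time = 1461647588          # seconds since epoch
gc_grace_seconds = 864000                 # assumed: 10-day table default

# Here the two fields describe the same moment at different resolutions.
print(micros_to_seconds(marked_for_delete_at) == local_deletion_time)

# Exactly at the boundary the tombstone is not yet collectable; one second later it is.
boundary = local_deletion_time + gc_grace_seconds
print(tombstone_is_gcable(local_deletion_time, gc_grace_seconds, boundary))
print(tombstone_is_gcable(local_deletion_time, gc_grace_seconds, boundary + 1))
```

Keeping the two values in separate, consistently named fields is what lets a reader of the dump apply this rule without guessing which unit a number is in.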
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)