[jira] [Commented] (CASSANDRA-6904) commitlog segments may not be archived after restart

2014-09-29 Thread Charles Cao (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152144#comment-14152144
 ] 

Charles Cao commented on CASSANDRA-6904:


Is there anybody reviewing the patch now?

> commitlog segments may not be archived after restart
> 
>
> Key: CASSANDRA-6904
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6904
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Sam Tunnicliffe
> Fix For: 2.0.11, 2.1.1
>
> Attachments: 2.0-6904.txt, 2.1-6904-v2.txt, 2.1-6904.txt
>
>
> commitlog segments are archived when they are full, so the current active 
> segment will not be archived on restart (and its contents will not be 
> available for pitr).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6904) commitlog segments may not be archived after restart

2014-09-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143552#comment-14143552
 ] 

Jonathan Ellis commented on CASSANDRA-6904:
---

+1 separate ticket

> commitlog segments may not be archived after restart
> 
>
> Key: CASSANDRA-6904
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6904
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Sam Tunnicliffe
> Fix For: 2.0.11, 2.1.1
>
> Attachments: 2.0-6904.txt, 2.1-6904.txt
>
>
> commitlog segments are archived when they are full, so the current active 
> segment will not be archived on restart (and its contents will not be 
> available for pitr).





[jira] [Commented] (CASSANDRA-6904) commitlog segments may not be archived after restart

2014-09-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143076#comment-14143076
 ] 

Benedict commented on CASSANDRA-6904:
-

bq. if that target file already exists we throw an exception and bail on startup

Good point. Given we now _expect_ this situation, though, we should probably 
change that behaviour. But agreed, the naming of the files during restore 
should have already solved the problem I was highlighting.

bq. On the native archive, I'd rather we handle any general rework of the 
archival process as a separate ticket if nobody objects.

WFM. But "rework" is much too strong a word IMO. We're most likely going to 
have to support the current archival option indefinitely (and really it's low 
cost to do so; the code is tiny). We just want to give people a saner option 
to move to. That pretty much means offering a yaml option and making the 
archive command a function CLS -> Runnable, with the Runnable doing a native 
copy (and perhaps supporting hard linking, which is slightly trickier, but 
still not hard and not a requisite) as well as the current option. So it's 
small enough that I'd also feel very comfortable including it here.
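
To make the "function CLS -> Runnable" idea concrete, here is a minimal sketch in Java. It is not taken from any attached patch. The class name SegmentArchivers, the option values "copy" and "hardlink", and the use of a plain Path to stand in for a commit log segment are all assumptions made for illustration. The existing external archive command is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;

/** Hypothetical sketch: an archiver maps a segment file to a task that archives it. */
final class SegmentArchivers
{
    private SegmentArchivers() {}

    /** Native copy into archiveDir; a segment already present there is not copied again. */
    static Function<Path, Runnable> nativeCopy(Path archiveDir)
    {
        return segment -> () -> {
            try
            {
                Files.createDirectories(archiveDir);
                Path target = archiveDir.resolve(segment.getFileName());
                if (!Files.exists(target))
                    Files.copy(segment, target);
            }
            catch (IOException e)
            {
                throw new UncheckedIOException(e);
            }
        };
    }

    /** Hard link into archiveDir; requires archiveDir to be on the same filesystem. */
    static Function<Path, Runnable> hardLink(Path archiveDir)
    {
        return segment -> () -> {
            try
            {
                Files.createDirectories(archiveDir);
                Path target = archiveDir.resolve(segment.getFileName());
                if (!Files.exists(target))
                    Files.createLink(target, segment);
            }
            catch (IOException e)
            {
                throw new UncheckedIOException(e);
            }
        };
    }

    /** Selects an archiver from an assumed yaml option value ("copy" or "hardlink"). */
    static Function<Path, Runnable> fromOption(String option, Path archiveDir)
    {
        switch (option)
        {
            case "copy":
                return nativeCopy(archiveDir);
            case "hardlink":
                return hardLink(archiveDir);
            default:
                throw new IllegalArgumentException("unknown archive option: " + option);
        }
    }
}
```

A caller would obtain the task with SegmentArchivers.fromOption("copy", archiveDir).apply(segmentPath) and hand it to whatever executor runs archive tasks. Skipping targets that already exist is one way to avoid backing up twice after repeated restarts, though it would also skip a partially written copy left by a crash.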






[jira] [Commented] (CASSANDRA-6904) commitlog segments may not be archived after restart

2014-09-22 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143071#comment-14143071
 ] 

Sam Tunnicliffe commented on CASSANDRA-6904:


The reason I didn't add header checking during replay is that I don't think 
replaying a segment twice is actually possible in 2.1.

CLA.maybeRestoreArchive uses the header of the archive file to create the 
destination file for restore and if that target file already exists we throw an 
exception and bail on startup. The external restore script can in theory modify 
the destination filename but only to something which conforms to the defined 
pattern, otherwise the files aren't picked up during replay. Because of this 
constraint, the only thing the restore script can really do is modify the part 
of the filename that maps to the segment id. However, if it does do that, the 
file will be skipped anyway as CLR.recover(File) creates its descriptor from 
the (modified) filename, meaning that when it actually replays it, checksums 
fail due to the mismatched descriptor id.

That said, it's pretty trivial to add a check for duplicate segments during 
recovery, so if I've missed something let me know & I'll attach an updated 
patch.
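
For reference, a check for duplicate segments during recovery could look roughly like the sketch below. It is an illustration only, not the contents of any attached patch. The class DuplicateSegmentFilter is hypothetical, and the file name shape CommitLog-<version>-<segment id>.log is an assumption about the "defined pattern" mentioned above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: drop files whose segment id has already been seen. */
final class DuplicateSegmentFilter
{
    // Assumed file name shape: CommitLog-<version>-<segment id>.log
    private static final Pattern NAME = Pattern.compile("CommitLog-(\\d+)-(\\d+)\\.log");

    private DuplicateSegmentFilter() {}

    /** Returns the segment id encoded in the file name, or -1 if the name does not match. */
    static long segmentId(String fileName)
    {
        Matcher m = NAME.matcher(fileName);
        return m.matches() ? Long.parseLong(m.group(2)) : -1L;
    }

    /** Keeps the first file seen for each segment id; drops non-matching names and repeats. */
    static List<String> withoutDuplicates(List<String> fileNames)
    {
        Set<Long> seen = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String name : fileNames)
        {
            long id = segmentId(name);
            if (id >= 0 && seen.add(id))
                result.add(name);
        }
        return result;
    }
}
```

This keys on the id in the file name only. A version that reads the id from the segment header instead would follow the same shape, with segmentId replaced by a header read.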

On the native archive, I'd rather we handle any general rework of the archival 
process as a separate ticket if nobody objects.






[jira] [Commented] (CASSANDRA-6904) commitlog segments may not be archived after restart

2014-09-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142983#comment-14142983
 ] 

Benedict commented on CASSANDRA-6904:
-

In 2.1 we should be using the header information to ensure we only replay each 
segment once.

I also think this is a good opportunity in 2.1 to drop in support for native 
archival, with which we could easily avoid backing up twice. The current 
situation feels a little clunky to me, since if we have a bug or other problem 
causing startup to crash we might repeatedly fill up the archive disk without 
the operator realising. That could be a follow up ticket, I'm fine either way, 
but it's really not a very challenging feature to insert and helps make this 
much more sane.






[jira] [Commented] (CASSANDRA-6904) commitlog segments may not be archived after restart

2014-09-18 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139527#comment-14139527
 ] 

Jonathan Ellis commented on CASSANDRA-6904:
---

"Archive everything on restart" is definitely a simpler solution.

> commitlog segments may not be archived after restart
> 
>
> Key: CASSANDRA-6904
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6904
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Sam Tunnicliffe
> Fix For: 2.1.1
>
>
> commitlog segments are archived when they are full, so the current active 
> segment will not be archived on restart (and its contents will not be 
> available for pitr).





[jira] [Commented] (CASSANDRA-6904) commitlog segments may not be archived after restart

2014-09-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138608#comment-14138608
 ] 

Benedict commented on CASSANDRA-6904:
-

As discussed on CASSANDRA-7965, we could simply archive all commit logs present 
on restart. This runs the risk of archiving the same commit logs repeatedly if 
startup keeps failing for some reason, so it may be undesirable. However, we can 
avoid replaying the same CLS twice in 2.1, since we retain the original id in 
the header.

> commitlog segments may not be archived after restart
> 
>
> Key: CASSANDRA-6904
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6904
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Jonathan Ellis
> Fix For: 2.1.1
>
>
> commitlog segments are archived when they are full, so the current active 
> segment will not be archived on restart (and its contents will not be 
> available for pitr).





[jira] [Commented] (CASSANDRA-6904) commitlog segments may not be archived after restart

2014-03-21 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943887#comment-13943887
 ] 

Vijay commented on CASSANDRA-6904:
--

Hi Jonathan, we can also encode it in the header, but either way SGTM.

> commitlog segments may not be archived after restart
> 
>
> Key: CASSANDRA-6904
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6904
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Jonathan Ellis
> Fix For: 2.0.7
>
>
> commitlog segments are archived when they are full, so the current active 
> segment will not be archived on restart (and its contents will not be 
> available for pitr).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6904) commitlog segments may not be archived after restart

2014-03-21 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943343#comment-13943343
 ] 

Benedict commented on CASSANDRA-6904:
-

To clarify, I think the absolute safest thing to do is to hard-link as soon as 
a CL is swapped into active, and then only recycle after an archive operation 
is successfully run on the hard-linked version. This should make sure we take 
care of not-yet-finished segments (and hence not-yet-archived) on startup as 
well, if we push all recycles through this path.
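
As a rough illustration of the hard-link scheme described above, the sketch below tracks segments pending archive with links in a separate directory. It is a sketch under stated assumptions, not code from Cassandra: the class PendingArchiveTracker and its method names are hypothetical, and the pending directory is assumed to live on the same filesystem as the commit log directory.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a hard link per segment marks "not yet archived". */
final class PendingArchiveTracker
{
    // Assumed to be on the same filesystem as the commit log directory.
    private final Path pendingDir;

    PendingArchiveTracker(Path pendingDir) throws IOException
    {
        this.pendingDir = Files.createDirectories(pendingDir);
    }

    /** Call when a segment is swapped into active: link it so it is tracked from the start. */
    void onActivated(Path segment) throws IOException
    {
        Path link = pendingDir.resolve(segment.getFileName());
        if (!Files.exists(link))
            Files.createLink(link, segment);
    }

    /** Call only after the archive operation on the linked file has succeeded. */
    void onArchived(Path segment) throws IOException
    {
        Files.deleteIfExists(pendingDir.resolve(segment.getFileName()));
    }

    /** A segment may be recycled only once its pending link is gone. */
    boolean canRecycle(Path segment)
    {
        return !Files.exists(pendingDir.resolve(segment.getFileName()));
    }

    /** On startup, anything still linked here was never archived, finished or not. */
    List<Path> pendingOnStartup() throws IOException
    {
        List<Path> pending = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(pendingDir))
        {
            for (Path p : stream)
                pending.add(p);
        }
        return pending;
    }
}
```

Because a hard link shares the underlying file with the original, overwriting a segment in place would also change the linked copy, which is why recycling has to wait for canRecycle to return true.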






[jira] [Commented] (CASSANDRA-6904) commitlog segments may not be archived after restart

2014-03-21 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943336#comment-13943336
 ] 

Jonathan Ellis commented on CASSANDRA-6904:
---

[~benedict] suggests that we could use hard links to track 
segments-pending-archive.  Since we don't recycle segments until archive is 
complete this should be fine.



