[jira] [Commented] (CASSANDRA-8499) Ensure SSTableWriter cleans up properly after failure

2015-03-26 Thread Benedict (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382492#comment-14382492 ]

Benedict commented on CASSANDRA-8499:

Affected actions are: truncate, major compaction, cleanup, scrub, upgrade. 
Repair looks to be fine.

 Ensure SSTableWriter cleans up properly after failure

 Key: CASSANDRA-8499
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8499
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.0.12, 2.1.3

 Attachments: 8499-20.txt, 8499-20v2, 8499-21.txt, 8499-21v2, 8499-21v3


 In 2.0 we do not free a bloom filter; in 2.1 we do not free a small piece of 
 off-heap memory used for writing compression metadata. In both we attempt to 
 flush the bloom filter despite having encountered an exception, which makes 
 the exception slow to propagate.
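
The failure mode described above can be illustrated with a minimal sketch. Every name here (WriterCleanupSketch, OffHeapResource, Writer, writeAll) is hypothetical and is not taken from the Cassandra code base or from the attached patches; the sketch only assumes that the filter memory has to be released explicitly and that a flush is pointless once a write has failed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names are Cassandra's actual classes. */
final class WriterCleanupSketch {

    /** Stand-in for a bloom filter or off-heap buffer that must be freed explicitly. */
    static final class OffHeapResource {
        private boolean freed;
        void free() { freed = true; }
        boolean isFreed() { return freed; }
    }

    static final class Writer {
        final OffHeapResource filter = new OffHeapResource();
        final List<String> events = new ArrayList<>();

        void append(String row) {
            if (row == null) throw new IllegalArgumentException("bad row");
            events.add("append");
        }

        /** Success path: flush the filter, then release its memory. */
        void finish() {
            events.add("flush");
            filter.free();
        }

        /** Failure path: release the memory without attempting a flush. */
        void abort() {
            events.add("abort");
            filter.free();
        }
    }

    /** Writes all rows; on failure aborts the writer and rethrows the original cause. */
    static void writeAll(Writer writer, List<String> rows) {
        try {
            for (String row : rows) writer.append(row);
        } catch (RuntimeException e) {
            writer.abort();
            throw e;
        }
        writer.finish();
    }
}
```

In this sketch the filter is freed on both paths, and the failure path skips the flush so the original exception reaches the caller without extra work in between.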



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8499) Ensure SSTableWriter cleans up properly after failure

2015-03-26 Thread Nick Bailey (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382090#comment-14382090 ]

Nick Bailey commented on CASSANDRA-8499:


So would this affect snapshot repairs, potentially causing an eventual OOM 
after continually running snapshot repairs on the cluster?



[jira] [Commented] (CASSANDRA-8499) Ensure SSTableWriter cleans up properly after failure

2015-01-06 Thread Marcus Eriksson (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1427#comment-1427 ]

Marcus Eriksson commented on CASSANDRA-8499:


+1  (remove the unused boolean closed in SequentialWriter on commit)

 Ensure SSTableWriter cleans up properly after failure

 Key: CASSANDRA-8499
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8499
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.0.12

 Attachments: 8499-20.txt, 8499-20v2, 8499-21.txt, 8499-21v2, 8499-21v3




[jira] [Commented] (CASSANDRA-8499) Ensure SSTableWriter cleans up properly after failure

2015-01-06 Thread Marcus Eriksson (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266284#comment-14266284 ]

Marcus Eriksson commented on CASSANDRA-8499:


The 2.1 v2 patch seems to double-close the files: first when we switch the 
writer, then when we call abort(). Running 
SSTableRewriterTest.testNumberOfFiles_abort() outputs this: 
WARN  15:43:17 close(81) failed, errno (9).
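
On Linux, errno 9 is EBADF, which is consistent with closing a descriptor that has already been closed. One common way to make a second close harmless is a guard flag, sketched below. This is an assumption for illustration only, not the approach taken in the attached patches, and IdempotentCloseSketch and FileHandle are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of an idempotent close; not the actual SequentialWriter. */
final class IdempotentCloseSketch {

    static final class FileHandle implements AutoCloseable {
        private final AtomicBoolean closed = new AtomicBoolean(false);
        private int releases; // how many times the underlying descriptor was released

        @Override
        public void close() {
            // Only the first call releases the descriptor; later calls are no-ops.
            if (closed.compareAndSet(false, true)) {
                releases++;
            }
        }

        int releases() { return releases; }
    }
}
```

With such a guard, closing once when the writer is switched and again from abort() would release the descriptor only once.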

 Ensure SSTableWriter cleans up properly after failure

 Key: CASSANDRA-8499
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8499
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.0.12

 Attachments: 8499-20.txt, 8499-20v2, 8499-21.txt, 8499-21v2




[jira] [Commented] (CASSANDRA-8499) Ensure SSTableWriter cleans up properly after failure

2015-01-05 Thread Benedict (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265086#comment-14265086 ]

Benedict commented on CASSANDRA-8499:

Good point. I've updated both versions to suppress warnings and to ensure the 
abort always runs to completion, even if an exception is thrown along the way. 
I think this makes the most sense for both versions, since we only abort in the 
face of an error.

 Ensure SSTableWriter cleans up properly after failure

 Key: CASSANDRA-8499
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8499
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.0.12

 Attachments: 8499-20.txt, 8499-20v2, 8499-21.txt, 8499-21v2




[jira] [Commented] (CASSANDRA-8499) Ensure SSTableWriter cleans up properly after failure

2014-12-22 Thread Marcus Eriksson (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14255596#comment-14255596 ]

Marcus Eriksson commented on CASSANDRA-8499:


In general LGTM, but at least for the 2.0 patch I think we should mimic current 
behavior as much as possible (i.e., in SSTableWriter.abort() we used 
closeQuietly, which only logs an error message if closing fails; now we throw 
an FSWriteError). 

Since we always* propagate the exception that caused abort() to be called, 
maybe it is better to just log any exceptions in abort() and let the 
cause of abort() be thrown all the way out? (*We should propagate the cause in 
doAntiCompaction as well.)
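
The suggestion above might look roughly like the following sketch. It assumes a closeQuietly-style helper that logs instead of throwing; QuietAbortSketch, runTask and the in-memory LOG list are hypothetical stand-ins and do not reflect the actual SSTableWriter.abort() or the project's logging.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: abort() only logs, so the original cause propagates. */
final class QuietAbortSketch {

    /** Stand-in for a logger, so the behavior is observable. */
    static final List<String> LOG = new ArrayList<>();

    /** Closes a resource and records any failure instead of throwing. */
    static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (Exception e) {
            LOG.add("error closing: " + e.getMessage());
        }
    }

    /** Attempts every cleanup step, even if an earlier one failed. */
    static void abort(List<Closeable> resources) {
        for (Closeable c : resources) closeQuietly(c);
    }

    /** Runs the work; on failure aborts and rethrows the original exception. */
    static void runTask(Runnable work, List<Closeable> resources) {
        try {
            work.run();
        } catch (RuntimeException cause) {
            abort(resources);
            throw cause;
        }
    }
}
```

Because abort() never throws in this sketch, a failure while closing one resource neither masks the exception that triggered the abort nor prevents the remaining resources from being closed.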

 Ensure SSTableWriter cleans up properly after failure

 Key: CASSANDRA-8499
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8499
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.0.12

 Attachments: 8499-20.txt, 8499-21.txt




[jira] [Commented] (CASSANDRA-8499) Ensure SSTableWriter cleans up properly after failure

2014-12-19 Thread Benedict (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253683#comment-14253683 ]

Benedict commented on CASSANDRA-8499:

Every time we fail to complete a flush or compaction we will leak the bloom 
filter data, so it is probably not _that_ uncommon. It's also a pretty small 
fix. But I'm pretty agnostic about it, really, since clusters seem to have been 
surviving with it for years.

 Ensure SSTableWriter cleans up properly after failure

 Key: CASSANDRA-8499
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8499
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.0.12

 Attachments: 8499-20.txt, 8499-21.txt

