[ https://issues.apache.org/jira/browse/CASSANDRA-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154387#comment-13154387 ]
Brandon Williams commented on CASSANDRA-3045:
---------------------------------------------
Specifically, FileStreamTask has a dangling semicolon on line 97 that ends the
'if' statement early, so the code it was meant to guard always runs. However,
removing the semicolon still leaves a secondary index check failing in
StreamingTransferTest. Here is a debug log of a working test:
{noformat}
INFO [Thread-3] 2011-11-18 17:36:04,039 SecondaryIndexManager.java (line 115)
Submitting index build of 626972746864617465, for data in
SSTableReader(path='build/test/cassandra/data/Keyspace1/Indexed1-hb-4-Data.db')
DEBUG [CompactionExecutor:1] 2011-11-18 17:36:04,041 Table.java (line 515)
Indexing row key1
DEBUG [CompactionExecutor:1] 2011-11-18 17:36:04,042 CollationController.java
(line 76) collectTimeOrderedData
DEBUG [CompactionExecutor:1] 2011-11-18 17:36:04,042 KeysIndex.java (line 100)
applying index row 3288498 in ColumnFamily(Indexed1.626972746864617465
[6b657931:false:0@1234,])
DEBUG [CompactionExecutor:1] 2011-11-18 17:36:04,043 SlabAllocator.java (line
105) 1 regions now allocated in org.apache.cassandra.utils.SlabAllocator@1716fa0
DEBUG [CompactionExecutor:1] 2011-11-18 17:36:04,043 Table.java (line 515)
Indexing row key3
DEBUG [CompactionExecutor:1] 2011-11-18 17:36:04,043 CollationController.java
(line 76) collectTimeOrderedData
DEBUG [CompactionExecutor:1] 2011-11-18 17:36:04,044 KeysIndex.java (line 100)
applying index row 3288500 in ColumnFamily(Indexed1.626972746864617465
[6b657933:false:0@1234,])
DEBUG [Thread-3] 2011-11-18 17:36:04,044 ColumnFamilyStore.java (line 671)
flush position is ReplayPosition(segmentId=1321659363183, position=354)
INFO [Thread-3] 2011-11-18 17:36:04,045 ColumnFamilyStore.java (line 685)
Enqueuing flush of Memtable-Indexed1.626972746864617465@5734522(38/47
serialized/live bytes, 2 ops)
INFO [FlushWriter:1] 2011-11-18 17:36:04,045 Memtable.java (line 239) Writing
Memtable-Indexed1.626972746864617465@5734522(38/47 serialized/live bytes, 2 ops)
DEBUG [FlushWriter:1] 2011-11-18 17:36:04,047 DatabaseDescriptor.java (line
783) expected data files size is 84; largest free partition has 19441123328
bytes free
INFO [FlushWriter:1] 2011-11-18 17:36:04,062 Memtable.java (line 275)
Completed flushing
build/test/cassandra/data/Keyspace1/Indexed1.626972746864617465-hb-2-Data.db
(154 bytes)
DEBUG [FlushWriter:1] 2011-11-18 17:36:04,063 IntervalNode.java (line 45)
Creating IntervalNode from [Interval(DecoratedKey(3288498, 0000000000322db2),
DecoratedKey(3288500, 0000000000322db4))]
DEBUG [FlushWriter:1] 2011-11-18 17:36:04,063 DataTracker.java (line 331)
adding build/test/cassandra/data/Keyspace1/Indexed1.626972746864617465-hb-2 to
list of files tracked for Keyspace1.Indexed1.626972746864617465
DEBUG [COMMIT-LOG-WRITER] 2011-11-18 17:36:04,064 CommitLog.java (line 459)
discard completed log segments for ReplayPosition(segmentId=1321659363183,
position=354), column family 1047.
DEBUG [COMMIT-LOG-WRITER] 2011-11-18 17:36:04,065 CommitLog.java (line 498) Not
safe to delete commit log
CommitLogSegment(/srv/encrypted/project/cassandra/build/test/cassandra/commitlog/CommitLog-1321659363183.log);
dirty is ; hasNext: false
INFO [Thread-3] 2011-11-18 17:36:04,065 SecondaryIndexManager.java (line 134)
Index build of 626972746864617465, complete
INFO [Thread-3] 2011-11-18 17:36:04,066 StreamInSession.java (line 162)
Finished streaming session 778312411854932 from /127.0.0.1
DEBUG [MiscStage:1] 2011-11-18 17:36:04,066 StreamReplyVerbHandler.java (line
47) Received StreamReply StreamReply(sessionId=778312411854932, file='',
action=SESSION_FINISHED)
{noformat}
And here is a failing one (with the CASSANDRA-3045 patches applied):
{noformat}
INFO [Thread-3] 2011-11-18 17:20:02,669 SecondaryIndexManager.java (line 117)
Submitting index build of 626972746864617465, for data in
SSTableReader(path='build/test/cassandra/data/Keyspace1/Indexed1-h-4-Data.db')
DEBUG [Streaming:1] 2011-11-18 17:20:02,669 MmappedSegmentedFile.java (line
139) All segments have been unmapped successfully
DEBUG [NonPeriodicTasks:1] 2011-11-18 17:20:02,671 FileUtils.java (line 51)
Deleting Indexed1-h-2-Statistics.db
DEBUG [NonPeriodicTasks:1] 2011-11-18 17:20:02,671 FileUtils.java (line 51)
Deleting Indexed1-h-2-Filter.db
DEBUG [NonPeriodicTasks:1] 2011-11-18 17:20:02,671 FileUtils.java (line 51)
Deleting Indexed1-h-2-Index.db
DEBUG [NonPeriodicTasks:1] 2011-11-18 17:20:02,672 SSTable.java (line 143)
Deleted build/test/cassandra/data/Keyspace1/Indexed1-h-2
DEBUG [CompactionExecutor:1] 2011-11-18 17:20:02,674 Table.java (line 516)
Indexing row key1
DEBUG [CompactionExecutor:1] 2011-11-18 17:20:02,674 CollationController.java
(line 74) collectTimeOrderedData
DEBUG [CompactionExecutor:1] 2011-11-18 17:20:02,675 KeysIndex.java (line 100)
applying index row 3288498 in ColumnFamily(Indexed1.626972746864617465
[6b657931:false:0@1234,])
DEBUG [CompactionExecutor:1] 2011-11-18 17:20:02,676 SlabAllocator.java (line
105) 1 regions now allocated in org.apache.cassandra.utils.SlabAllocator@1148603
DEBUG [CompactionExecutor:1] 2011-11-18 17:20:02,676 Table.java (line 516)
Indexing row key3
DEBUG [CompactionExecutor:1] 2011-11-18 17:20:02,676 CollationController.java
(line 74) collectTimeOrderedData
DEBUG [CompactionExecutor:1] 2011-11-18 17:20:02,677 KeysIndex.java (line 100)
applying index row 3288500 in ColumnFamily(Indexed1.626972746864617465
[6b657933:false:0@1234,])
DEBUG [Thread-3] 2011-11-18 17:20:02,677 ColumnFamilyStore.java (line 668)
flush position is ReplayPosition(segmentId=1321658401840, position=354)
INFO [Thread-3] 2011-11-18 17:20:02,678 ColumnFamilyStore.java (line 682)
Enqueuing flush of Memtable-Indexed1.626972746864617465@6972371(38/47
serialized/live bytes, 2 ops)
INFO [FlushWriter:1] 2011-11-18 17:20:02,679 Memtable.java (line 237) Writing
Memtable-Indexed1.626972746864617465@6972371(38/47 serialized/live bytes, 2 ops)
DEBUG [FlushWriter:1] 2011-11-18 17:20:02,679 DatabaseDescriptor.java (line
791) expected data files size is 84; largest free partition has 19418710016
bytes free
INFO [FlushWriter:1] 2011-11-18 17:20:02,690 Memtable.java (line 273)
Completed flushing
build/test/cassandra/data/Keyspace1/Indexed1.626972746864617465-h-2-Data.db
(154 bytes)
DEBUG [FlushWriter:1] 2011-11-18 17:20:02,690 IntervalNode.java (line 45)
Creating IntervalNode from [Interval(DecoratedKey(3288498, 0000000000322db2),
DecoratedKey(3288500, 0000000000322db4))]
DEBUG [FlushWriter:1] 2011-11-18 17:20:02,691 IntervalNode.java (line 45)
Creating IntervalNode from []
DEBUG [COMMIT-LOG-WRITER] 2011-11-18 17:20:02,691 CommitLog.java (line 458)
discard completed log segments for ReplayPosition(segmentId=1321658401840,
position=354), column family 1046.
DEBUG [COMMIT-LOG-WRITER] 2011-11-18 17:20:02,691 CommitLog.java (line 497) Not
safe to delete commit log
CommitLogSegment(/srv/encrypted/project/cassandra/build/test/cassandra/commitlog/CommitLog-1321658401840.log);
dirty is ; hasNext: false
INFO [Thread-3] 2011-11-18 17:20:02,691 SecondaryIndexManager.java (line 136)
Index build of 626972746864617465, complete
INFO [Thread-3] 2011-11-18 17:20:02,692 StreamInSession.java (line 179)
Finished streaming session 777351072779992 from /127.0.0.1
DEBUG [Streaming:1] 2011-11-18 17:20:02,692 StreamReplyVerbHandler.java (line
47) Received StreamReply StreamReply(sessionId=777351072779992, file='',
action=SESSION_FINISHED)
{noformat}
The suspicious difference is the extra "Creating IntervalNode from []" line in
the failing log, but I don't see how streaming could break that, especially
since the other debug lines indicate the correct data was sent.
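
To illustrate the dangling-semicolon problem mentioned at the top of this
comment, here is a minimal standalone sketch. It is not the FileStreamTask
code; the class and method names are hypothetical and only show the general
shape of the bug: a ';' directly after the condition becomes the entire body
of the 'if', so the block that follows runs regardless of the condition.

```java
// Hypothetical demo class, not taken from the Cassandra source tree.
class DanglingSemicolonDemo {
    // Buggy shape: the stray ';' is an empty statement that forms the whole
    // body of the 'if'. The braces below are just a plain block, so the
    // assignment runs whether or not 'condition' is true.
    static boolean buggy(boolean condition) {
        boolean fired = false;
        if (condition);
        {
            fired = true;
        }
        return fired;
    }

    // Fixed shape: without the semicolon the block is the body of the 'if'
    // and only runs when 'condition' is true.
    static boolean fixed(boolean condition) {
        boolean fired = false;
        if (condition) {
            fired = true;
        }
        return fired;
    }

    public static void main(String[] args) {
        System.out.println("buggy(false) = " + buggy(false)); // prints "buggy(false) = true"
        System.out.println("fixed(false) = " + fixed(false)); // prints "fixed(false) = false"
    }
}
```

javac accepts the buggy form without an error (compiling with -Xlint:empty
reports a warning for it), which is why this kind of mistake is easy to miss
in review.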
> Update ColumnFamilyOutputFormat to use new bulkload API
> -------------------------------------------------------
>
> Key: CASSANDRA-3045
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3045
> Project: Cassandra
> Issue Type: Improvement
> Components: Hadoop
> Reporter: Jonathan Ellis
> Assignee: Brandon Williams
> Priority: Minor
> Fix For: 1.1
>
> Attachments: 0001-Remove-gossip-SS-requirement-from-BulkLoader.txt,
> 0002-Allow-DD-loading-without-yaml.txt,
> 0003-hadoop-output-support-for-bulk-loading.txt
>
>
> The bulk loading interface added in CASSANDRA-1278 is a great fit for Hadoop
> jobs.