[
https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030425#comment-13030425
]
Jackson Chung commented on CASSANDRA-2401:
------------------------------------------
Here is 1 way that i could 100% reproduce the issue with data being null:
Need 2 nodes, 1 is gonna to autobootstrap to the other. Also assuming
completely clean start (blow up the /var/lib/cassandra/ or where ever data are
stored
i am also using brisk beta to test
to start:
node-A:
1) get brisk
2) start brisk with -t (jobtracker)
3) run a simple hive query :
3a) bin/brisk hive
3b) create table foo (bar INT);
3c) select count(*) from foo;
3d) exit;
4) every thing should be so far so good, let the brisk node continue to be up
node-B:
1) get brisk
2) modify the resources/cassandra/conf/cassandra.yaml:
2a) to enable autobootstrap.
2b) point seeds to node-A
3) put a sleep or break point in o.a.c.service.StorageService.joinTokenRing
method, right after "Map<InetAddress, Double> loadinfo =
StorageLoadBalancer.instance.getLoadInfo();" (personal preference: log a sleep
line, add a thread.sleep(a_long_time))
4) start brisk with -t on node-B
5) wait till the log line "Joining: getting bootstrap token" , it should now
reaches your break point (or zz)
6) crash the jvm (personal preference: kill -9 <pid>)
back to node-A
1) exit the jvm (BriskDaemon) "normally" (kill <pid>)
2) start the brisk node again (with -t):
log from node-A:
{noformat}
INFO 23:25:00,213 Logging initialized
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/home/riptano/work/brisk/resources/cassandra/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/home/riptano/work/brisk/resources/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
INFO 23:25:00,235 Heap size: 510263296/511311872
INFO 23:25:00,237 JNA not found. Native methods will be disabled.
INFO 23:25:00,263 Loading settings from
file:/home/riptano/work/brisk/resources/cassandra/conf/cassandra.yaml
INFO 23:25:00,470 DiskAccessMode 'auto' determined to be mmap, indexAccessMode
is mmap
INFO 23:25:00,496 Detected Hadoop trackers are enabled, setting my DC to Brisk
INFO 23:25:00,696 Global memtable threshold is enabled at 162MB
INFO 23:25:00,846 Opening /var/lib/cassandra/data/system/IndexInfo-f-1
INFO 23:25:00,912 Opening /var/lib/cassandra/data/system/Schema-f-2
INFO 23:25:00,926 Opening /var/lib/cassandra/data/system/Schema-f-1
INFO 23:25:00,951 Opening /var/lib/cassandra/data/system/Migrations-f-2
INFO 23:25:00,954 Opening /var/lib/cassandra/data/system/Migrations-f-1
INFO 23:25:00,970 Opening /var/lib/cassandra/data/system/LocationInfo-f-2
INFO 23:25:00,989 Opening /var/lib/cassandra/data/system/LocationInfo-f-1
INFO 23:25:01,089 Loading schema version c4fd2440-7900-11e0-0000-ba846f9adcf7
INFO 23:25:01,499 Creating new commitlog segment
/var/lib/cassandra/commitlog/CommitLog-1304810701499.log
INFO 23:25:01,530 Replaying
/var/lib/cassandra/commitlog/CommitLog-1304810455288.log
INFO 23:25:01,675 Finished reading
/var/lib/cassandra/commitlog/CommitLog-1304810455288.log
INFO 23:25:01,730 Enqueuing flush of Memtable-MetaStore@102170028(869/1086
serialized/live bytes, 3 ops)
INFO 23:25:01,735 Writing Memtable-MetaStore@102170028(869/1086
serialized/live bytes, 3 ops)
INFO 23:25:01,743 Enqueuing flush of
Memtable-sblocks@1075051425(3044096/3805120 serialized/live bytes, 17 ops)
INFO 23:25:01,747 Enqueuing flush of Memtable-inode.path@780298059(2848/3560
serialized/live bytes, 59 ops)
INFO 23:25:01,748 Enqueuing flush of
Memtable-inode.sentinel@1934329031(2848/3560 serialized/live bytes, 59 ops)
INFO 23:25:01,748 Enqueuing flush of Memtable-inode@1660575731(6393/7991
serialized/live bytes, 134 ops)
INFO 23:25:01,821 Completed flushing
/var/lib/cassandra/data/HiveMetaStore/MetaStore-f-1-Data.db (989 bytes)
INFO 23:25:01,832 Writing Memtable-sblocks@1075051425(3044096/3805120
serialized/live bytes, 17 ops)
INFO 23:25:01,927 Completed flushing
/var/lib/cassandra/data/cfs/sblocks-f-1-Data.db (3045448 bytes)
INFO 23:25:01,928 Writing Memtable-inode.path@780298059(2848/3560
serialized/live bytes, 59 ops)
INFO 23:25:01,968 Completed flushing
/var/lib/cassandra/data/cfs/inode.path-f-1-Data.db (5346 bytes)
INFO 23:25:01,969 Writing Memtable-inode.sentinel@1934329031(2848/3560
serialized/live bytes, 59 ops)
INFO 23:25:02,035 Completed flushing
/var/lib/cassandra/data/cfs/inode.sentinel-f-1-Data.db (1735 bytes)
INFO 23:25:02,036 Writing Memtable-inode@1660575731(6393/7991 serialized/live
bytes, 134 ops)
INFO 23:25:02,085 Completed flushing
/var/lib/cassandra/data/cfs/inode-f-1-Data.db (8582 bytes)
INFO 23:25:02,087 Log replay complete
INFO 23:25:02,092 Cassandra version: 0.8.0-beta2-SNAPSHOT
INFO 23:25:02,092 Thrift API version: 19.10.0
INFO 23:25:02,092 Loading persisted ring state
INFO 23:25:02,092 load token size: 0
INFO 23:25:02,093 Starting up server gossip
INFO 23:25:02,104 Enqueuing flush of Memtable-LocationInfo@22262475(29/36
serialized/live bytes, 1 ops)
INFO 23:25:02,105 Writing Memtable-LocationInfo@22262475(29/36 serialized/live
bytes, 1 ops)
INFO 23:25:02,127 Completed flushing
/var/lib/cassandra/data/system/LocationInfo-f-3-Data.db (80 bytes)
INFO 23:25:02,149 Starting Messaging Service on port 7000
INFO 23:25:02,172 Using saved token 152036150612811635197207268153837644139
INFO 23:25:02,173 Enqueuing flush of Memtable-LocationInfo@1977026981(53/66
serialized/live bytes, 2 ops)
INFO 23:25:02,174 Writing Memtable-LocationInfo@1977026981(53/66
serialized/live bytes, 2 ops)
INFO 23:25:02,190 Completed flushing
/var/lib/cassandra/data/system/LocationInfo-f-4-Data.db (163 bytes)
INFO 23:25:02,193 Compacting Major:
[SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo-f-2-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo-f-1-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo-f-3-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo-f-4-Data.db')]
INFO 23:25:02,196 Will not load MX4J, mx4j-tools.jar is not in the classpath
INFO 23:25:02,196 Starting up Hadoop trackers
INFO 23:25:02,197 Waiting for gossip to start
INFO 23:25:02,225 Major@1830423861(system, LocationInfo, 438/741) now
compacting at 16777 bytes/ms.
INFO 23:25:02,257 Compacted to
/var/lib/cassandra/data/system/LocationInfo-tmp-f-5-Data.db. 741 to 447 (~60%
of original) bytes for 3 keys. Time: 64ms.
INFO 23:25:07,272 Chose seed 10.179.96.212 as jobtracker
WARN 23:25:09,331 Metrics system not started: Cannot locate configuration:
tried hadoop-metrics2-jobtracker.properties, hadoop-metrics2.properties
INFO 23:25:09,994 Chose seed 10.179.96.212 as jobtracker
INFO 23:25:10,139 Updating the current master key for generating delegation
tokens
INFO 23:25:10,143 Starting expired delegation token remover thread,
tokenRemoverScanInterval=60 min(s)
INFO 23:25:10,143 Scheduler configured with (memSizeForMapSlotOnJT,
memSizeForReduceSlotOnJT, limitMaxMemForMapTasks, limitMaxMemForReduceTasks)
(-1, -1, -1, -1)
INFO 23:25:10,144 Updating the current master key for generating delegation
tokens
INFO 23:25:10,145 Refreshing hosts (include/exclude) list
INFO 23:25:10,223 Starting jobtracker with owner as riptano
INFO 23:25:10,245 Starting SocketReader
INFO 23:25:10,374 Logging to
org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
INFO 23:25:10,623 Added global filtersafety
(class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
INFO 23:25:10,673 Port returned by webServer.getConnectors()[0].getLocalPort()
before open() is -1. Opening the listener on 50030
INFO 23:25:10,673 listener.getLocalPort() returned 50030
webServer.getConnectors()[0].getLocalPort() returned 50030
INFO 23:25:10,674 Jetty bound to port 50030
INFO 23:25:10,674 jetty-6.1.21
INFO 23:25:11,140 Started [email protected]:50030
INFO 23:25:11,147 JobTracker up at: 8012
INFO 23:25:11,147 JobTracker webserver: 50030
WARN 23:25:11,276 Incorrect permissions on
cassandra://localhost:9160/tmp/hadoop-riptano/mapred/system. Setting it to
rwx------
ERROR 23:25:11,321 Fatal exception in thread Thread[ReadStage:4,5,main]
java.lang.AssertionError: No data found for
NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17]) in
DecoratedKey(55249227080490826413412398468829851220,
3165333533353736613164333836353061346636333465656437326131353939):QueryPath(columnFamilyName='inode',
superColumnName='null', columnName='null') (original filter
NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17])) from
expression 'inode.73656e74696e656c EQ 78'
at
org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1513)
at
org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:46)
at
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
INFO 23:25:20,059 Deleted /var/lib/cassandra/data/system/LocationInfo-f-3
INFO 23:25:20,060 Deleted /var/lib/cassandra/data/system/LocationInfo-f-4
INFO 23:25:20,576 Deleted /var/lib/cassandra/data/system/LocationInfo-f-1
INFO 23:25:20,577 Deleted /var/lib/cassandra/data/system/LocationInfo-f-2
INFO 23:25:21,297 problem cleaning system directory:
cassandra://localhost:9160/tmp/hadoop-riptano/mapred/system
java.io.IOException: TimedOutException()
at
org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listDeepSubPaths(CassandraFileSystemThriftStore.java:523)
at
org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listSubPaths(CassandraFileSystemThriftStore.java:529)
at
org.apache.cassandra.hadoop.fs.CassandraFileSystem.listStatus(CassandraFileSystem.java:171)
at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2374)
at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2174)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:303)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:294)
at
org.apache.cassandra.hadoop.trackers.TrackerInitializer$1.run(TrackerInitializer.java:93)
at java.lang.Thread.run(Thread.java:662)
Caused by: TimedOutException()
at
org.apache.cassandra.thrift.CassandraServer.get_indexed_slices(CassandraServer.java:673)
at
org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listDeepSubPaths(CassandraFileSystemThriftStore.java:506)
... 8 more
WARN 23:25:31,300 Incorrect permissions on
cassandra://localhost:9160/tmp/hadoop-riptano/mapred/system. Setting it to
rwx------
ERROR 23:25:31,315 Fatal exception in thread Thread[ReadStage:6,5,main]
java.lang.AssertionError: No data found for
NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17]) in
DecoratedKey(55249227080490826413412398468829851220,
3165333533353736613164333836353061346636333465656437326131353939):QueryPath(columnFamilyName='inode',
superColumnName='null', columnName='null') (original filter
NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17])) from
expression 'inode.73656e74696e656c EQ 78'
at
org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1513)
at
org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:46)
at
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
INFO 23:25:41,303 problem cleaning system directory:
cassandra://localhost:9160/tmp/hadoop-riptano/mapred/system
java.io.IOException: TimedOutException()
at
org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listDeepSubPaths(CassandraFileSystemThriftStore.java:523)
at
org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listSubPaths(CassandraFileSystemThriftStore.java:529)
at
org.apache.cassandra.hadoop.fs.CassandraFileSystem.listStatus(CassandraFileSystem.java:171)
at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2374)
at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2174)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:303)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:294)
at
org.apache.cassandra.hadoop.trackers.TrackerInitializer$1.run(TrackerInitializer.java:93)
at java.lang.Thread.run(Thread.java:662)
Caused by: TimedOutException()
at
org.apache.cassandra.thrift.CassandraServer.get_indexed_slices(CassandraServer.java:673)
at
org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listDeepSubPaths(CassandraFileSystemThriftStore.java:506)
... 8 more
WARN 23:25:51,308 Incorrect permissions on
cassandra://localhost:9160/tmp/hadoop-riptano/mapred/system. Setting it to
rwx------
ERROR 23:25:51,321 Fatal exception in thread Thread[ReadStage:8,5,main]
java.lang.AssertionError: No data found for
NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17]) in
DecoratedKey(55249227080490826413412398468829851220,
3165333533353736613164333836353061346636333465656437326131353939):QueryPath(columnFamilyName='inode',
superColumnName='null', columnName='null') (original filter
NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17])) from
expression 'inode.73656e74696e656c EQ 78'
at
org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1513)
at
org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:46)
at
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
{noformat}
> getColumnFamily() return null, which is not checked in ColumnFamilyStore.java
> scan() method, causing Timeout Exception in query
> -------------------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-2401
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2401
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.0
> Environment: Hector 0.7.0-28, Cassandra 0.7.4, Windows 7, Eclipse
> Reporter: Tey Kar Shiang
> Assignee: Jonathan Ellis
> Fix For: 0.7.6
>
> Attachments: 2401-v2.txt, 2401-v3.txt, 2401.txt
>
>
> ColumnFamilyStore.java, line near 1680, "ColumnFamily data =
> getColumnFamily(new QueryFilter(dk, path, firstFilter))", the data is
> returned null, causing NULL exception in "satisfies(data, clause, primary)"
> which is not captured. The callback got timeout and return a Timeout
> exception to Hector.
> The data is empty, as I traced, I have the the columns Count as 0 in
> removeDeletedCF(), which return the null there. (I am new and trying to
> understand the logics around still). Instead of crash to NULL, could we
> bypass the data?
> About my test:
> A stress-test program to add, modify and delete data to keyspace. I have 30
> threads simulate concurrent users to perform the actions above, and do a
> query to all rows periodically. I have Column Family with rows (as File) and
> columns as index (e.g. userID, fileType).
> No issue on the first day of test, and stopped for 3 days. I restart the test
> on 4th day, 1 of the users failed to query the files (timeout exception
> received). Most of the users are still okay with the query.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira