Re: Issues getting JNA to work correctly under centos 5.5 using cassandra 0.7.0-rc1 and JNA 2.7.3
I checked and /etc/security/limits.conf on redhat supports zero (0) to mean unlimited. Here is the sample from the man page. Notice the soft core entry.

EXAMPLES These are some example lines which might be specified in /etc/security/limits.conf.
* soft core 0
* hard rss 1
@student hard nproc 20
@faculty soft nproc 20
@faculty hard nproc 50
ftp hard nproc 0
@student - maxlogins 4

On Mon, Nov 29, 2010 at 6:51 AM, Jason Pell jasonmp...@gmail.com wrote: Ok, that's a good point, I will check - I am not sure. Sent from my iPhone

On Nov 29, 2010, at 5:53, Tyler Hobbs ty...@riptano.com wrote: I'm not familiar with ulimit on RedHat systems, but are you sure you have ulimit set correctly? Did you set it to '0' or 'unlimited'? I ask because on a Debian system, I get this: tho...@~ $ ulimit -l unlimited Where you said that you got back '0'. - Tyler

On Sun, Nov 28, 2010 at 1:15 AM, Jason Pell ja...@pellcorp.com wrote: Hi, I have selinux disabled via /etc/sysconfig/selinux already. But I did as you suggested anyway, and even restarted the whole machine again too, and still no difference. Do you know if there is a way to discover exactly what this error means? Thanks, Jason

On Sat, Nov 27, 2010 at 3:59 AM, Nate McCall n...@riptano.com wrote: This might be an issue with selinux. You can try this quickly to temporarily disable selinux enforcement: /usr/sbin/setenforce 0 (as root) and then start cassandra as your user.

On Fri, Nov 26, 2010 at 1:00 AM, Jason Pell jasonmp...@gmail.com wrote: I restarted the box :-) so it's well and truly set. Sent from my iPhone

On Nov 26, 2010, at 17:57, Brandon Williams dri...@gmail.com wrote: On Thu, Nov 25, 2010 at 10:02 PM, Jason Pell ja...@pellcorp.com wrote: Hi, I have set the memlock limit to unlimited in /etc/security/limits.conf [devel...@localhost apache-cassandra-0.7.0-rc1]$ ulimit -l 0 Running as a non-root user gets me an 'Unknown mlockall error 1'. Have you tried logging out and back in after changing limits.conf? -Brandon
Booting Cassandra v0.7.0 on Windows: rename failed
Hi, Recently I downloaded Cassandra v0.7.0 rc1. When I try to run cassandra it ends with the following logging:

INFO 09:17:30,044 Enqueuing flush of memtable-locationi...@839514767(643 bytes, 12 operations)
INFO 09:17:30,045 Writing memtable-locationi...@839514767(643 bytes, 12 operations)
ERROR 09:17:30,233 Fatal exception in thread Thread[FlushWriter:1,5,main]
java.io.IOError: java.io.IOException: rename failed of d:\cassandra\data\system\LocationInfo-e-1-Data.db
at org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:214)
at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:184)
at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:167)
at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:161)
at org.apache.cassandra.db.Memtable.access$000(Memtable.java:49)
at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:174)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: rename failed of d:\cassandra\data\system\LocationInfo-e-1-Data.db
at org.apache.cassandra.utils.FBUtilities.renameWithConfirm(FBUtilities.java:359)
at org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:210)
... 12 more

Operating system is Windows 7. Tried it also on Windows 2003 server.
I only modified a few (necessary) path settings in cassandra.yaml:
commitlog_directory: d:/cassandra/commitlog
data_file_directories:
    - d:/cassandra/data
saved_caches_directory: d:/cassandra/saved_caches
Does anybody know what I'm doing wrong? Regards, Ramon
word_count example fails in multi-node configuration
Hi guys, I am trying to run the word_count example from the contrib directory (0.7 beta 3 and 0.7.0 rc 1). It works fine in a single-node configuration, but fails with 2+ nodes. It fails in the assert statement which caused problems before (https://issues.apache.org/jira/browse/CASSANDRA-1700). Here's the simple ring I have, and the error messages.
---
Address    Status  State   Load      Owns    Token
                                             143797990709940316224804537595633718982
127.0.0.2  Up      Normal  40.2 KB   51.38%  61078635599166706937511052402724559481
127.0.0.1  Up      Normal  36.01 KB  48.62%  143797990709940316224804537595633718982
---
[SERVER SIDE]
ERROR 17:39:57,098 Fatal exception in thread Thread[ReadStage:4,5,main]
java.lang.AssertionError: (143797990709940316224804537595633718982,61078635599166706937511052402724559481]
at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1273)
at org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:48)
at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
---
[CLIENT_SIDE]
java.lang.RuntimeException: org.apache.thrift.TApplicationException: Internal error processing get_range_slices
at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:277)
at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:292)
at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:189)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:148)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: org.apache.thrift.TApplicationException: Internal error processing get_range_slices
at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:724)
at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:704)
at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:255)
... 11 more
---
It looks like the tokens used in ColumnFamilySplits (ColumnFamilyInputFormat.java) are on wrapping ranges (left_token > right_token). Any ideas how to fix this? -- Regards, Roman
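For what it's worth, the usual way to handle a wrapping range (one where left_token > right_token, i.e. the range crosses the minimum token) is to unwrap it into two non-wrapping pieces before issuing the range query. The following is a hypothetical sketch for the RandomPartitioner's token space, not the actual Cassandra fix; class and method names are invented for illustration:

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class TokenRanges {
    // RandomPartitioner token space: [0, 2^127)
    static final BigInteger MIN = BigInteger.ZERO;
    static final BigInteger MAX = BigInteger.ONE.shiftLeft(127);

    /** Split a (left, right] token range into non-wrapping pieces. */
    static List<BigInteger[]> unwrap(BigInteger left, BigInteger right) {
        List<BigInteger[]> out = new ArrayList<>();
        if (left.compareTo(right) < 0) {
            out.add(new BigInteger[] { left, right });   // already non-wrapping
        } else {
            out.add(new BigInteger[] { left, MAX });     // upper piece, up to the end of the ring
            out.add(new BigInteger[] { MIN, right });    // lower piece, from the start of the ring
        }
        return out;
    }

    public static void main(String[] args) {
        // The wrapping range from the two-node ring above: left > right.
        BigInteger left = new BigInteger("143797990709940316224804537595633718982");
        BigInteger right = new BigInteger("61078635599166706937511052402724559481");
        System.out.println(unwrap(left, right).size()); // 2
    }
}
```

The idea is simply that each piece then satisfies the non-wrapping invariant the getRangeSlice assertion checks.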
Cassandra 0.7 beta 3 outOfMemory (OOM)
Hi community, during my tests I had several OOM crashes. Getting some hints to find the problem would be nice. At first cassandra crashed after about 45 min of an insert test script. During the following tests the time to OOM got shorter, until it started to crash even in idle mode. Here are the facts:
- cassandra 0.7 beta 3
- using lucandra to index about 3 million files, ~1kb data each
- inserting with one client to one cassandra node at about 200 files/s
- cassandra data files for this keyspace grow to about 20 GB
- the keyspace only contains the two lucandra-specific CFs
Cluster:
- cassandra single node on windows 32bit, Xeon 2.5 GHz, 4GB RAM
- java jre 1.6.0_22
- heap space first 1 GB, later increased to 1.3 GB
Cassandra.yaml: default + reduced binary_memtable_throughput_in_mb to 128
CFs: default + reduced min_compaction_threshold: 4, max_compaction_threshold: 8

I think the problem always appears during compaction, and perhaps it is a result of large rows (some about 170mb). Are there more options we could use to work with little memory? Is it a problem of compaction, and how can we avoid it? Slower inserts? More memory? Even lower memtable_throughput or in_memory_compaction_limit? Continuous manual major compaction? I've read http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors
- row_size should be fixed since 0.7, and 200mb is still far away from 2gb
- only the key cache is used, a little bit (3600/2)
- after a lot of writes cassandra crashes even in idle mode
- memtable size was reduced and there are only 2 CFs
Several heap dumps in MAT show 60-99% heap usage by the compaction thread.
Here is some log extract:
INFO [CompactionExecutor:1] 2010-11-26 14:18:18,593 CompactionIterator.java (line 134) Compacting large row 6650325572717566efbfbf44545241434b53efbfbf31 (172967291 bytes) incrementally
INFO [ScheduledTasks:1] 2010-11-26 14:18:41,421 GCInspector.java (line 133) GC for ParNew: 365 ms, 54551328 reclaimed leaving 459496840 used; max is 1450442752
INFO [ScheduledTasks:1] 2010-11-26 14:18:42,437 GCInspector.java (line 133) GC for ParNew: 226 ms, 12469104 reclaimed leaving 554506776 used; max is 1450442752
INFO [ScheduledTasks:1] 2010-11-26 14:18:43,453 GCInspector.java (line 133) GC for ParNew: 224 ms, 12777840 reclaimed leaving 649207976 used; max is 1450442752
INFO [ScheduledTasks:1] 2010-11-26 14:18:44,468 GCInspector.java (line 133) GC for ParNew: 225 ms, 12564144 reclaimed leaving 744122872 used; max is 1450442752
INFO [ScheduledTasks:1] 2010-11-26 14:18:45,468 GCInspector.java (line 133) GC for ParNew: 222 ms, 16020328 reclaimed leaving 835581584 used; max is 1450442752
INFO [ScheduledTasks:1] 2010-11-26 14:18:46,468 GCInspector.java (line 133) GC for ParNew: 226 ms, 12697912 reclaimed leaving 930362712 used; max is 1450442752
INFO [ScheduledTasks:1] 2010-11-26 14:18:47,468 GCInspector.java (line 133) GC for ParNew: 227 ms, 15816872 reclaimed leaving 1022026288 used; max is 1450442752
INFO [ScheduledTasks:1] 2010-11-26 14:18:48,484 GCInspector.java (line 133) GC for ParNew: 258 ms, 12746584 reclaimed leaving 1116758744 used; max is 1450442752
INFO [ScheduledTasks:1] 2010-11-26 14:18:49,484 GCInspector.java (line 133) GC for ParNew: 257 ms, 12802608 reclaimed leaving 1211435176 used; max is 1450442752
INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java (line 133) GC for ConcurrentMarkSweep: 4188 ms, 271308512 reclaimed leaving 1047605704 used; max is 1450442752
INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java (line 153) Pool Name            Active  Pending
INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java (line 160) ResponseStage        0       0
INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java (line 160) ReadStage            0       0
INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java (line 160) ReadRepair           0       0
INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java (line 160) MutationStage        0       0
INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java (line 160) GossipStage          0       0
INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java (line 160) AntientropyStage     0       0
INFO [ScheduledTasks:1] 2010-11-26 14:18:54,562 GCInspector.java (line 160) MigrationStage       0       0
INFO [ScheduledTasks:1] 2010-11-26 14:18:54,562 GCInspector.java (line 160) StreamStage          0       0
INFO [ScheduledTasks:1] 2010-11-26 14:18:54,562 GCInspector.java (line 160) MemtablePostFlusher  0       0
INFO [ScheduledTasks:1] 2010-11-26 14:18:54,562 GCInspector.java (line 160) FlushWriter          0       0
INFO [ScheduledTasks:1] 2010-11-26 14:18:54,562 GCInspector.java (line 160) MiscStage
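For this kind of compaction-driven memory pressure, the 0.7-era configuration exposes a few knobs worth experimenting with. The cassandra.yaml fragment below is only a sketch: the setting names follow the 0.7 default config, but the values are illustrative guesses to tune against your own heap, not recommendations.

```
# Illustrative values only -- tune against your heap.
in_memory_compaction_limit_in_mb: 32    # rows larger than this are compacted incrementally
flush_largest_memtables_at: 0.75        # emergency-flush memtables at 75% heap usage
reduce_cache_sizes_at: 0.85             # shrink caches under memory pressure
reduce_cache_capacity_to: 0.6
```

Lowering in_memory_compaction_limit_in_mb in particular should keep more of a 170mb row out of the heap during compaction.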
The key list of multiget_slice's parameter has been changed unexpectedly.
Hi everyone, we have been working on a Java product based on Cassandra since 0.5. Cassandra made a very big change in 0.7 beta 2, which turned all byte arrays into ByteBuffers, and we found a problem which confuses us a lot. Here are the details of what happened: The multiget_slice method in Cassandra.Iface requires a list of keys for a multi get slice query, so we believed we have to give every individual key to get the data we need, and according to the Java doc we will get a Map result, which uses a ByteBuffer as key and ColumnOrSuperColumn as value. We assumed the ByteBuffer keys of the result map are the keys we sent in the query; in that case, given a key list A, B, C, the result map should look like:
Key of A - Data of A
Key of B - Data of B
Key of C - Data of C
So to get the data of A from the result map, all we need to do is perform resultMap.get(A). But we have a problem here: the result map's keys are something else; they are not the keys we sent. The values are exactly the data we need, but it is very troublesome that we are unable to find the corresponding data by key. We guessed that the key ByteBuffers have been changed during the query process, due to call by reference, and we found this in the server's source code, in org.apache.cassandra.thrift.CassandraServer's getSlice method, which looks like the key is being changed unexpectedly:
columnFamilies.get(StorageService.getPartitioner().decorateKey(command.key));
It looks like the key is decorated for some purpose, changed in the process due to the nature of ByteBuffer, and then the decorated key is used as the key in the result map:
columnFamiliesMap.put(command.key, thriftifiedColumns);
Have we misinterpreted the Java doc API, or is this a bug?
Re: The key list of multiget_slice's parameter has been changed unexpectedly.
You should start by trying 0.7 RC1; some bugs with the use of ByteBuffers have been corrected since beta2. If you still have the problem, then it's likely a bug: the ByteBuffer should not be changed from under you. If it still doesn't work with RC1, it would be very helpful if you could provide a simple script that reproduces the behavior you describe. On Mon, Nov 29, 2010 at 12:07 PM, eggli aeg...@gmail.com wrote: Hi everyone, we are working on a Java product based on Cassandra since 0.5, and Cassandra made a very huge change in 0.7 beta 2, which changes all byte array into ByteBuffers, and we found this problem which confuses us a lot, here's the detail about what happened: The multiget_slice method in Cassandra.Iface indicated that it requires a list of keys for multi get slice query, which we believed we have to give every individual keys to get the data we need, and according to the Java doc, we will get a Map result, which uses a ByteBuffer as key and ColunmOrSuperColumn as value, we made a guess that the ByteBuffer is the key we send for query, in the case above, the result Map should looks like if we give a key list A,B,C : Key of A - Data of A Key of B - Data of B Key of C - Data of C In order to get Data of A from the result map, all we need to do is perform a resultMap.get(A), but we got problem here: The result map's key is something else, it's not the key we gave before, in the case above, it's no longer a list of A,B,C while the value is exactly the data we need, but it's very troublesome we are unable to find the corresponding data from the key. 
We made a guess that the key ByteBuffers has been changed in the query process due to call by reference, and we found this in the server's source code which looks like that the key has been changed unexpectedly in org.apache.cassandra.thrift.CassandraServer's getSlice method: columnFamilies.get(StorageService.getPartitioner().decorateKey(command.key)); Looks like the key has been decorated for some purpose, and it's has been changed in the process due to the nature of ByteBuffer, and the decorated key has been used as the key in the result map. columnFamiliesMap.put(command.key, thriftifiedColumns); Are we misinterpreted the Java Doc API or is this is a bug?
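The symptom described in this thread can be reproduced without Cassandra at all, because ByteBuffer's equals() and hashCode() only consider the bytes between position and limit: once anything performs a relative read on a buffer used as a map key, lookups with an equal-content buffer stop matching. A minimal sketch (the keys "rowA"/"rowB" are invented for illustration):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class ByteBufferKeyDemo {
    public static void main(String[] args) {
        // A result map keyed by the same ByteBuffer that is later consumed.
        ByteBuffer key = ByteBuffer.wrap("rowA".getBytes(StandardCharsets.UTF_8));
        Map<ByteBuffer, String> result = new HashMap<>();
        result.put(key, "data of rowA");

        // Server-side style consumption: a relative get() advances position...
        key.get(new byte[key.remaining()]);

        // ...so an equal-content buffer no longer matches, because
        // equals()/hashCode() only look at the remaining bytes.
        ByteBuffer probe = ByteBuffer.wrap("rowA".getBytes(StandardCharsets.UTF_8));
        System.out.println(result.containsKey(probe)); // false

        // Defensive fix: consume through duplicate(), which shares the bytes
        // but has an independent position, leaving the map key intact.
        ByteBuffer key2 = ByteBuffer.wrap("rowB".getBytes(StandardCharsets.UTF_8));
        Map<ByteBuffer, String> result2 = new HashMap<>();
        result2.put(key2, "data of rowB");
        key2.duplicate().get(new byte[key2.remaining()]);
        ByteBuffer probe2 = ByteBuffer.wrap("rowB".getBytes(StandardCharsets.UTF_8));
        System.out.println(result2.containsKey(probe2)); // true
    }
}
```

On the client side, the same duplicate() (or a defensive copy) before handing keys to the Thrift layer is a reasonable workaround until the server-side bug is confirmed fixed.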
Re: Booting Cassandra v0.7.0 on Windows: rename failed
Windows is notoriously bad about hanging on to file handles. Make sure there are no explorer windows or command line windows open to d:\cassandra\data\system\, and then hope for the best. Gary. On Mon, Nov 29, 2010 at 02:49, Ramon Rockx r.ro...@asknow.nl wrote: Hi, Recently I downloaded Cassandra v0.7.0 rc1. When I try to run cassandra it ends with the following logging: INFO 09:17:30,044 Enqueuing flush of memtable-locationi...@839514767(643 bytes, 12 operations) INFO 09:17:30,045 Writing memtable-locationi...@839514767(643 bytes, 12 operations) ERROR 09:17:30,233 Fatal exception in thread Thread[FlushWriter:1,5,main] java.io.IOError: java.io.IOException: rename failed of d:\cassandra\data\system\LocationInfo-e-1-Data.db at org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java: 214) at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTable Writer.java:184) at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTable Writer.java:167) at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:161) at org.apache.cassandra.db.Memtable.access$000(Memtable.java:49) at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:174) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto r.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja va:908) at java.lang.Thread.run(Thread.java:619) Caused by: java.io.IOException: rename failed of d:\cassandra\data\system\LocationInfo-e-1-Data.db at org.apache.cassandra.utils.FBUtilities.renameWithConfirm(FBUtilities.jav a:359) at org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java: 210) ... 
12 more Operating system is Windows 7. Tried it also on Windows 2003 server. I only modified a few (necessary) path settings in cassandra.yaml: commitlog_directory: d:/cassandra/commitlog data_file_directories: - d:/cassandra/data saved_caches_directory: d:/cassandra/saved_caches Does anybody know what I'm doing wrong? Regards, Ramon
Re: Booting Cassandra v0.7.0 on Windows: rename failed
Please report a bug at https://issues.apache.org/jira/browse/CASSANDRA On Mon, Nov 29, 2010 at 2:49 AM, Ramon Rockx r.ro...@asknow.nl wrote: Hi, Recently I downloaded Cassandra v0.7.0 rc1. When I try to run cassandra it ends with the following logging: INFO 09:17:30,044 Enqueuing flush of memtable-locationi...@839514767(643 bytes, 12 operations) INFO 09:17:30,045 Writing memtable-locationi...@839514767(643 bytes, 12 operations) ERROR 09:17:30,233 Fatal exception in thread Thread[FlushWriter:1,5,main] java.io.IOError: java.io.IOException: rename failed of d:\cassandra\data\system\LocationInfo-e-1-Data.db at org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java: 214) at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTable Writer.java:184) at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTable Writer.java:167) at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:161) at org.apache.cassandra.db.Memtable.access$000(Memtable.java:49) at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:174) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto r.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja va:908) at java.lang.Thread.run(Thread.java:619) Caused by: java.io.IOException: rename failed of d:\cassandra\data\system\LocationInfo-e-1-Data.db at org.apache.cassandra.utils.FBUtilities.renameWithConfirm(FBUtilities.jav a:359) at org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java: 210) ... 12 more Operating system is Windows 7. Tried it also on Windows 2003 server. 
I only modified a few (necessary) path settings in cassandra.yaml: commitlog_directory: d:/cassandra/commitlog data_file_directories: - d:/cassandra/data saved_caches_directory: d:/cassandra/saved_caches Does anybody know what I'm doing wrong? Regards, Ramon -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Solr DataImportHandler (DIH) and Cassandra
Is there any way to use DIH to import from Cassandra? Thanks
RE: Booting Cassandra v0.7.0 on Windows: rename failed
This isn't the first time Cassandra has had I/O issues on Windows. I think it's not easy to review the source code and eliminate such issues, but I would like the developers to keep such issues in mind in the future. We're also running a Cassandra cluster on Windows, but on 0.7 beta1 (with a similar issue, but for the commit log), and we are waiting for the 0.7 release to use it fully in production. Viktor

-Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Monday, November 29, 2010 5:09 PM To: user Subject: Re: Booting Cassandra v0.7.0 on Windows: rename failed

Please report a bug at https://issues.apache.org/jira/browse/CASSANDRA

On Mon, Nov 29, 2010 at 2:49 AM, Ramon Rockx r.ro...@asknow.nl wrote: Hi, Recently I downloaded Cassandra v0.7.0 rc1. When I try to run cassandra it ends with the following logging:
INFO 09:17:30,044 Enqueuing flush of memtable-locationi...@839514767(643 bytes, 12 operations)
INFO 09:17:30,045 Writing memtable-locationi...@839514767(643 bytes, 12 operations)
ERROR 09:17:30,233 Fatal exception in thread Thread[FlushWriter:1,5,main]
java.io.IOError: java.io.IOException: rename failed of d:\cassandra\data\system\LocationInfo-e-1-Data.db
at org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:214)
at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:184)
at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:167)
at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:161)
at org.apache.cassandra.db.Memtable.access$000(Memtable.java:49)
at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:174)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: rename failed of d:\cassandra\data\system\LocationInfo-e-1-Data.db
at org.apache.cassandra.utils.FBUtilities.renameWithConfirm(FBUtilities.java:359)
at org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:210)
... 12 more
Operating system is Windows 7. Tried it also on Windows 2003 server. I only modified a few (necessary) path settings in cassandra.yaml: commitlog_directory: d:/cassandra/commitlog data_file_directories: - d:/cassandra/data saved_caches_directory: d:/cassandra/saved_caches Does anybody know what I'm doing wrong? Regards, Ramon -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
RE: Booting Cassandra v0.7.0 on Windows: rename failed
I've run into this as well. Having confirmed that there are no handles on the file (it's only ever created and used by Cassandra), and having stepped through the code, I've concluded that something in the io (not sure if it's the jvm or the os) stack is lazy about releasing the file handle for 'RandomAccessFile's. I was able to get past these issues by setting a breakpoint after the call to close (on the file-to-be-renamed), waiting 30 seconds, then resuming the thread. Basically, Cassandra won't start on windows 7 in its current state. AD -Original Message- From: Viktor Jevdokimov [mailto:viktor.jevdoki...@adform.com] Sent: Monday, November 29, 2010 10:13 AM To: user@cassandra.apache.org Subject: RE: Booting Cassandra v0.7.0 on Windows: rename failed This isn't a first time Cassandra has I/O issues on Windows. I think it's not easy to review source code and eliminate such issues, but would like developers to keep in mind such issues in the future. We're also running a Cassandra cluster on Windows, but 0.7 beta1 (with similar issue, but for Commit Log) and waiting for 0.7 release to use it fully on production. Viktor -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Monday, November 29, 2010 5:09 PM To: user Subject: Re: Booting Cassandra v0.7.0 on Windows: rename failed Please report a bug at https://issues.apache.org/jira/browse/CASSANDRA On Mon, Nov 29, 2010 at 2:49 AM, Ramon Rockx r.ro...@asknow.nl wrote: Hi, Recently I downloaded Cassandra v0.7.0 rc1. 
When I try to run cassandra it ends with the following logging: INFO 09:17:30,044 Enqueuing flush of memtable-locationi...@839514767(643 bytes, 12 operations) INFO 09:17:30,045 Writing memtable-locationi...@839514767(643 bytes, 12 operations) ERROR 09:17:30,233 Fatal exception in thread Thread[FlushWriter:1,5,main] java.io.IOError: java.io.IOException: rename failed of d:\cassandra\data\system\LocationInfo-e-1-Data.db at org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java: 214) at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTable Writer.java:184) at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTable Writer.java:167) at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:161) at org.apache.cassandra.db.Memtable.access$000(Memtable.java:49) at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:174) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto r.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja va:908) at java.lang.Thread.run(Thread.java:619) Caused by: java.io.IOException: rename failed of d:\cassandra\data\system\LocationInfo-e-1-Data.db at org.apache.cassandra.utils.FBUtilities.renameWithConfirm(FBUtilities.jav a:359) at org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java: 210) ... 12 more Operating system is Windows 7. Tried it also on Windows 2003 server. I only modified a few (necessary) path settings in cassandra.yaml: commitlog_directory: d:/cassandra/commitlog data_file_directories: - d:/cassandra/data saved_caches_directory: d:/cassandra/saved_caches Does anybody know what I'm doing wrong? 
Regards, Ramon -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
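One workaround pattern for the lazily released handles observed above is to retry the rename with a GC nudge in between, which matches the breakpoint-and-wait experiment: the handle does get released, just not immediately. This is a hypothetical sketch, not Cassandra's actual rename code; the class and method names are invented:

```java
import java.io.File;
import java.io.IOException;

public class WindowsRename {
    /**
     * Retry a rename a few times: on Windows a just-closed RandomAccessFile
     * can keep its handle briefly, so an immediate renameTo() may fail.
     * System.gc() nudges finalizers that may still hold the old handle.
     */
    static boolean renameWithRetry(File from, File to, int attempts) {
        for (int i = 0; i < attempts; i++) {
            if (from.renameTo(to))
                return true;
            System.gc();
            try {
                Thread.sleep(100); // give the OS a moment to release the handle
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        File src = File.createTempFile("sstable", ".db");
        File dst = new File(src.getParentFile(), src.getName() + ".renamed");
        System.out.println(renameWithRetry(src, dst, 5));
        dst.delete();
    }
}
```

A retry loop like this only masks the underlying handle lifetime issue, so a JIRA report (as suggested above) is still the right move.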
Re: word_count example fails in multi-node configuration
Roman: I logged a jira ticket about this for further investigation, if you'd like to follow that. https://issues.apache.org/jira/browse/CASSANDRA-1787 On Nov 29, 2010, at 3:14 AM, RS wrote: Hi guys, I am trying to run word_count example from contrib directory (0.7 beta 3 and 0.7.0 rc 1). It works fine in a single-node configuration, but fails with 2+ nodes. It fails in the assert statement, which caused problems before (https://issues.apache.org/jira/browse/CASSANDRA-1700). Here's a simple ring I have and error messages. --- Address Status State LoadOwnsToken 143797990709940316224804537595633718982 127.0.0.2 Up Normal 40.2 KB 51.38% 61078635599166706937511052402724559481 127.0.0.1 Up Normal 36.01 KB48.62% 143797990709940316224804537595633718982 --- [SERVER SIDE] ERROR 17:39:57,098 Fatal exception in thread Thread[ReadStage:4,5,main] java.lang.AssertionError: (143797990709940316224804537595633718982,61078635599166706937511052402724559481] at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1273) at org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:48) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) --- [CLIENT_SIDE] java.lang.RuntimeException: org.apache.thrift.TApplicationException: Internal error processing get_range_slices at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:277) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:292) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:189) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136) at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:148) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423) at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177) Caused by: org.apache.thrift.TApplicationException: Internal error processing get_range_slices at org.apache.thrift.TApplicationException.read(TApplicationException.java:108) at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:724) at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:704) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:255) ... 11 more --- Looks like tokens used in ColumnFamilySplits (ColumnFamilyInputFormat.java) are on wrapping ranges (left_token > right_token). Any ideas how to fix this? -- Regards, Roman
Re: Issues getting JNA to work correctly under centos 5.5 using cassandra 0.7.0-rc1 and JNA 2.7.3
What does the current line(s) in limits.conf look like? On Mon, Nov 29, 2010 at 2:01 AM, jasonmp...@gmail.com wrote: I checked and /etc/security/limits.conf on redhat supports zero (0) to mean unlimited. Here is the sample from the man page. Notice the soft core entry. EXAMPLES These are some example lines which might be specified in /etc/security/limits.conf. * soft core 0 * hard rss 1 @student hard nproc 20 @faculty soft nproc 20 @faculty hard nproc 50 ftp hard nproc 0 @student - maxlogins 4 On Mon, Nov 29, 2010 at 6:51 AM, Jason Pell jasonmp...@gmail.com wrote: Ok that's a good point i will check - I am not sure. Sent from my iPhone On Nov 29, 2010, at 5:53, Tyler Hobbs ty...@riptano.com wrote: I'm not familiar with ulimit on RedHat systems, but are you sure you have ulimit set correctly? Did you set it to '0' or 'unlimited'? I ask because on a Debian system, I get this: tho...@~ $ ulimit -l unlimited Where you said that you got back '0'. - Tyler On Sun, Nov 28, 2010 at 1:15 AM, Jason Pell ja...@pellcorp.com wrote: Hi, I have selinux disabled via /etc/sysconfig/selinux already. But I did as you suggested anyway, even restarted the whole machine again too and still no difference. Do you know if there is a way to discover exactly what this error means? THanks Jason On Sat, Nov 27, 2010 at 3:59 AM, Nate McCall n...@riptano.com wrote: This might be an issue with selinux. You can try this quickly to temporarily disable selinux enforcement: /usr/sbin/setenforce 0 (as root) and then start cassandra as your user. 
On Fri, Nov 26, 2010 at 1:00 AM, Jason Pell jasonmp...@gmail.com wrote: I restarted the box :-) so it's well and truly set Sent from my iPhone On Nov 26, 2010, at 17:57, Brandon Williams dri...@gmail.com wrote: On Thu, Nov 25, 2010 at 10:02 PM, Jason Pell ja...@pellcorp.com wrote: Hi, I have set the memlock limit to unlimited in /etc/security/limits.conf [devel...@localhost apache-cassandra-0.7.0-rc1]$ ulimit -l 0 Running as a non root user gets me a Unknown mlockall error 1 Have you tried logging out and back in after changing limits.conf? -Brandon
Re: word_count example fails in multi-node configuration
Roman: I commented on the ticket - would you mind answering on there? https://issues.apache.org/jira/browse/CASSANDRA-1787 Tx, Jeremy On Nov 29, 2010, at 3:14 AM, RS wrote: Hi guys, I am trying to run word_count example from contrib directory (0.7 beta 3 and 0.7.0 rc 1). It works fine in a single-node configuration, but fails with 2+ nodes. It fails in the assert statement, which caused problems before (https://issues.apache.org/jira/browse/CASSANDRA-1700). Here's a simple ring I have and error messages. --- Address Status State Load Owns Token 143797990709940316224804537595633718982 127.0.0.2 Up Normal 40.2 KB 51.38% 61078635599166706937511052402724559481 127.0.0.1 Up Normal 36.01 KB 48.62% 143797990709940316224804537595633718982 --- [SERVER SIDE] ERROR 17:39:57,098 Fatal exception in thread Thread[ReadStage:4,5,main] java.lang.AssertionError: (143797990709940316224804537595633718982,61078635599166706937511052402724559481] at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1273) at org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:48) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) --- [CLIENT_SIDE] java.lang.RuntimeException: org.apache.thrift.TApplicationException: Internal error processing get_range_slices at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:277) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:292) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:189) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:148) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423) at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177) Caused by: org.apache.thrift.TApplicationException: Internal error processing get_range_slices at org.apache.thrift.TApplicationException.read(TApplicationException.java:108) at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:724) at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:704) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:255) ... 11 more --- Looks like tokens used in ColumnFamilySplits (ColumnFamilyInputFormat.java) are on wrapping ranges (left_token > right_token). Any ideas how to fix this? -- Regards, Roman
Re: word_count example fails in multi-node configuration
So final answer - known issue with RC1 - https://issues.apache.org/jira/browse/CASSANDRA-1781 - that should be fixed before 0.7.0 is completed. On Nov 29, 2010, at 11:31 AM, Jeremy Hanna wrote: Roman: I logged a jira ticket about this for further investigation, if you'd like to follow that. https://issues.apache.org/jira/browse/CASSANDRA-1787 On Nov 29, 2010, at 3:14 AM, RS wrote: Hi guys, I am trying to run word_count example from contrib directory (0.7 beta 3 and 0.7.0 rc 1). It works fine in a single-node configuration, but fails with 2+ nodes. It fails in the assert statement, which caused problems before (https://issues.apache.org/jira/browse/CASSANDRA-1700). Here's a simple ring I have and error messages. --- Address Status State Load Owns Token 143797990709940316224804537595633718982 127.0.0.2 Up Normal 40.2 KB 51.38% 61078635599166706937511052402724559481 127.0.0.1 Up Normal 36.01 KB 48.62% 143797990709940316224804537595633718982 --- [SERVER SIDE] ERROR 17:39:57,098 Fatal exception in thread Thread[ReadStage:4,5,main] java.lang.AssertionError: (143797990709940316224804537595633718982,61078635599166706937511052402724559481] at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1273) at org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:48) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) --- [CLIENT_SIDE] java.lang.RuntimeException: org.apache.thrift.TApplicationException: Internal error processing get_range_slices at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:277) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:292) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:189) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:148) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423) at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177) Caused by: org.apache.thrift.TApplicationException: Internal error processing get_range_slices at org.apache.thrift.TApplicationException.read(TApplicationException.java:108) at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:724) at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:704) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:255) ... 11 more --- Looks like tokens used in ColumnFamilySplits (ColumnFamilyInputFormat.java) are on wrapping ranges (left_token > right_token). Any ideas how to fix this? -- Regards, Roman
Re: Issues getting JNA to work correctly under centos 5.5 using cassandra 0.7.0-rc1 and JNA 2.7.3
* - memlock 0 On Tue, Nov 30, 2010 at 4:40 AM, Nate McCall n...@riptano.com wrote: What does the current line(s) in limits.conf look like? On Mon, Nov 29, 2010 at 2:01 AM, jasonmp...@gmail.com wrote: I checked and /etc/security/limits.conf on redhat supports zero (0) to mean unlimited. Here is the sample from the man page. Notice the soft core entry. EXAMPLES These are some example lines which might be specified in /etc/security/limits.conf. * soft core 0 * hard rss 1 @student hard nproc 20 @faculty soft nproc 20 @faculty hard nproc 50 ftp hard nproc 0 @student - maxlogins 4 On Mon, Nov 29, 2010 at 6:51 AM, Jason Pell jasonmp...@gmail.com wrote: Ok that's a good point i will check - I am not sure. Sent from my iPhone On Nov 29, 2010, at 5:53, Tyler Hobbs ty...@riptano.com wrote: I'm not familiar with ulimit on RedHat systems, but are you sure you have ulimit set correctly? Did you set it to '0' or 'unlimited'? I ask because on a Debian system, I get this: tho...@~ $ ulimit -l unlimited Where you said that you got back '0'. - Tyler On Sun, Nov 28, 2010 at 1:15 AM, Jason Pell ja...@pellcorp.com wrote: Hi, I have selinux disabled via /etc/sysconfig/selinux already. But I did as you suggested anyway, even restarted the whole machine again too and still no difference. Do you know if there is a way to discover exactly what this error means? THanks Jason On Sat, Nov 27, 2010 at 3:59 AM, Nate McCall n...@riptano.com wrote: This might be an issue with selinux. You can try this quickly to temporarily disable selinux enforcement: /usr/sbin/setenforce 0 (as root) and then start cassandra as your user. 
On Fri, Nov 26, 2010 at 1:00 AM, Jason Pell jasonmp...@gmail.com wrote: I restarted the box :-) so it's well and truly set Sent from my iPhone On Nov 26, 2010, at 17:57, Brandon Williams dri...@gmail.com wrote: On Thu, Nov 25, 2010 at 10:02 PM, Jason Pell ja...@pellcorp.com wrote: Hi, I have set the memlock limit to unlimited in /etc/security/limits.conf [devel...@localhost apache-cassandra-0.7.0-rc1]$ ulimit -l 0 Running as a non root user gets me a Unknown mlockall error 1 Have you tried logging out and back in after changing limits.conf? -Brandon
Re: Issues getting JNA to work correctly under centos 5.5 using cassandra 0.7.0-rc1 and JNA 2.7.3
Ok, I was able to reproduce this with 0 as the value. Changing it to unlimited will make this go away. A closer reading of the limits.conf man page seems to leave some ambiguity when taken with the examples: All items support the values -1, unlimited or infinity indicating no limit, except for priority and nice. I would recommend tightening this to a specific user. The line I ended up with for the cassandra user was: cassandra - memlock unlimited You probably want to add a line for nofile in there at ~16384 as well while you're there, as that can be an issue depending on load. On Mon, Nov 29, 2010 at 1:59 PM, Jason Pell ja...@pellcorp.com wrote: * - memlock 0 On Tue, Nov 30, 2010 at 4:40 AM, Nate McCall n...@riptano.com wrote: What does the current line(s) in limits.conf look like? On Mon, Nov 29, 2010 at 2:01 AM, jasonmp...@gmail.com wrote: I checked and /etc/security/limits.conf on redhat supports zero (0) to mean unlimited. Here is the sample from the man page. Notice the soft core entry. EXAMPLES These are some example lines which might be specified in /etc/security/limits.conf. * soft core 0 * hard rss 1 @student hard nproc 20 @faculty soft nproc 20 @faculty hard nproc 50 ftp hard nproc 0 @student - maxlogins 4 On Mon, Nov 29, 2010 at 6:51 AM, Jason Pell jasonmp...@gmail.com wrote: Ok that's a good point i will check - I am not sure. Sent from my iPhone On Nov 29, 2010, at 5:53, Tyler Hobbs ty...@riptano.com wrote: I'm not familiar with ulimit on RedHat systems, but are you sure you have ulimit set correctly? Did you set it to '0' or 'unlimited'? I ask because on a Debian system, I get this: tho...@~ $ ulimit -l unlimited Where you said that you got back '0'. - Tyler On Sun, Nov 28, 2010 at 1:15 AM, Jason Pell ja...@pellcorp.com wrote: Hi, I have selinux disabled via /etc/sysconfig/selinux already. But I did as you suggested anyway, even restarted the whole machine again too and still no difference. 
Do you know if there is a way to discover exactly what this error means? THanks Jason On Sat, Nov 27, 2010 at 3:59 AM, Nate McCall n...@riptano.com wrote: This might be an issue with selinux. You can try this quickly to temporarily disable selinux enforcement: /usr/sbin/setenforce 0 (as root) and then start cassandra as your user. On Fri, Nov 26, 2010 at 1:00 AM, Jason Pell jasonmp...@gmail.com wrote: I restarted the box :-) so it's well and truly set Sent from my iPhone On Nov 26, 2010, at 17:57, Brandon Williams dri...@gmail.com wrote: On Thu, Nov 25, 2010 at 10:02 PM, Jason Pell ja...@pellcorp.com wrote: Hi, I have set the memlock limit to unlimited in /etc/security/limits.conf [devel...@localhost apache-cassandra-0.7.0-rc1]$ ulimit -l 0 Running as a non root user gets me a Unknown mlockall error 1 Have you tried logging out and back in after changing limits.conf? -Brandon
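Putting Nate's recommendation into a concrete fragment: this is roughly what the /etc/security/limits.conf entries look like when scoped to a dedicated account (the "cassandra" user name and the ~16384 nofile value are the suggestions from the thread, not requirements; adjust to your setup):

```text
# /etc/security/limits.conf -- scoped to the user that runs Cassandra
# (assumes a dedicated "cassandra" account)
cassandra    -    memlock    unlimited
cassandra    -    nofile     16384
```

Limits set here are applied by pam_limits at login, so log out and back in (or restart the service session) and verify with `ulimit -l` as that user; it should print `unlimited`.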
Re: Issues getting JNA to work correctly under centos 5.5 using cassandra 0.7.0-rc1 and JNA 2.7.3
Awesome, thanks, will make the changes. So is the man page inaccurate? Or is jna doing something wrong? Sent from my iPhone On Nov 30, 2010, at 7:28, Nate McCall n...@riptano.com wrote: Ok, I was able to reproduce this with 0 as the value. Changing it to unlimited will make this go away. A closer reading of the limits.conf man page seems to leave some ambiguity when taken with the examples: All items support the values -1, unlimited or infinity indicating no limit, except for priority and nice. I would recommend tightening this to a specific user. The line I ended up with for the cassandra user was: cassandra - memlock unlimited You probably want to add a line for nofile in there at ~16384 as well while you're there, as that can be an issue depending on load. On Mon, Nov 29, 2010 at 1:59 PM, Jason Pell ja...@pellcorp.com wrote: * - memlock 0 On Tue, Nov 30, 2010 at 4:40 AM, Nate McCall n...@riptano.com wrote: What does the current line(s) in limits.conf look like? On Mon, Nov 29, 2010 at 2:01 AM, jasonmp...@gmail.com wrote: I checked and /etc/security/limits.conf on redhat supports zero (0) to mean unlimited. Here is the sample from the man page. Notice the soft core entry. EXAMPLES These are some example lines which might be specified in /etc/security/limits.conf. * soft core 0 * hard rss 1 @student hard nproc 20 @faculty soft nproc 20 @faculty hard nproc 50 ftp hard nproc 0 @student - maxlogins 4 On Mon, Nov 29, 2010 at 6:51 AM, Jason Pell jasonmp...@gmail.com wrote: Ok that's a good point i will check - I am not sure. Sent from my iPhone On Nov 29, 2010, at 5:53, Tyler Hobbs ty...@riptano.com wrote: I'm not familiar with ulimit on RedHat systems, but are you sure you have ulimit set correctly? Did you set it to '0' or 'unlimited'? I ask because on a Debian system, I get this: tho...@~ $ ulimit -l unlimited Where you said that you got back '0'. 
- Tyler On Sun, Nov 28, 2010 at 1:15 AM, Jason Pell ja...@pellcorp.com wrote: Hi, I have selinux disabled via /etc/sysconfig/selinux already. But I did as you suggested anyway, even restarted the whole machine again too and still no difference. Do you know if there is a way to discover exactly what this error means? THanks Jason On Sat, Nov 27, 2010 at 3:59 AM, Nate McCall n...@riptano.com wrote: This might be an issue with selinux. You can try this quickly to temporarily disable selinux enforcement: /usr/sbin/setenforce 0 (as root) and then start cassandra as your user. On Fri, Nov 26, 2010 at 1:00 AM, Jason Pell jasonmp...@gmail.com wrote: I restarted the box :-) so it's well and truly set Sent from my iPhone On Nov 26, 2010, at 17:57, Brandon Williams dri...@gmail.com wrote: On Thu, Nov 25, 2010 at 10:02 PM, Jason Pell ja...@pellcorp.com wrote: Hi, I have set the memlock limit to unlimited in /etc/security/limits.conf [devel...@localhost apache-cassandra-0.7.0-rc1]$ ulimit -l 0 Running as a non root user gets me a Unknown mlockall error 1 Have you tried logging out and back in after changing limits.conf? -Brandon
Re: Cassandra 0.7 beta 3 outOfMemory (OOM)
Sounds like you need to increase the heap size and/or reduce the memtable_throughput_in_mb and/or turn off the internal caches. Normally the binary memtable thresholds only apply to bulk load operations, and it's the per-CF memtable_* settings you want to change. I'm not familiar with lucandra though. See the section on JVM Heap Size here: http://wiki.apache.org/cassandra/MemtableThresholds Bottom line is you will need more JVM heap memory. Hope that helps. Aaron On 29 Nov, 2010, at 10:28 PM, cassan...@ajowa.de wrote: Hi community, during my tests i had several OOM crashes. Getting some hints to find the problem would be nice. First, cassandra crashed after about 45 min of the insert test script. During the following tests the time to OOM got shorter, until it started to crash even in "idle" mode. Here are the facts: - cassandra 0.7 beta 3 - using lucandra to index about 3 million files of ~1kb data - inserting with one client to one cassandra node at about 200 files/s - cassandra data files for this keyspace grow up to about 20 GB - the keyspace only contains the two lucandra-specific CFs Cluster: - cassandra single node on windows 32bit, Xeon 2,5 Ghz, 4GB RAM - java jre 1.6.0_22 - heap space first 1GB, later increased to 1,3 GB Cassandra.yaml: default + reduced "binary_memtable_throughput_in_mb" to 128 CFs: default + reduced min_compaction_threshold: 4 max_compaction_threshold: 8 I think the problem always appears during compaction, and perhaps it is a result of large rows (some about 170mb). Are there more options we could use to work with little memory? Is it a problem of compaction? And how to avoid it? Slower inserts? More memory? Even lower memtable_throughput or in_memory_compaction_limit? Continuous manual major compaction? 
I've read http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors - row_size should be fixed since 0.7 and 200mb is still far away from 2gb - only key cache is used a little bit 3600/2 - after a lot of writes cassandra crashes even in idle mode - memtablesize was reduced and there are only 2 CFs Several heapdumps in MAT show 60-99% heapusage of compaction thread. Here some log extract: INFO [CompactionExecutor:1] 2010-11-26 14:18:18,593 CompactionIterator.java (line 134) Compacting large row 6650325572717566efbfbf44545241434b53efbfbf31 (172967291 bytes) incrementally INFO [ScheduledTasks:1] 2010-11-26 14:18:41,421 GCInspector.java (line 133) GC for ParNew: 365 ms, 54551328 reclaimed leaving 459496840 used; max is 1450442752 INFO [ScheduledTasks:1] 2010-11-26 14:18:42,437 GCInspector.java (line 133) GC for ParNew: 226 ms, 12469104 reclaimed leaving 554506776 used; max is 1450442752 INFO [ScheduledTasks:1] 2010-11-26 14:18:43,453 GCInspector.java (line 133) GC for ParNew: 224 ms, 12777840 reclaimed leaving 649207976 used; max is 1450442752 INFO [ScheduledTasks:1] 2010-11-26 14:18:44,468 GCInspector.java (line 133) GC for ParNew: 225 ms, 12564144 reclaimed leaving 744122872 used; max is 1450442752 INFO [ScheduledTasks:1] 2010-11-26 14:18:45,468 GCInspector.java (line 133) GC for ParNew: 222 ms, 16020328 reclaimed leaving 835581584 used; max is 1450442752 INFO [ScheduledTasks:1] 2010-11-26 14:18:46,468 GCInspector.java (line 133) GC for ParNew: 226 ms, 12697912 reclaimed leaving 930362712 used; max is 1450442752 INFO [ScheduledTasks:1] 2010-11-26 14:18:47,468 GCInspector.java (line 133) GC for ParNew: 227 ms, 15816872 reclaimed leaving 1022026288 used; max is 1450442752 INFO [ScheduledTasks:1] 2010-11-26 14:18:48,484 GCInspector.java (line 133) GC for ParNew: 258 ms, 12746584 reclaimed leaving 1116758744 used; max is 1450442752 INFO [ScheduledTasks:1] 2010-11-26 14:18:49,484 GCInspector.java (line 133) GC for ParNew: 257 ms, 12802608 
reclaimed leaving 1211435176 used; max is 1450442752 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java (line 133) GC for ConcurrentMarkSweep: 4188 ms, 271308512 reclaimed leaving 1047605704 used; max is 1450442752 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java (line 153) Pool Name Active Pending INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java (line 160) ResponseStage 0 0 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java (line 160) ReadStage 0 0 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java (line 160) ReadRepair 0 0 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java (line 160) MutationStage 0 0 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java (line 160) GossipStage 0 0 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java (line 160) AntiEntropyStage 0 0 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,562 GCInspector.java (line 160)
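As a sketch of the kind of tuning being discussed, a 0.7-style cassandra.yaml fragment might look like this. The keyspace and column family names are placeholders (lucandra's actual CF names may differ), and the values are illustrative starting points, not recommendations:

```yaml
# cassandra.yaml (0.7) -- illustrative fragment, not a complete config.
# Rows larger than this are compacted incrementally instead of being
# materialized fully in memory (relevant for the ~170 MB rows above).
in_memory_compaction_limit_in_mb: 64

# The binary memtable threshold only affects bulk loads; for normal
# writes, tune the per-column-family memtable settings instead:
keyspaces:
    - name: LucandraKeyspace        # placeholder name
      column_families:
        - name: TermIndex           # placeholder name
          memtable_throughput_in_mb: 64
          min_compaction_threshold: 4
          max_compaction_threshold: 8
```

Lower per-CF memtable thresholds trade more frequent (smaller) flushes for a smaller steady-state heap footprint; the large-row problem itself is addressed by the incremental compaction limit rather than by memtable tuning.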
Introduction to Cassandra
I did a talk last week at the Wellington Rails User Group as a basic introduction to Cassandra. The slides are here http://www.slideshare.net/aaronmorton/well-railedcassandra24112010-5901169 if anyone is interested. Cheers Aaron
Re: Introduction to Cassandra
That is a lot of slides. :) Nice work! On Mon, Nov 29, 2010 at 3:11 PM, Aaron Morton aa...@thelastpickle.com wrote: I did a talk last week at the Wellington Rails User Group as a basic introduction to Cassandra. The slides are here http://www.slideshare.net/aaronmorton/well-railedcassandra24112010-5901169 if anyone is interested. Cheers Aaron -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Issues getting JNA to work correctly under centos 5.5 using cassandra 0.7.0-rc1 and JNA 2.7.3
Hi, Thanks for that your suggestions worked a treat. I created a new cassandra user and set the value to unlimited and I get the desired log: INFO 08:49:50,204 JNA mlockall successful On Tue, Nov 30, 2010 at 7:56 AM, Jason Pell jasonmp...@gmail.com wrote: Awesome thanks will make the changes So is the man page inaccurate? Or is jna doing something wrong? Sent from my iPhone On Nov 30, 2010, at 7:28, Nate McCall n...@riptano.com wrote: Ok, I was able to reproduce this with 0 as the value. Changing it to unlimited will make this go away. A closer reading of the limits.conf man page seems to leave some ambiguity when taken with the examples: All items support the values -1, unlimited or infinity indicating no limit, except for priority and nice. I would recommend tightening this to a specific user. The line I ended up with for the cassandra user was: cassandra - memlock unlimited You probably want to add a line for nofile in there at ~ 16384 as well while your there as that can be an issue depending on load. On Mon, Nov 29, 2010 at 1:59 PM, Jason Pell ja...@pellcorp.com wrote: * - memlock 0 On Tue, Nov 30, 2010 at 4:40 AM, Nate McCall n...@riptano.com wrote: What does the current line(s) in limits.conf look like? On Mon, Nov 29, 2010 at 2:01 AM, jasonmp...@gmail.com wrote: I checked and /etc/security/limits.conf on redhat supports zero (0) to mean unlimited. Here is the sample from the man page. Notice the soft core entry. EXAMPLES These are some example lines which might be specified in /etc/security/limits.conf. * soft core 0 * hard rss 1 @student hard nproc 20 @faculty soft nproc 20 @faculty hard nproc 50 ftp hard nproc 0 @student - maxlogins 4 On Mon, Nov 29, 2010 at 6:51 AM, Jason Pell jasonmp...@gmail.com wrote: Ok that's a good point i will check - I am not sure. Sent from my iPhone On Nov 29, 2010, at 5:53, Tyler Hobbs ty...@riptano.com wrote: I'm not familiar with ulimit on RedHat systems, but are you sure you have ulimit set correctly? 
Did you set it to '0' or 'unlimited'? I ask because on a Debian system, I get this: tho...@~ $ ulimit -l unlimited Where you said that you got back '0'. - Tyler On Sun, Nov 28, 2010 at 1:15 AM, Jason Pell ja...@pellcorp.com wrote: Hi, I have selinux disabled via /etc/sysconfig/selinux already. But I did as you suggested anyway, even restarted the whole machine again too and still no difference. Do you know if there is a way to discover exactly what this error means? THanks Jason On Sat, Nov 27, 2010 at 3:59 AM, Nate McCall n...@riptano.com wrote: This might be an issue with selinux. You can try this quickly to temporarily disable selinux enforcement: /usr/sbin/setenforce 0 (as root) and then start cassandra as your user. On Fri, Nov 26, 2010 at 1:00 AM, Jason Pell jasonmp...@gmail.com wrote: I restarted the box :-) so it's well and truly set Sent from my iPhone On Nov 26, 2010, at 17:57, Brandon Williams dri...@gmail.com wrote: On Thu, Nov 25, 2010 at 10:02 PM, Jason Pell ja...@pellcorp.com wrote: Hi, I have set the memlock limit to unlimited in /etc/security/limits.conf [devel...@localhost apache-cassandra-0.7.0-rc1]$ ulimit -l 0 Running as a non root user gets me a Unknown mlockall error 1 Have you tried logging out and back in after changing limits.conf? -Brandon
batch_mutate vs number of write operations on CF
Hi, I am using Cassandra 0.7 beta3 and Hector. I create a mutation map. The mutation involves adding a few columns for a given row. After that I use the batch_mutate API to send the changes to Cassandra. Question: If there are multiple column writes on the same row in a mutation_map, does Cassandra show (on JMX write count stats for the CF) that as 1 write operation or as N write operations, where N is the number of entries in the mutation map for that row? Assume all the changes in the mutation map are for one row. Thanks, Naren
Re: Solr DataImportHandler (DIH) and Cassandra
The DataSource subclass route is what I will probably be interested in. Are there any working examples of this already out there? On 11/29/10 12:32 PM, Aaron Morton wrote: AFAIK there is nothing pre-written to pull the data out for you. You should be able to create your DataSource subclass http://lucene.apache.org/solr/api/org/apache/solr/handler/dataimport/DataSource.html using the Hector java library to pull data from Cassandra. I'm guessing you will need to consider how to perform delta imports. Perhaps using the secondary indexes in 0.7*, or maintaining your own queues or indexes to know what has changed. There is also the Lucandra project, not exactly what you're after but may be of interest anyway https://github.com/tjake/Lucandra Hope that helps. Aaron On 30 Nov, 2010, at 05:04 AM, Mark static.void@gmail.com wrote: Is there any way to use DIH to import from Cassandra? Thanks
Re: batch_mutate vs number of write operations on CF
Using batch_mutate on a single row will count as 1 write operation, even if you mutate multiple columns. Using batch_mutate on N rows will count as N write operations. - Tyler On Mon, Nov 29, 2010 at 5:58 PM, Narendra Sharma narendra.sha...@gmail.comwrote: Hi, I am using Cassandra 0.7 beta3 and Hector. I create a mutation map. The mutation involves adding few columns for a given row. After that I use batch_mutate API to send the changes to Cassandra. Question: If there are multiple column writes on same row in a mutation_map, does Cassandra show (on JMX write count stats for CF) that as 1 write operation or as N write operations where N is the number of entries in mutation map for that row. Assume all the changes in mutation map are for one row. Thanks, Naren
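Tyler's counting rule can be illustrated with plain JDK collections standing in for the Thrift-generated types (a simplified sketch: the real batch_mutate map is keyed by ByteBuffer row keys and holds Mutation objects, but the shape and the counting are the same):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchMutateCount {
    // Simplified stand-in for the batch_mutate argument:
    // row key -> (column family -> list of column mutations).
    // Cassandra's JMX write count increments once per top-level row entry,
    // regardless of how many column mutations that row carries.
    static int writeOps(Map<String, Map<String, List<String>>> mutationMap) {
        return mutationMap.size();
    }

    public static void main(String[] args) {
        Map<String, Map<String, List<String>>> m = new HashMap<>();
        // one row, three column mutations in one CF -> 1 write operation
        m.put("row1", Map.of("Standard1", List.of("col1", "col2", "col3")));
        System.out.println(writeOps(m)); // prints 1
        // adding a second row makes it 2 write operations
        m.put("row2", Map.of("Standard1", List.of("colA")));
        System.out.println(writeOps(m)); // prints 2
    }
}
```

So batching many columns onto one row is effectively free from the write-count perspective; only distinct rows add operations.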
Re: word_count example fails in multi-node configuration
It occurs in 0.7 beta 3 and 0.7.0 rc 1. Thank you, Jeremy. I will follow the ticket. -Roman On Tue, Nov 30, 2010 at 2:50 AM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote: Roman: I commented on the ticket - would you mind answering on there? https://issues.apache.org/jira/browse/CASSANDRA-1787 Tx, Jeremy On Nov 29, 2010, at 3:14 AM, RS wrote: Hi guys, I am trying to run word_count example from contrib directory (0.7 beta 3 and 0.7.0 rc 1). It works fine in a single-node configuration, but fails with 2+ nodes. It fails in the assert statement, which caused problems before (https://issues.apache.org/jira/browse/CASSANDRA-1700). Here's a simple ring I have and error messages. --- Address Status State Load Owns Token 143797990709940316224804537595633718982 127.0.0.2 Up Normal 40.2 KB 51.38% 61078635599166706937511052402724559481 127.0.0.1 Up Normal 36.01 KB 48.62% 143797990709940316224804537595633718982 --- [SERVER SIDE] ERROR 17:39:57,098 Fatal exception in thread Thread[ReadStage:4,5,main] java.lang.AssertionError: (143797990709940316224804537595633718982,61078635599166706937511052402724559481] at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1273) at org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:48) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) --- [CLIENT_SIDE] java.lang.RuntimeException: org.apache.thrift.TApplicationException: Internal error processing get_range_slices at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:277) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:292) at 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:189) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:148) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423) at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177) Caused by: org.apache.thrift.TApplicationException: Internal error processing get_range_slices at org.apache.thrift.TApplicationException.read(TApplicationException.java:108) at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:724) at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:704) at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:255) ... 11 more --- Looks like tokens used in ColumnFamilySplits (ColumnFamilyInputFormat.java) are on wrapping ranges (left_token > right_token). Any ideas how to fix this? -- Regards, Roman
Re: Re: word_count example fails in multi-node configuration
try the OrderPreservingPartitioner

2010-11-30
Bingbing Liu

From: RS
Sent: 2010-11-30 09:14:38
To: user
Cc:
Subject: Re: word_count example fails in multi-node configuration

It occurs in 0.7 beta 3 and 0.7.0 rc 1. Thank you, Jeremy. I will follow the ticket.

-Roman

On Tue, Nov 30, 2010 at 2:50 AM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote:
> Roman: I commented on the ticket - would you mind answering on there?
> https://issues.apache.org/jira/browse/CASSANDRA-1787
> Tx, Jeremy

On Nov 29, 2010, at 3:14 AM, RS wrote:
> Hi guys,
>
> I am trying to run the word_count example from the contrib directory (0.7 beta 3 and 0.7.0 rc 1). It works fine in a single-node configuration, but fails with 2+ nodes. It fails in an assert statement which has caused problems before (https://issues.apache.org/jira/browse/CASSANDRA-1700). Here is the simple ring I have, and the error messages:
>
> Address    Status  State   Load      Owns    Token
>                                              143797990709940316224804537595633718982
> 127.0.0.2  Up      Normal  40.2 KB   51.38%  61078635599166706937511052402724559481
> 127.0.0.1  Up      Normal  36.01 KB  48.62%  143797990709940316224804537595633718982
>
> [SERVER SIDE]
> ERROR 17:39:57,098 Fatal exception in thread Thread[ReadStage:4,5,main]
> java.lang.AssertionError: (143797990709940316224804537595633718982,61078635599166706937511052402724559481]
>     at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1273)
>     at org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:48)
>     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:619)
>
> [CLIENT SIDE]
> java.lang.RuntimeException: org.apache.thrift.TApplicationException: Internal error processing get_range_slices
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:277)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:292)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:189)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:148)
>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> Caused by: org.apache.thrift.TApplicationException: Internal error processing get_range_slices
>     at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
>     at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:724)
>     at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:704)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:255)
>     ... 11 more
>
> It looks like the tokens used in ColumnFamilySplits (ColumnFamilyInputFormat.java) are on wrapping ranges (left_token > right_token). Any ideas how to fix this?
>
> --
> Regards,
> Roman
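For reference, the partitioner Bingbing suggests is configured in cassandra.yaml. A minimal fragment (the class path is from the 0.7 tree; note that the partitioner is cluster-wide and cannot be changed once nodes hold data):

```yaml
# cassandra.yaml -- the partitioner must be decided before the first start;
# switching it on an existing cluster requires reloading all data.
partitioner: org.apache.cassandra.dht.OrderPreservingPartitioner
```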
Cassandra 0.7 - documentation on Secondary Indexes
Is there any documentation available on what is possible with secondary indexes? For example:

- Is it possible to define a secondary index on columns within a SuperColumn?
- If I define a secondary index at run time, does Cassandra index all the existing data, or is only new data indexed?

Some documentation along with examples would be highly useful.

Thanks,
Naren
Re: Cassandra 0.7 - documentation on Secondary Indexes
On Mon, Nov 29, 2010 at 7:59 PM, Narendra Sharma narendra.sha...@gmail.com wrote:
> Is there any documentation available on what is possible with secondary indexes?

Not yet.

> - Is it possible to define secondary index on columns within a SuperColumn?

No.

> - If I define a secondary index at run time, does Cassandra index all the existing data or only new data is indexed?

The former.

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
Re: Cassandra 0.7 - documentation on Secondary Indexes
Thanks Jonathan. A couple more questions:

1. Is there any technical limit on the number of secondary indexes that can be created?
2. Is it possible to execute join queries spanning multiple secondary indexes?

Thanks,
Naren

On Mon, Nov 29, 2010 at 6:02 PM, Jonathan Ellis jbel...@gmail.com wrote:
> On Mon, Nov 29, 2010 at 7:59 PM, Narendra Sharma narendra.sha...@gmail.com wrote:
>> Is there any documentation available on what is possible with secondary indexes?
> Not yet.
>> - Is it possible to define secondary index on columns within a SuperColumn?
> No.
>> - If I define a secondary index at run time, does Cassandra index all the existing data or only new data is indexed?
> The former.
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
Re: Cassandra 0.7 - documentation on Secondary Indexes
On Mon, Nov 29, 2010 at 11:26 PM, Narendra Sharma narendra.sha...@gmail.com wrote:
> Thanks Jonathan. A couple more questions:
> 1. Is there any technical limit on the number of secondary indexes that can be created?

Just as with traditional databases, the more indexes there are, the slower writes to that CF will be.

> 2. Is it possible to execute join queries spanning multiple secondary indexes?

What do secondary indexes have to do with joins?

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
Re: Cassandra 0.7 - documentation on Secondary Indexes
On Mon, Nov 29, 2010 at 9:32 PM, Jonathan Ellis jbel...@gmail.com wrote:
> On Mon, Nov 29, 2010 at 11:26 PM, Narendra Sharma narendra.sha...@gmail.com wrote:
>> Thanks Jonathan. A couple more questions:
>> 1. Is there any technical limit on the number of secondary indexes that can be created?
> Just as with traditional databases, the more indexes there are the slower writes to that CF will be.
>> 2. Is it possible to execute join queries spanning multiple secondary indexes?
> What do secondary indexes have to do with joins?
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com

For example, if I want to get all employees that are male and have age = 35 years. How can secondary indexes be useful in such a scenario?
Re: Cassandra 0.7 - documentation on Secondary Indexes
The 'employees with age = 35' scenario is exactly what they are useful for. There's a quick section in the pycassa documentation that might be useful:
http://pycassa.github.com/pycassa/tutorial.html#indexes

On Mon, Nov 29, 2010 at 11:41 PM, Narendra Sharma narendra.sha...@gmail.com wrote:
> On Mon, Nov 29, 2010 at 9:32 PM, Jonathan Ellis jbel...@gmail.com wrote:
>> On Mon, Nov 29, 2010 at 11:26 PM, Narendra Sharma narendra.sha...@gmail.com wrote:
>>> Thanks Jonathan. A couple more questions:
>>> 1. Is there any technical limit on the number of secondary indexes that can be created?
>> Just as with traditional databases, the more indexes there are the slower writes to that CF will be.
>>> 2. Is it possible to execute join queries spanning multiple secondary indexes?
>> What do secondary indexes have to do with joins?
> For example, if I want to get all employees that are male and have age = 35 years. How can secondary indexes be useful in such a scenario?
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
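To make the 'male and age = 35' discussion above concrete, here is a small self-contained sketch of how a multi-expression index query is evaluated conceptually: one secondary index supplies the candidate rows, and the remaining expressions filter them server-side, so no join is involved. The function name and data model below are illustrative, not Cassandra's internal API.

```python
# Conceptual model of a multi-expression indexed query (illustrative only):
# walk one secondary index to get candidate rows, then filter the candidates
# against the remaining expressions. No cross-row join takes place.

def get_indexed_slices(rows, index_column, index_value, extra_filters):
    """rows: {row_key: {column: value}}; returns sorted keys matching all criteria."""
    # Step 1: the secondary index yields rows where index_column == index_value.
    candidates = [k for k, cols in rows.items()
                  if cols.get(index_column) == index_value]
    # Step 2: filter candidates against the other (possibly unindexed) expressions.
    return sorted(k for k in candidates
                  if all(rows[k].get(c) == v for c, v in extra_filters))

employees = {
    'e1': {'gender': 'male', 'age': 35},
    'e2': {'gender': 'female', 'age': 35},
    'e3': {'gender': 'male', 'age': 40},
}
print(get_indexed_slices(employees, 'gender', 'male', [('age', 35)]))  # ['e1']
```

With a real client this corresponds to building an index clause with two expressions, as in the pycassa tutorial section Tyler links above.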
Re: Achieving isolation on single row modifications with batch_mutate
In this case, it sounds like you should combine columns A and B if you are writing them both at the same time, reading them both at the same time, and need them to be consistent. Obviously, you're probably dealing with more than two columns here, but there's generally no value in splitting something into multiple columns if you're always writing and reading all of them at the same time. Or are you talking about chunking huge blobs across a row?

- Tyler

On Sat, Nov 27, 2010 at 10:12 AM, E S tr1skl...@yahoo.com wrote:
> I'm trying to figure out the best way to achieve single-row modification isolation for readers.
>
> As an example, I have 2 rows (1, 2) with 2 columns (a, b). If I modify both rows, I don't care if the user sees the write operations completed on 1 and not on 2 for a short time period (seconds). I also don't care if, when reading row 1, the user gets the new value and then on a re-read gets the old value (within a few seconds). Because of this, I have been planning on using a consistency level of ONE.
>
> However, if I modify both columns A and B on a single row, I need both changes on the row to be visible/invisible atomically. It doesn't matter if they both become visible and then both invisible as the data propagates across nodes, but a half-completed state on an initial read would basically be returning corrupt data given my app's consistency requirements. My understanding from the FAQ is that this single-row, multi-column change provides no read isolation, so I will have this problem. Is this correct? If so:
>
> Question 1: Is there a way to get this type of isolation without using a distributed locking mechanism like Cages?
> Question 2: Are there any plans to implement this type of isolation within Cassandra?
> Question 3: If I went with a distributed locking mechanism, what consistency level would I need to use with Cassandra? Could I still get away with a consistency level of ONE? It seems that if the initial write is done in a non-isolated way, but cross-node row synchronizations are done all-or-nothing, I could still use ONE.
> Question 4: Does anyone know of a good C# alternative to Cages/ZooKeeper?
>
> Thanks for any help with this!
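Tyler's column-combining suggestion can be sketched in a few lines: if A and B must appear and disappear together, store them as one column whose value is a serialized blob, so a single-column write is all-or-nothing from a reader's perspective. The use of JSON and the field names below are illustrative choices, not anything prescribed by Cassandra.

```python
import json

# Combine logically-coupled columns A and B into one column value, so a
# reader can never observe A updated without B. JSON is just one possible
# serialization; any format that round-trips both values works.

def pack(a, b):
    """Serialize both values into a single column value."""
    return json.dumps({'a': a, 'b': b})

def unpack(blob):
    """Recover both values from the stored column value."""
    d = json.loads(blob)
    return d['a'], d['b']

blob = pack('new-a', 'new-b')        # written with one column insert
print(unpack(blob))                  # readers see both values or neither
```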
Re: get_count - cassandra 0.7.x predicate limit bug?
What error are you getting?

Remember, get_count() is still just about as much work for Cassandra as getting the whole row; the only advantage is that it doesn't have to send the whole row back to the client. If you're counting 3+ million columns frequently, it's time to take a look at counters.

- Tyler

On Fri, Nov 26, 2010 at 10:33 AM, Marcin mar...@33concept.com wrote:
> Hi guys,
>
> I have a key with 3 million+ columns, but when I try to run get_count on it, I get an error if I set the predicate limit to more than about 46000. Any ideas?
>
> In the previous API there was no predicate at all, so it simply counted the number of columns; now it's not so simple any more. Please let me know if this is a bug or if I am doing something wrong.
>
> cheers,
> /Marcin
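One way around a predicate limit on very wide rows is to count in pages: slice a page of column names, then resume after the last name seen. The sketch below factors the slice call out as a callback so it is self-contained; `fetch_page` is a hypothetical stand-in for whatever get_slice wrapper your client provides.

```python
# Paged column counting for very wide rows. fetch_page(start, count) stands
# in for a get_slice call returning column names >= start, in sorted order.

def count_columns(fetch_page, page_size=10000):
    total, start = 0, ''
    while True:
        # After the first page, re-fetch from the last name seen (inclusive)
        # and drop the overlapping first column.
        page = fetch_page(start, page_size + (1 if start else 0))
        if start:
            page = page[1:]
        total += len(page)
        if len(page) < page_size:       # short page: row exhausted
            return total
        start = page[-1]                # resume after the last column seen

# A fake backend over 25 sorted column names, standing in for a real row:
names = ['col%02d' % i for i in range(25)]
def fake_fetch(start, count):
    rest = [n for n in names if n >= start]
    return rest[:count]

print(count_columns(fake_fetch, page_size=10))  # 25
```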
Re: Updating Cascal
Are you sure you're using the same key for batch_mutate() and get_slice()? They appear different in the logs.

- Tyler

On Thu, Nov 25, 2010 at 10:14 AM, Michael Fortin mi...@m410.us wrote:
> Hello,
>
> I forked Cascal (a Scala-based client for Cassandra) and I'm attempting to update it to Cassandra 0.7. I have it partially working, but I'm getting stuck on a few areas. I have most of the unit tests working from the original code, but I'm having an issue with batch_mutate(keyToFamilyMutations, consistency). Does the log output mean anything? I can't figure out why the columns are not getting inserted. If I change the code from a batch_mutate to an insert(family, parent, column, consistency), it works.
>
> ###
> keyToFamilyMutations: {java.nio.HeapByteBuffer[pos=0 lim=16 cap=16]={Standard=[
>   Mutation(column_or_supercolumn:ColumnOrSuperColumn(column:Column(name:43 6F 6C 75 6D 6E 2D 61 2D 31, value:56 61 6C 75 65 2D 31, timestamp:1290662894466035))),
>   Mutation(column_or_supercolumn:ColumnOrSuperColumn(column:Column(name:43 6F 6C 75 6D 6E 2D 61 2D 33, value:56 61 6C 75 65 2D 33, timestamp:1290662894467942))),
>   Mutation(column_or_supercolumn:ColumnOrSuperColumn(column:Column(name:43 6F 6C 75 6D 6E 2D 61 2D 32, value:56 61 6C 75 65 2D 32, timestamp:1290662894467915)))]}}
> DEBUG 2010-11-25 00:28:14,534 [org.apache.cassandra.thrift.CassandraServer pool-1-thread-2] batch_mutate
> DEBUG 2010-11-25 00:28:14,583 [org.apache.cassandra.service.StorageProxy pool-1-thread-2] insert writing local RowMutation(keyspace='Test', key='ccfd5520f85411df858a001c4209', modifications=[Standard])
> DEBUG 2010-11-25 00:28:14,599 [org.apache.cassandra.thrift.CassandraServer pool-1-thread-2] get_slice
> DEBUG 2010-11-25 00:28:14,605 [org.apache.cassandra.service.StorageProxy pool-1-thread-2] weakread reading SliceFromReadCommand(table='Test', key='5374616e64617264', column_parent='QueryPath(columnFamilyName='Standard', superColumnName='null', columnName='null')', start='', finish='', reversed=false, count=2147483647) locally
> DEBUG 2010-11-25 00:28:14,608 [org.apache.cassandra.service.StorageProxy ReadStage:2] weakreadlocal reading SliceFromReadCommand(table='Test', key='5374616e64617264', column_parent='QueryPath(columnFamilyName='Standard', superColumnName='null', columnName='null')', start='', finish='', reversed=false, count=2147483647)
> ###
> get_slice: []
>
> The code looks like:
>
> println("keyToFamilyMutations: %s".format(keyToFamilyMutations))
> client.batch_mutate(keyToFamilyMutations, consistency)
> …
> client.client.get_slice(…)
>
> keyspaces:
>     - name: Test
>       replica_placement_strategy: org.apache.cassandra.locator.SimpleStrategy
>       replication_factor: 1
>       column_families:
>         - {name: Standard, compare_with: BytesType}
>
> Thanks,
> Mike
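Tyler's observation can be checked directly: the key in the get_slice log line is the hex encoding of the string 'Standard', i.e. the column family name, not the row key that batch_mutate wrote ('ccfd5520f85411df858a001c4209'). That suggests the updated client is passing the CF name where the row key belongs.

```python
# Decode the row key from the get_slice DEBUG line: it is the string
# 'Standard' (the column family name), not the key batch_mutate used.
key_from_log = bytes.fromhex('5374616e64617264')
print(key_from_log)  # b'Standard'
```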
partial matching of keys
Hi All

I was wondering if it is possible to match keys partially while searching in Cassandra. I have a requirement where I'm storing a large number of records, the key being something like A|B|T, where A and B are mobile numbers and T is the timestamp (the time when A called B). Such a format ensures the uniqueness of the keys. Now if I want to search for all records where A called B, I would like to do a partial match on A|B. Is this possible?

I have another small question: where can I find some complete examples of creating a cluster and communicating with it (for insertion/deletion of records) using Hector or Pelops? So far, I've been doing this via the Thrift interface, but it's becoming illegible now...

Thanks in advance...

Regards
Arijit

--
And when the night is cloudy, There is still a light that shines on me, Shine on until tomorrow, let it be.
Re: Introduction to Cassandra
Really great introduction, thanks Aaron. Bookmarked for the team.

J.

Sent from my iPhone

On 29 Nov 2010, at 21:11, Aaron Morton aa...@thelastpickle.com wrote:
> I did a talk last week at the Wellington Rails User Group as a basic introduction to Cassandra. The slides are here if anyone is interested:
> http://www.slideshare.net/aaronmorton/well-railedcassandra24112010-5901169
> Cheers
> Aaron
Re: partial matching of keys
Yes, you can basically do this in two ways:

First, you can use an OrderPreservingPartitioner. This stores your keys in order, so you can grab the range of keys that begin with 'A|B'. Because of the drawbacks of OPP (unbalanced ring, hotspots), you almost certainly don't want to do this.

Second, you can take advantage of column name sorting. For example, you can have a row for all of the calls that A has made; each column name can be something like 'B|T'. This allows you to quickly get all of the times when A called B, in chronological order. (You can have a second row or column family with B and T's positions swapped if you're more interested in time slices.) This is very much like the Twitter clone, Twissandra:

https://github.com/ericflo/twissandra
http://twissandra.com/

As for examples, there are Hector examples here: https://github.com/zznate/hector-examples

- Tyler

On Tue, Nov 30, 2010 at 12:11 AM, Arijit Mukherjee ariji...@gmail.com wrote:
> Hi All
>
> I was wondering if it is possible to match keys partially while searching in Cassandra. I have a requirement where I'm storing a large number of records, the key being something like A|B|T, where A and B are mobile numbers and T is the timestamp (the time when A called B). Such a format ensures the uniqueness of the keys. Now if I want to search for all records where A called B, I would like to do a partial match on A|B. Is this possible?
>
> I have another small question: where can I find some complete examples of creating a cluster and communicating with it (for insertion/deletion of records) using Hector or Pelops? So far, I've been doing this via the Thrift interface, but it's becoming illegible now...
>
> Thanks in advance...
>
> Regards
> Arijit
>
> --
> And when the night is cloudy, There is still a light that shines on me, Shine on until tomorrow, let it be.
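Tyler's second option can be sketched without a cluster: one row per caller A, with column names of the form 'B|T', so a slice bounded by the prefix 'B|' returns all calls from A to B in time order. The separator '|' and the exclusive upper bound 'B}' (the character '}' sorts just after '|' in ASCII) mirror how byte-ordered column sorting behaves; the function and sample data are illustrative.

```python
# One row per caller A; column names are 'B|T'. A prefix slice from 'B|'
# up to (but not including) 'B}' returns every call A -> B, and because
# the timestamps share a fixed format, they come back in time order.

def calls_to(columns, b):
    start, finish = b + '|', b + '}'   # '}' (0x7D) > '|' (0x7C) in ASCII
    return [c for c in sorted(columns) if start <= c < finish]

# Columns in A's row, as 'callee|timestamp' names:
row_for_a = ['555-2|20101129T0900', '555-3|20101129T1000', '555-2|20101130T0815']
print(calls_to(row_for_a, '555-2'))
# ['555-2|20101129T0900', '555-2|20101130T0815']
```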