Re: Issues getting JNA to work correctly under centos 5.5 using cassandra 0.7.0-rc1 and JNA 2.7.3

2010-11-29 Thread jasonmpell
I checked and /etc/security/limits.conf on redhat supports zero (0) to
mean unlimited.  Here is the sample from the man page.  Notice the
soft core entry.

EXAMPLES
   These are some example lines which might be specified in
   /etc/security/limits.conf.

   *               soft    core            0
   *               hard    rss             1
   @student        hard    nproc           20
   @faculty        soft    nproc           20
   @faculty        hard    nproc           50
   ftp             hard    nproc           0
   @student        -       maxlogins       4
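Applied to the memlock limit Cassandra needs, the corresponding limits.conf entries might look like the following (a sketch only; the user name is illustrative, and pam_limits only applies the new limit to fresh login sessions):

```
# /etc/security/limits.conf -- illustrative entries for the user running Cassandra
cassandra        soft    memlock         unlimited
cassandra        hard    memlock         unlimited
```

After logging out and back in as that user, `ulimit -l` can be used to verify what limit is actually in effect.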



On Mon, Nov 29, 2010 at 6:51 AM, Jason Pell jasonmp...@gmail.com wrote:
 Ok, that's a good point; I will check - I am not sure.

 Sent from my iPhone
 On Nov 29, 2010, at 5:53, Tyler Hobbs ty...@riptano.com wrote:

 I'm not familiar with ulimit on RedHat systems, but are you sure you
 have ulimit set correctly? Did you set it to '0' or 'unlimited'?  I ask
 because on a Debian system, I get this:

 tho...@~ $ ulimit -l
 unlimited

 Whereas you said that you got back '0'.

 - Tyler

 On Sun, Nov 28, 2010 at 1:15 AM, Jason Pell ja...@pellcorp.com wrote:

 Hi,

 I have selinux disabled via /etc/sysconfig/selinux already.  But I did
 as you suggested anyway, even restarted the whole machine again too
 and still no difference.  Do you know if there is a way to discover
 exactly what this error means?

 Thanks
 Jason

 On Sat, Nov 27, 2010 at 3:59 AM, Nate McCall n...@riptano.com wrote:
  This might be an issue with selinux. You can try this quickly to
  temporarily disable selinux enforcement:
  /usr/sbin/setenforce 0  (as root)
 
  and then start cassandra as your user.
 
  On Fri, Nov 26, 2010 at 1:00 AM, Jason Pell jasonmp...@gmail.com
  wrote:
  I restarted the box :-) so it's well and truly set
 
  Sent from my iPhone
  On Nov 26, 2010, at 17:57, Brandon Williams dri...@gmail.com wrote:
 
  On Thu, Nov 25, 2010 at 10:02 PM, Jason Pell ja...@pellcorp.com
  wrote:
 
  Hi,
 
  I have set the memlock limit to unlimited in /etc/security/limits.conf
 
  [devel...@localhost apache-cassandra-0.7.0-rc1]$ ulimit -l
  0
 
  Running as a non-root user gets me an 'Unknown mlockall error 1'.
 
  Have you tried logging out and back in after changing limits.conf?
  -Brandon
 




Booting Cassandra v0.7.0 on Windows: rename failed

2010-11-29 Thread Ramon Rockx
Hi,
 
Recently I downloaded Cassandra v0.7.0 rc1. When I try to run Cassandra,
it fails with the following log output:
 
 INFO 09:17:30,044 Enqueuing flush of
memtable-locationi...@839514767(643 bytes, 12 operations)
 INFO 09:17:30,045 Writing memtable-locationi...@839514767(643 bytes, 12
operations)
ERROR 09:17:30,233 Fatal exception in thread
Thread[FlushWriter:1,5,main]
java.io.IOError: java.io.IOException: rename failed of d:\cassandra\data\system\LocationInfo-e-1-Data.db
 at org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:214)
 at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:184)
 at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:167)
 at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:161)
 at org.apache.cassandra.db.Memtable.access$000(Memtable.java:49)
 at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:174)
 at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: rename failed of d:\cassandra\data\system\LocationInfo-e-1-Data.db
 at org.apache.cassandra.utils.FBUtilities.renameWithConfirm(FBUtilities.java:359)
 at org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:210)
 ... 12 more

Operating system is Windows 7. Tried it also on Windows 2003 server.
I only modified a few (necessary) path settings in cassandra.yaml:

commitlog_directory: d:/cassandra/commitlog
data_file_directories:
- d:/cassandra/data
saved_caches_directory: d:/cassandra/saved_caches

Does anybody know what I'm doing wrong?

Regards,
Ramon


word_count example fails in multi-node configuration

2010-11-29 Thread RS
Hi guys,

I am trying to run word_count example from contrib directory (0.7 beta
3 and 0.7.0 rc 1).
It works fine in a single-node configuration, but fails with 2+ nodes.

It fails in the assert statement, which caused problems before
(https://issues.apache.org/jira/browse/CASSANDRA-1700).

Here's a simple ring I have and error messages.
---
Address Status State   LoadOwnsToken

143797990709940316224804537595633718982
127.0.0.2   Up Normal  40.2 KB 51.38%
61078635599166706937511052402724559481
127.0.0.1   Up Normal  36.01 KB48.62%
143797990709940316224804537595633718982
---
[SERVER SIDE]

ERROR 17:39:57,098 Fatal exception in thread Thread[ReadStage:4,5,main]
java.lang.AssertionError:
(143797990709940316224804537595633718982,61078635599166706937511052402724559481]
at 
org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1273)
at 
org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:48)
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
---
[CLIENT_SIDE]
java.lang.RuntimeException: org.apache.thrift.TApplicationException:
Internal error processing get_range_slices
at 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:277)
at 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:292)
at 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:189)
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
at 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:148)
at 
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
at 
org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: org.apache.thrift.TApplicationException: Internal error
processing get_range_slices
at 
org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
at 
org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:724)
at 
org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:704)
at 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:255)
... 11 more
---

Looks like the tokens used in ColumnFamilySplits
(ColumnFamilyInputFormat.java) are on wrapping ranges (left_token >
right_token).
Any ideas how to fix this?

--
Regards,
Roman
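For context on the wrapping-range assertion above, here is a small illustrative sketch (plain BigInteger tokens, not Cassandra's actual Range/Token classes) of what a wrapping range is and how it can be split into two non-wrapping ranges:

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class TokenRanges {
    // A half-open range (left, right] on the token ring. When left >= right,
    // the range "wraps" past the maximum token back around to the minimum.
    static final class Range {
        final BigInteger left, right;
        Range(BigInteger left, BigInteger right) { this.left = left; this.right = right; }
    }

    static boolean isWrapping(Range r) {
        return r.left.compareTo(r.right) >= 0;
    }

    // Split a wrapping range into two non-wrapping pieces,
    // (left, max] and (min, right], where min/max are the ring bounds.
    static List<Range> unwrap(Range r, BigInteger min, BigInteger max) {
        List<Range> out = new ArrayList<>();
        if (!isWrapping(r)) {
            out.add(r);
        } else {
            out.add(new Range(r.left, max));
            out.add(new Range(min, r.right));
        }
        return out;
    }

    public static void main(String[] args) {
        // The range from the assertion error above: its left token is larger
        // than its right token, so it wraps around the ring.
        Range r = new Range(new BigInteger("143797990709940316224804537595633718982"),
                            new BigInteger("61078635599166706937511052402724559481"));
        System.out.println(isWrapping(r));                                  // true
        System.out.println(unwrap(r, BigInteger.ZERO,
                                  BigInteger.ONE.shiftLeft(127)).size());   // 2
    }
}
```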


Cassandra 0.7 beta 3 outOfMemory (OOM)

2010-11-29 Thread cassandra

Hi community,

During my tests I had several OOM crashes.
Some hints on how to find the problem would be nice.

At first, Cassandra crashed after about 45 minutes of an insert test script.
During the following tests the time to OOM got shorter, until it started to crash
even in idle mode.

Here are the facts:
- cassandra 0.7 beta 3
- using lucandra to index about 3 million files of ~1 KB data
- inserting with one client to one cassandra node at about 200 files/s
- cassandra data files for this keyspace grow to about 20 GB
- the keyspace only contains the two lucandra-specific CFs

Cluster:
- cassandra single node on Windows 32-bit, Xeon 2.5 GHz, 4 GB RAM
- Java JRE 1.6.0_22
- heap space first 1 GB, later increased to 1.3 GB

Cassandra.yaml:
default + reduced binary_memtable_throughput_in_mb to 128

CFs:
default + reduced
min_compaction_threshold: 4
max_compaction_threshold: 8


I think the problem always appears during compaction,
and perhaps it is a result of large rows (some about 170 MB).

Are there more options we could use to get by with less memory?

Is it a problem of compaction?
And how can we avoid it?
Slower inserts? More memory?
An even lower memtable_throughput or in_memory_compaction_limit?
Continuous manual major compaction?

I've read
http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors

- the row size issue should be fixed since 0.7, and 200 MB is still far from 2 GB
- only the key cache is used, and only a little (3600/2)
- after a lot of writes, cassandra crashes even in idle mode
- the memtable size was reduced, and there are only 2 CFs
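For reference, the settings discussed above live in cassandra.yaml and in the per-CF schema. A rough sketch of the 0.7-era option names follows; the values are illustrative, not recommendations, so verify the names and defaults against your own config:

```yaml
# cassandra.yaml (0.7-era option names; values illustrative)
binary_memtable_throughput_in_mb: 128
# Rows larger than this limit are compacted incrementally on disk
# instead of being materialized in memory during compaction:
in_memory_compaction_limit_in_mb: 64
# Per-column-family memtable bounds are set in the schema, e.g.:
#   memtable_throughput_in_mb, memtable_operations_in_millions
```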

Several heap dumps in MAT show 60-99% heap usage by the compaction thread.

Here is a log extract:

 INFO [CompactionExecutor:1] 2010-11-26 14:18:18,593  
CompactionIterator.java (line 134) Compacting large row  
6650325572717566efbfbf44545241434b53efbfbf31 (172967291 bytes)  
incrementally
 INFO [ScheduledTasks:1] 2010-11-26 14:18:41,421 GCInspector.java  
(line 133) GC for ParNew: 365 ms, 54551328 reclaimed leaving 459496840  
used; max is 1450442752
 INFO [ScheduledTasks:1] 2010-11-26 14:18:42,437 GCInspector.java  
(line 133) GC for ParNew: 226 ms, 12469104 reclaimed leaving 554506776  
used; max is 1450442752
 INFO [ScheduledTasks:1] 2010-11-26 14:18:43,453 GCInspector.java  
(line 133) GC for ParNew: 224 ms, 12777840 reclaimed leaving 649207976  
used; max is 1450442752
 INFO [ScheduledTasks:1] 2010-11-26 14:18:44,468 GCInspector.java  
(line 133) GC for ParNew: 225 ms, 12564144 reclaimed leaving 744122872  
used; max is 1450442752
 INFO [ScheduledTasks:1] 2010-11-26 14:18:45,468 GCInspector.java  
(line 133) GC for ParNew: 222 ms, 16020328 reclaimed leaving 835581584  
used; max is 1450442752
 INFO [ScheduledTasks:1] 2010-11-26 14:18:46,468 GCInspector.java  
(line 133) GC for ParNew: 226 ms, 12697912 reclaimed leaving 930362712  
used; max is 1450442752
 INFO [ScheduledTasks:1] 2010-11-26 14:18:47,468 GCInspector.java  
(line 133) GC for ParNew: 227 ms, 15816872 reclaimed leaving  
1022026288 used; max is 1450442752
 INFO [ScheduledTasks:1] 2010-11-26 14:18:48,484 GCInspector.java  
(line 133) GC for ParNew: 258 ms, 12746584 reclaimed leaving  
1116758744 used; max is 1450442752
 INFO [ScheduledTasks:1] 2010-11-26 14:18:49,484 GCInspector.java  
(line 133) GC for ParNew: 257 ms, 12802608 reclaimed leaving  
1211435176 used; max is 1450442752
 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java  
(line 133) GC for ConcurrentMarkSweep: 4188 ms, 271308512 reclaimed  
leaving 1047605704 used; max is 1450442752
 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java  
(line 153) Pool NameActive   Pending
 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java  
(line 160) ResponseStage 0 0
 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java  
(line 160) ReadStage 0 0
 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java  
(line 160) ReadRepair0 0
 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java  
(line 160) MutationStage 0 0
 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java  
(line 160) GossipStage   0 0
 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java  
(line 160) AntientropyStage  0 0
 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,562 GCInspector.java  
(line 160) MigrationStage0 0
 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,562 GCInspector.java  
(line 160) StreamStage   0 0
 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,562 GCInspector.java  
(line 160) MemtablePostFlusher   0 0
 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,562 GCInspector.java  
(line 160) FlushWriter   0 0
 INFO [ScheduledTasks:1] 2010-11-26 14:18:54,562 GCInspector.java  
(line 160) MiscStage  

The key list of multiget_slice's parameter has been changed unexpectedly.

2010-11-29 Thread eggli

Hi everyone, we have been working on a Java product based on Cassandra since
0.5. Cassandra made a very big change in 0.7 beta 2, changing all byte arrays
into ByteBuffers, and we found a problem which confuses us a lot. Here is the
detail of what happened:

The multiget_slice method in Cassandra.Iface requires a list of keys for a
multi-get slice query, so we believed we had to supply every individual key to
get the data we need. According to the Javadoc, we get back a Map that uses
ByteBuffer as the key and ColumnOrSuperColumn as the value. We assumed each
ByteBuffer key is the key we sent in the query, so for a key list A, B, C the
result map should look like:


Key of A -> Data of A
Key of B -> Data of B
Key of C -> Data of C

To get Data of A from the result map, all we need to do is call
resultMap.get(A). But we hit a problem here: the result map's keys are
something else, not the keys we supplied. The values are exactly the data we
need, but it is very troublesome that we are unable to find the corresponding
data by key.

We guessed that the key ByteBuffers were changed during the query process due
to pass-by-reference, and we found code in the server's source which suggests
the key is changed unexpectedly in
org.apache.cassandra.thrift.CassandraServer's getSlice method:

columnFamilies.get(StorageService.getPartitioner().decorateKey(command.key));

It looks like the key is decorated for some purpose and, due to the nature of
ByteBuffer, mutated in the process; the decorated key is then used as the key
in the result map:

columnFamiliesMap.put(command.key, thriftifiedColumns);

Have we misinterpreted the Javadoc API, or is this a bug?
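The mutation hazard described above is easy to reproduce with a plain HashMap. The sketch below is illustrative only (no Cassandra or Thrift involved): a ByteBuffer's equals()/hashCode() depend on its position and limit, so a buffer consumed after being used as a map key no longer matches a fresh buffer with the same content; duplicate() is one defensive client-side workaround:

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

public class ByteBufferKeyDemo {
    public static void main(String[] args) {
        Map<ByteBuffer, String> result = new HashMap<>();
        ByteBuffer key = ByteBuffer.wrap("A".getBytes());

        // Simulate server-side code consuming the buffer it was handed
        // (pass-by-reference: the caller's buffer object is mutated too).
        key.get(new byte[key.remaining()]);   // position now equals limit
        result.put(key, "data-A");

        // A fresh buffer for "A" no longer matches the consumed key,
        // because equals()/hashCode() only consider the remaining bytes.
        ByteBuffer lookup = ByteBuffer.wrap("A".getBytes());
        System.out.println(result.containsKey(lookup));   // false

        // Defensive fix: hand out a duplicate(), which shares the bytes but
        // has an independent position/limit, and keep the original for lookups.
        ByteBuffer original = ByteBuffer.wrap("B".getBytes());
        ByteBuffer toSend = original.duplicate();
        toSend.get(new byte[toSend.remaining()]);   // the "server" consumes the copy
        Map<ByteBuffer, String> safe = new HashMap<>();
        safe.put(original, "data-B");
        System.out.println(safe.containsKey(ByteBuffer.wrap("B".getBytes())));  // true
    }
}
```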


Re: The key list of multiget_slice's parameter has been changed unexpectedly.

2010-11-29 Thread Sylvain Lebresne
You should start by trying 0.7 RC1; some bugs in the use of ByteBuffers have
been corrected since beta2.

If you still have problems, then it's likely a bug: the ByteBuffer should not
be changed from under you. If it still doesn't work with RC1, it would be very
helpful if you could provide a simple script that reproduces the behavior you
describe.

On Mon, Nov 29, 2010 at 12:07 PM, eggli aeg...@gmail.com wrote:

 Hi everyone, we are working on a Java product based on Cassandra since 0.5, 
 and Cassandra made a very huge change in 0.7 beta 2, which changes all byte 
 array into ByteBuffers, and we found this problem which confuses us a lot, 
 here's the detail about what happened:

 The multiget_slice method in Cassandra.Iface indicated that it requires a 
 list of keys for multi get slice query, which we believed we have to give 
 every individual keys to get the data we need, and according to the Java doc, 
 we will get a Map result, which uses a  ByteBuffer as key and 
 ColunmOrSuperColumn as value, we made a guess that the ByteBuffer is the key 
 we send for query, in the case above, the result Map should looks like if we 
 give a key list A,B,C :


 Key of A - Data of A
 Key of B - Data of B
 Key of C - Data of C

 In order to get Data of A from the result map, all we need to do is perform a 
 resultMap.get(A), but we got problem here: The result map's key is something 
 else, it's not the key we gave before, in the case above, it's no longer a 
 list of A,B,C while the value is exactly the data we need, but it's very 
 troublesome we are unable to find the corresponding data from the key.

 We made a guess that the key ByteBuffers has been changed in the query 
 process due to call by reference, and we found this in the server's source 
 code which looks like that the key has been changed unexpectedly in 
 org.apache.cassandra.thrift.CassandraServer's getSlice method:

 columnFamilies.get(StorageService.getPartitioner().decorateKey(command.key));

 Looks like the key has been decorated for some purpose, and it's has been 
 changed in the process due to the nature of ByteBuffer, and the decorated key 
 has been used as the key in the result map.

 columnFamiliesMap.put(command.key, thriftifiedColumns);

 Are we misinterpreted the Java Doc API or is this is a bug?



Re: Booting Cassandra v0.7.0 on Windows: rename failed

2010-11-29 Thread Gary Dusbabek
Windows is notoriously bad about hanging on to file handles.  Make
sure there are no explorer windows or command line windows open to
d:\cassandra\data\system\, and then hope for the best.

Gary.

On Mon, Nov 29, 2010 at 02:49, Ramon Rockx r.ro...@asknow.nl wrote:
 Hi,

 Recently I downloaded Cassandra v0.7.0 rc1. When I try to run cassandra
 it ends with the following logging:

  INFO 09:17:30,044 Enqueuing flush of
 memtable-locationi...@839514767(643 bytes, 12 operations)
  INFO 09:17:30,045 Writing memtable-locationi...@839514767(643 bytes, 12
 operations)
 ERROR 09:17:30,233 Fatal exception in thread
 Thread[FlushWriter:1,5,main]
 java.io.IOError: java.io.IOException: rename failed of
 d:\cassandra\data\system\LocationInfo-e-1-Data.db
  at
 org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:
 214)
  at
 org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTable
 Writer.java:184)
  at
 org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTable
 Writer.java:167)
  at
 org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:161)
  at org.apache.cassandra.db.Memtable.access$000(Memtable.java:49)
  at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:174)
  at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
  at
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
  at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto
 r.java:886)
  at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
 va:908)
  at java.lang.Thread.run(Thread.java:619)
 Caused by: java.io.IOException: rename failed of
 d:\cassandra\data\system\LocationInfo-e-1-Data.db
  at
 org.apache.cassandra.utils.FBUtilities.renameWithConfirm(FBUtilities.jav
 a:359)
  at
 org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:
 210)
  ... 12 more

 Operating system is Windows 7. Tried it also on Windows 2003 server.
 I only modified a few (necessary) path settings in cassandra.yaml:

 commitlog_directory: d:/cassandra/commitlog
 data_file_directories:
 - d:/cassandra/data
 saved_caches_directory: d:/cassandra/saved_caches

 Does anybody know what I'm doing wrong?

 Regards,
 Ramon



Re: Booting Cassandra v0.7.0 on Windows: rename failed

2010-11-29 Thread Jonathan Ellis
Please report a bug at https://issues.apache.org/jira/browse/CASSANDRA

On Mon, Nov 29, 2010 at 2:49 AM, Ramon Rockx r.ro...@asknow.nl wrote:
 Hi,

 Recently I downloaded Cassandra v0.7.0 rc1. When I try to run cassandra
 it ends with the following logging:

  INFO 09:17:30,044 Enqueuing flush of
 memtable-locationi...@839514767(643 bytes, 12 operations)
  INFO 09:17:30,045 Writing memtable-locationi...@839514767(643 bytes, 12
 operations)
 ERROR 09:17:30,233 Fatal exception in thread
 Thread[FlushWriter:1,5,main]
 java.io.IOError: java.io.IOException: rename failed of
 d:\cassandra\data\system\LocationInfo-e-1-Data.db
  at
 org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:
 214)
  at
 org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTable
 Writer.java:184)
  at
 org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTable
 Writer.java:167)
  at
 org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:161)
  at org.apache.cassandra.db.Memtable.access$000(Memtable.java:49)
  at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:174)
  at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
  at
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
  at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto
 r.java:886)
  at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
 va:908)
  at java.lang.Thread.run(Thread.java:619)
 Caused by: java.io.IOException: rename failed of
 d:\cassandra\data\system\LocationInfo-e-1-Data.db
  at
 org.apache.cassandra.utils.FBUtilities.renameWithConfirm(FBUtilities.jav
 a:359)
  at
 org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:
 210)
  ... 12 more

 Operating system is Windows 7. Tried it also on Windows 2003 server.
 I only modified a few (necessary) path settings in cassandra.yaml:

 commitlog_directory: d:/cassandra/commitlog
 data_file_directories:
 - d:/cassandra/data
 saved_caches_directory: d:/cassandra/saved_caches

 Does anybody know what I'm doing wrong?

 Regards,
 Ramon




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Solr DataImportHandler (DIH) and Cassandra

2010-11-29 Thread Mark

Is there any way to use DIH to import data from Cassandra? Thanks


RE: Booting Cassandra v0.7.0 on Windows: rename failed

2010-11-29 Thread Viktor Jevdokimov
This isn't the first time Cassandra has had I/O issues on Windows.

I think it's not easy to review the source code and eliminate such issues, but
I would like the developers to keep them in mind in the future.

We're also running a Cassandra cluster on Windows, but on 0.7 beta1 (with a
similar issue, but for the commit log), and we are waiting for the 0.7 release
to use it fully in production.


Viktor

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Monday, November 29, 2010 5:09 PM
To: user
Subject: Re: Booting Cassandra v0.7.0 on Windows: rename failed

Please report a bug at https://issues.apache.org/jira/browse/CASSANDRA

On Mon, Nov 29, 2010 at 2:49 AM, Ramon Rockx r.ro...@asknow.nl wrote:
 Hi,

 Recently I downloaded Cassandra v0.7.0 rc1. When I try to run cassandra
 it ends with the following logging:

  INFO 09:17:30,044 Enqueuing flush of
 memtable-locationi...@839514767(643 bytes, 12 operations)
  INFO 09:17:30,045 Writing memtable-locationi...@839514767(643 bytes, 12
 operations)
 ERROR 09:17:30,233 Fatal exception in thread
 Thread[FlushWriter:1,5,main]
 java.io.IOError: java.io.IOException: rename failed of
 d:\cassandra\data\system\LocationInfo-e-1-Data.db
  at
 org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:
 214)
  at
 org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTable
 Writer.java:184)
  at
 org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTable
 Writer.java:167)
  at
 org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:161)
  at org.apache.cassandra.db.Memtable.access$000(Memtable.java:49)
  at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:174)
  at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
  at
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
  at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto
 r.java:886)
  at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
 va:908)
  at java.lang.Thread.run(Thread.java:619)
 Caused by: java.io.IOException: rename failed of
 d:\cassandra\data\system\LocationInfo-e-1-Data.db
  at
 org.apache.cassandra.utils.FBUtilities.renameWithConfirm(FBUtilities.jav
 a:359)
  at
 org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:
 210)
  ... 12 more

 Operating system is Windows 7. Tried it also on Windows 2003 server.
 I only modified a few (necessary) path settings in cassandra.yaml:

 commitlog_directory: d:/cassandra/commitlog
 data_file_directories:
 - d:/cassandra/data
 saved_caches_directory: d:/cassandra/saved_caches

 Does anybody know what I'm doing wrong?

 Regards,
 Ramon




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com




RE: Booting Cassandra v0.7.0 on Windows: rename failed

2010-11-29 Thread Aditya Muralidharan
I've run into this as well. Having confirmed that there are no handles on the
file (it's only ever created and used by Cassandra), and having stepped through
the code, I've concluded that something in the I/O stack (not sure if it's the
JVM or the OS) is lazy about releasing the file handle for RandomAccessFiles.
I was able to get past these issues by setting a breakpoint after the call to
close (on the file to be renamed), waiting 30 seconds, then resuming the
thread. Basically, Cassandra won't start on Windows 7 in its current state.
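A commonly used workaround pattern for this class of Windows problem is to retry the rename with a short backoff. The sketch below is hedged and illustrative, not Cassandra's actual code; the attempt count and delay are arbitrary:

```java
import java.io.File;
import java.io.IOException;

public class RenameRetry {
    // Retry a rename a few times before giving up, as a workaround for
    // Windows holding file handles briefly after close. Attempt counts and
    // delays are illustrative.
    static boolean renameWithRetry(File from, File to, int attempts, long delayMs)
            throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            if (from.renameTo(to)) {
                return true;
            }
            // Suggest a GC in case a finalizer is still holding the handle,
            // then back off briefly before retrying.
            System.gc();
            Thread.sleep(delayMs);
        }
        return false;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        File from = File.createTempFile("sstable", ".tmp");
        File to = new File(from.getParentFile(), from.getName() + ".renamed");
        boolean ok = renameWithRetry(from, to, 5, 100);
        System.out.println("renamed: " + ok);   // normally prints "renamed: true"
        to.delete();
    }
}
```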

AD

-Original Message-
From: Viktor Jevdokimov [mailto:viktor.jevdoki...@adform.com] 
Sent: Monday, November 29, 2010 10:13 AM
To: user@cassandra.apache.org
Subject: RE: Booting Cassandra v0.7.0 on Windows: rename failed

This isn't a first time Cassandra has I/O issues on Windows.

I think it's not easy to review source code and eliminate such issues, but 
would like developers to keep in mind such issues in the future.

We're also running a Cassandra cluster on Windows, but 0.7 beta1 (with similar 
issue, but for Commit Log) and waiting for 0.7 release to use it fully on 
production.


Viktor

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Monday, November 29, 2010 5:09 PM
To: user
Subject: Re: Booting Cassandra v0.7.0 on Windows: rename failed

Please report a bug at https://issues.apache.org/jira/browse/CASSANDRA

On Mon, Nov 29, 2010 at 2:49 AM, Ramon Rockx r.ro...@asknow.nl wrote:
 Hi,

 Recently I downloaded Cassandra v0.7.0 rc1. When I try to run cassandra
 it ends with the following logging:

  INFO 09:17:30,044 Enqueuing flush of
 memtable-locationi...@839514767(643 bytes, 12 operations)
  INFO 09:17:30,045 Writing memtable-locationi...@839514767(643 bytes, 12
 operations)
 ERROR 09:17:30,233 Fatal exception in thread
 Thread[FlushWriter:1,5,main]
 java.io.IOError: java.io.IOException: rename failed of
 d:\cassandra\data\system\LocationInfo-e-1-Data.db
  at
 org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:
 214)
  at
 org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTable
 Writer.java:184)
  at
 org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTable
 Writer.java:167)
  at
 org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:161)
  at org.apache.cassandra.db.Memtable.access$000(Memtable.java:49)
  at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:174)
  at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
  at
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
  at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto
 r.java:886)
  at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
 va:908)
  at java.lang.Thread.run(Thread.java:619)
 Caused by: java.io.IOException: rename failed of
 d:\cassandra\data\system\LocationInfo-e-1-Data.db
  at
 org.apache.cassandra.utils.FBUtilities.renameWithConfirm(FBUtilities.jav
 a:359)
  at
 org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:
 210)
  ... 12 more

 Operating system is Windows 7. Tried it also on Windows 2003 server.
 I only modified a few (necessary) path settings in cassandra.yaml:

 commitlog_directory: d:/cassandra/commitlog
 data_file_directories:
 - d:/cassandra/data
 saved_caches_directory: d:/cassandra/saved_caches

 Does anybody know what I'm doing wrong?

 Regards,
 Ramon




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com




unsubscribe

2010-11-29 Thread Dave Therrien



Re: word_count example fails in multi-node configuration

2010-11-29 Thread Jeremy Hanna
Roman:

I logged a jira ticket about this for further investigation, if you'd like to 
follow that.

https://issues.apache.org/jira/browse/CASSANDRA-1787

On Nov 29, 2010, at 3:14 AM, RS wrote:

 Hi guys,
 
 I am trying to run word_count example from contrib directory (0.7 beta
 3 and 0.7.0 rc 1).
 It works fine in a single-node configuration, but fails with 2+ nodes.
 
 It fails in the assert statement, which caused problems before
 (https://issues.apache.org/jira/browse/CASSANDRA-1700).
 
 Here's a simple ring I have and error messages.
 ---
 Address Status State   LoadOwnsToken
 
 143797990709940316224804537595633718982
 127.0.0.2   Up Normal  40.2 KB 51.38%
 61078635599166706937511052402724559481
 127.0.0.1   Up Normal  36.01 KB48.62%
 143797990709940316224804537595633718982
 ---
 [SERVER SIDE]
 
 ERROR 17:39:57,098 Fatal exception in thread Thread[ReadStage:4,5,main]
 java.lang.AssertionError:
 (143797990709940316224804537595633718982,61078635599166706937511052402724559481]
   at 
 org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1273)
   at 
 org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:48)
   at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:619)
 ---
 [CLIENT_SIDE]
 java.lang.RuntimeException: org.apache.thrift.TApplicationException:
 Internal error processing get_range_slices
   at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:277)
   at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:292)
   at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:189)
   at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
   at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
   at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:148)
   at 
 org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
   at 
 org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
 Caused by: org.apache.thrift.TApplicationException: Internal error
 processing get_range_slices
   at 
 org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
   at 
 org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:724)
   at 
 org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:704)
   at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:255)
   ... 11 more
 ---
 
 Looks like tokens used in ColumnFamilySplits
 (ColumnFamilyInputFormat.java) are on wrapping ranges (left_token >
 right_token).
 Any ideas how to fix this?
 
 --
 Regards,
 Roman
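
Roman's diagnosis above concerns wrapping token ranges on a consistent-hash
ring. As a minimal sketch (not Cassandra source; the ring bounds assume a
RandomPartitioner-style 2**127 token space), a wrapping range can be unbundled
into non-wrapping pieces like this:

```python
# Sketch, not Cassandra code: on a token ring, a range (left, right] with
# left > right "wraps" past the maximum token. Hadoop splits must not wrap,
# so a wrapping range is split at the ring boundary into two plain ranges.
RING_MIN, RING_MAX = 0, 2**127  # assumed RandomPartitioner-style token space

def unwrap(left, right):
    """Return (left, right] as a list of non-wrapping (lo, hi] ranges."""
    if left < right:
        return [(left, right)]                        # already non-wrapping
    return [(left, RING_MAX), (RING_MIN, right)]      # split at the boundary
```

The server-side assertion failure is consistent with a split whose left token
exceeds its right token being handed to get_range_slices unsplit.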



Re: Issues getting JNA to work correctly under centos 5.5 using cassandra 0.7.0-rc1 and JNA 2.7.3

2010-11-29 Thread Nate McCall
What does the current line(s) in limits.conf look like?

On Mon, Nov 29, 2010 at 2:01 AM,  jasonmp...@gmail.com wrote:
 I checked and /etc/security/limits.conf on redhat supports zero (0) to
 mean unlimited.  Here is the sample from the man page.  Notice the
 soft core entry.

 EXAMPLES
       These are some example lines which might be specified in
       /etc/security/limits.conf.

       *               soft    core            0
       *               hard    rss             1
       @student        hard    nproc           20
       @faculty        soft    nproc           20
       @faculty        hard    nproc           50
       ftp             hard    nproc           0
       @student        -       maxlogins       4



 On Mon, Nov 29, 2010 at 6:51 AM, Jason Pell jasonmp...@gmail.com wrote:
 OK, that's a good point, I will check - I am not sure.

 Sent from my iPhone
 On Nov 29, 2010, at 5:53, Tyler Hobbs ty...@riptano.com wrote:

 I'm not familiar with ulimit on RedHat systems, but are you sure you
 have ulimit set correctly? Did you set it to '0' or 'unlimited'?  I ask
 because on a Debian system, I get this:

 tho...@~ $ ulimit -l
 unlimited

 Where you said that you got back '0'.

 - Tyler

 On Sun, Nov 28, 2010 at 1:15 AM, Jason Pell ja...@pellcorp.com wrote:

 Hi,

 I have selinux disabled via /etc/sysconfig/selinux already.  But I did
 as you suggested anyway, even restarted the whole machine again too
 and still no difference.  Do you know if there is a way to discover
 exactly what this error means?

 Thanks
 Jason

 On Sat, Nov 27, 2010 at 3:59 AM, Nate McCall n...@riptano.com wrote:
  This might be an issue with selinux. You can try this quickly to
  temporarily disable selinux enforcement:
  /usr/sbin/setenforce 0  (as root)
 
  and then start cassandra as your user.
 
  On Fri, Nov 26, 2010 at 1:00 AM, Jason Pell jasonmp...@gmail.com
  wrote:
  I restarted the box :-) so it's well and truly set
 
  Sent from my iPhone
  On Nov 26, 2010, at 17:57, Brandon Williams dri...@gmail.com wrote:
 
  On Thu, Nov 25, 2010 at 10:02 PM, Jason Pell ja...@pellcorp.com
  wrote:
 
  Hi,
 
  I have set the memlock limit to unlimited in /etc/security/limits.conf
 
  [devel...@localhost apache-cassandra-0.7.0-rc1]$ ulimit -l
  0
 
  Running as a non-root user gets me an "Unknown mlockall error 1"
 
  Have you tried logging out and back in after changing limits.conf?
  -Brandon
 





Re: word_count example fails in multi-node configuration

2010-11-29 Thread Jeremy Hanna
Roman:

I commented on the ticket - would you mind answering on there?  
https://issues.apache.org/jira/browse/CASSANDRA-1787

Tx,

Jeremy

On Nov 29, 2010, at 3:14 AM, RS wrote:

 Hi guys,
 
 I am trying to run word_count example from contrib directory (0.7 beta
 3 and 0.7.0 rc 1).
 It works fine in a single-node configuration, but fails with 2+ nodes.
 
 It fails in the assert statement, which caused problems before
 (https://issues.apache.org/jira/browse/CASSANDRA-1700).
 
 Here's a simple ring I have and error messages.
 ---
 Address Status State   LoadOwnsToken
 
 143797990709940316224804537595633718982
 127.0.0.2   Up Normal  40.2 KB 51.38%
 61078635599166706937511052402724559481
 127.0.0.1   Up Normal  36.01 KB48.62%
 143797990709940316224804537595633718982
 ---
 [SERVER SIDE]
 
 ERROR 17:39:57,098 Fatal exception in thread Thread[ReadStage:4,5,main]
 java.lang.AssertionError:
 (143797990709940316224804537595633718982,61078635599166706937511052402724559481]
   at 
 org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1273)
   at 
 org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:48)
   at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:619)
 ---
 [CLIENT_SIDE]
 java.lang.RuntimeException: org.apache.thrift.TApplicationException:
 Internal error processing get_range_slices
   at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:277)
   at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:292)
   at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:189)
   at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
   at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
   at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:148)
   at 
 org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
   at 
 org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
 Caused by: org.apache.thrift.TApplicationException: Internal error
 processing get_range_slices
   at 
 org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
   at 
 org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:724)
   at 
 org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:704)
   at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:255)
   ... 11 more
 ---
 
 Looks like tokens used in ColumnFamilySplits
 (ColumnFamilyInputFormat.java) are on wrapping ranges (left_token >
 right_token).
 Any ideas how to fix this?
 
 --
 Regards,
 Roman



Re: word_count example fails in multi-node configuration

2010-11-29 Thread Jeremy Hanna
So final answer - known issue with RC1 - 
https://issues.apache.org/jira/browse/CASSANDRA-1781 - that should be fixed 
before 0.7.0 is completed.

On Nov 29, 2010, at 11:31 AM, Jeremy Hanna wrote:

 Roman:
 
 I logged a jira ticket about this for further investigation, if you'd like to 
 follow that.
 
 https://issues.apache.org/jira/browse/CASSANDRA-1787
 
 On Nov 29, 2010, at 3:14 AM, RS wrote:
 
 Hi guys,
 
 I am trying to run word_count example from contrib directory (0.7 beta
 3 and 0.7.0 rc 1).
 It works fine in a single-node configuration, but fails with 2+ nodes.
 
 It fails in the assert statement, which caused problems before
 (https://issues.apache.org/jira/browse/CASSANDRA-1700).
 
 Here's a simple ring I have and error messages.
 ---
 Address Status State   LoadOwnsToken
 
 143797990709940316224804537595633718982
 127.0.0.2   Up Normal  40.2 KB 51.38%
 61078635599166706937511052402724559481
 127.0.0.1   Up Normal  36.01 KB48.62%
 143797990709940316224804537595633718982
 ---
 [SERVER SIDE]
 
 ERROR 17:39:57,098 Fatal exception in thread Thread[ReadStage:4,5,main]
 java.lang.AssertionError:
 (143797990709940316224804537595633718982,61078635599166706937511052402724559481]
  at 
 org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1273)
  at 
 org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:48)
  at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:619)
 ---
 [CLIENT_SIDE]
 java.lang.RuntimeException: org.apache.thrift.TApplicationException:
 Internal error processing get_range_slices
  at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:277)
  at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:292)
  at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:189)
  at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
  at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
  at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:148)
  at 
 org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
  at 
 org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
  at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
  at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
 Caused by: org.apache.thrift.TApplicationException: Internal error
 processing get_range_slices
  at 
 org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
  at 
 org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:724)
  at 
 org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:704)
  at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:255)
  ... 11 more
 ---
 
 Looks like tokens used in ColumnFamilySplits
 (ColumnFamilyInputFormat.java) are on wrapping ranges (left_token >
 right_token).
 Any ideas how to fix this?
 
 --
 Regards,
 Roman
 



Re: Issues getting JNA to work correctly under centos 5.5 using cassandra 0.7.0-rc1 and JNA 2.7.3

2010-11-29 Thread Jason Pell
*               -       memlock         0


On Tue, Nov 30, 2010 at 4:40 AM, Nate McCall n...@riptano.com wrote:
 What does the current line(s) in limits.conf look like?

 On Mon, Nov 29, 2010 at 2:01 AM,  jasonmp...@gmail.com wrote:
 I checked and /etc/security/limits.conf on redhat supports zero (0) to
 mean unlimited.  Here is the sample from the man page.  Notice the
 soft core entry.

 EXAMPLES
       These are some example lines which might be specified in
       /etc/security/limits.conf.

       *               soft    core            0
       *               hard    rss             1
       @student        hard    nproc           20
       @faculty        soft    nproc           20
       @faculty        hard    nproc           50
       ftp             hard    nproc           0
       @student        -       maxlogins       4



 On Mon, Nov 29, 2010 at 6:51 AM, Jason Pell jasonmp...@gmail.com wrote:
 Ok that's a good point i will check - I am not sure.

 Sent from my iPhone
 On Nov 29, 2010, at 5:53, Tyler Hobbs ty...@riptano.com wrote:

 I'm not familiar with ulimit on RedHat systems, but are you sure you
 have ulimit set correctly? Did you set it to '0' or 'unlimited'?  I ask
 because on a Debian system, I get this:

 tho...@~ $ ulimit -l
 unlimited

 Where you said that you got back '0'.

 - Tyler

 On Sun, Nov 28, 2010 at 1:15 AM, Jason Pell ja...@pellcorp.com wrote:

 Hi,

 I have selinux disabled via /etc/sysconfig/selinux already.  But I did
 as you suggested anyway, even restarted the whole machine again too
 and still no difference.  Do you know if there is a way to discover
 exactly what this error means?

 Thanks
 Jason

 On Sat, Nov 27, 2010 at 3:59 AM, Nate McCall n...@riptano.com wrote:
  This might be an issue with selinux. You can try this quickly to
  temporarily disable selinux enforcement:
  /usr/sbin/setenforce 0  (as root)
 
  and then start cassandra as your user.
 
  On Fri, Nov 26, 2010 at 1:00 AM, Jason Pell jasonmp...@gmail.com
  wrote:
  I restarted the box :-) so it's well and truly set
 
  Sent from my iPhone
  On Nov 26, 2010, at 17:57, Brandon Williams dri...@gmail.com wrote:
 
  On Thu, Nov 25, 2010 at 10:02 PM, Jason Pell ja...@pellcorp.com
  wrote:
 
  Hi,
 
  I have set the memlock limit to unlimited in /etc/security/limits.conf
 
  [devel...@localhost apache-cassandra-0.7.0-rc1]$ ulimit -l
  0
 
   Running as a non-root user gets me an "Unknown mlockall error 1"
 
  Have you tried logging out and back in after changing limits.conf?
  -Brandon
 






Re: Issues getting JNA to work correctly under centos 5.5 using cassandra 0.7.0-rc1 and JNA 2.7.3

2010-11-29 Thread Nate McCall
OK, I was able to reproduce this with 0 as the value. Changing it to
"unlimited" will make this go away. A closer reading of the
limits.conf man page seems to leave some ambiguity when taken with the
examples:
"All items support the values -1, unlimited or infinity indicating no
limit, except for priority and nice."

I would recommend tightening this to a specific user. The line I ended
up with for the cassandra user was:

cassandra       -       memlock         unlimited

You probably want to add a line for nofile at ~16384 as well
while you're there, as that can be an issue depending on load.
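
Putting Nate's two recommendations together, the limits.conf fragment for a
dedicated user might look like the following (values are illustrative; the
nofile figure is the ballpark suggested above):

```
# /etc/security/limits.conf -- sketch for a dedicated cassandra user
cassandra   -   memlock   unlimited
cassandra   -   nofile    16384
```

After editing, log the user out and back in (or restart the service session)
so PAM re-applies the limits, then verify with `ulimit -l`.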



On Mon, Nov 29, 2010 at 1:59 PM, Jason Pell ja...@pellcorp.com wrote:
 *               -       memlock         0


 On Tue, Nov 30, 2010 at 4:40 AM, Nate McCall n...@riptano.com wrote:
 What does the current line(s) in limits.conf look like?

 On Mon, Nov 29, 2010 at 2:01 AM,  jasonmp...@gmail.com wrote:
 I checked and /etc/security/limits.conf on redhat supports zero (0) to
 mean unlimited.  Here is the sample from the man page.  Notice the
 soft core entry.

 EXAMPLES
       These are some example lines which might be specified in
       /etc/security/limits.conf.

       *               soft    core            0
       *               hard    rss             1
       @student        hard    nproc           20
       @faculty        soft    nproc           20
       @faculty        hard    nproc           50
       ftp             hard    nproc           0
       @student        -       maxlogins       4



 On Mon, Nov 29, 2010 at 6:51 AM, Jason Pell jasonmp...@gmail.com wrote:
 Ok that's a good point i will check - I am not sure.

 Sent from my iPhone
 On Nov 29, 2010, at 5:53, Tyler Hobbs ty...@riptano.com wrote:

 I'm not familiar with ulimit on RedHat systems, but are you sure you
 have ulimit set correctly? Did you set it to '0' or 'unlimited'?  I ask
 because on a Debian system, I get this:

 tho...@~ $ ulimit -l
 unlimited

 Where you said that you got back '0'.

 - Tyler

 On Sun, Nov 28, 2010 at 1:15 AM, Jason Pell ja...@pellcorp.com wrote:

 Hi,

 I have selinux disabled via /etc/sysconfig/selinux already.  But I did
 as you suggested anyway, even restarted the whole machine again too
 and still no difference.  Do you know if there is a way to discover
 exactly what this error means?

  Thanks
 Jason

 On Sat, Nov 27, 2010 at 3:59 AM, Nate McCall n...@riptano.com wrote:
  This might be an issue with selinux. You can try this quickly to
  temporarily disable selinux enforcement:
  /usr/sbin/setenforce 0  (as root)
 
  and then start cassandra as your user.
 
  On Fri, Nov 26, 2010 at 1:00 AM, Jason Pell jasonmp...@gmail.com
  wrote:
  I restarted the box :-) so it's well and truly set
 
  Sent from my iPhone
  On Nov 26, 2010, at 17:57, Brandon Williams dri...@gmail.com wrote:
 
  On Thu, Nov 25, 2010 at 10:02 PM, Jason Pell ja...@pellcorp.com
  wrote:
 
  Hi,
 
  I have set the memlock limit to unlimited in /etc/security/limits.conf
 
  [devel...@localhost apache-cassandra-0.7.0-rc1]$ ulimit -l
  0
 
   Running as a non-root user gets me an "Unknown mlockall error 1"
 
  Have you tried logging out and back in after changing limits.conf?
  -Brandon
 







Re: Issues getting JNA to work correctly under centos 5.5 using cassandra 0.7.0-rc1 and JNA 2.7.3

2010-11-29 Thread Jason Pell
Awesome, thanks, will make the changes.

So is the man page inaccurate? Or is JNA doing something wrong?

Sent from my iPhone

On Nov 30, 2010, at 7:28, Nate McCall n...@riptano.com wrote:

 Ok, I was able to reproduce this with 0 as the value. Changing it to
 unlimited will make this go away. A closer reading of the
 limits.conf man page seems to leave some ambiguity when taken with the
 examples:
 All items support the values -1, unlimited or infinity indicating no
 limit, except for priority and nice.
 
 I would recommend tightening this to a specific user. The line I ended
 up with for the cassandra user was:
 
 cassandra       -       memlock         unlimited
 
 You probably want to add a line for nofile in there at ~ 16384 as well
 while your there as that can be an issue depending on load.
 
 
 
 On Mon, Nov 29, 2010 at 1:59 PM, Jason Pell ja...@pellcorp.com wrote:
 *               -       memlock         0
 
 
 On Tue, Nov 30, 2010 at 4:40 AM, Nate McCall n...@riptano.com wrote:
 What does the current line(s) in limits.conf look like?
 
 On Mon, Nov 29, 2010 at 2:01 AM,  jasonmp...@gmail.com wrote:
 I checked and /etc/security/limits.conf on redhat supports zero (0) to
 mean unlimited.  Here is the sample from the man page.  Notice the
 soft core entry.
 
 EXAMPLES
   These are some example lines which might be specified in
   /etc/security/limits.conf.
 
   *               soft    core            0
   *               hard    rss             1
   @student        hard    nproc           20
   @faculty        soft    nproc           20
   @faculty        hard    nproc           50
   ftp             hard    nproc           0
   @student        -       maxlogins       4
 
 
 
 On Mon, Nov 29, 2010 at 6:51 AM, Jason Pell jasonmp...@gmail.com wrote:
 Ok that's a good point i will check - I am not sure.
 
 Sent from my iPhone
 On Nov 29, 2010, at 5:53, Tyler Hobbs ty...@riptano.com wrote:
 
 I'm not familiar with ulimit on RedHat systems, but are you sure you
 have ulimit set correctly? Did you set it to '0' or 'unlimited'?  I ask
 because on a Debian system, I get this:
 
 tho...@~ $ ulimit -l
 unlimited
 
 Where you said that you got back '0'.
 
 - Tyler
 
 On Sun, Nov 28, 2010 at 1:15 AM, Jason Pell ja...@pellcorp.com wrote:
 
 Hi,
 
 I have selinux disabled via /etc/sysconfig/selinux already.  But I did
 as you suggested anyway, even restarted the whole machine again too
 and still no difference.  Do you know if there is a way to discover
 exactly what this error means?
 
  Thanks
 Jason
 
 On Sat, Nov 27, 2010 at 3:59 AM, Nate McCall n...@riptano.com wrote:
 This might be an issue with selinux. You can try this quickly to
 temporarily disable selinux enforcement:
 /usr/sbin/setenforce 0  (as root)
 
 and then start cassandra as your user.
 
 On Fri, Nov 26, 2010 at 1:00 AM, Jason Pell jasonmp...@gmail.com
 wrote:
 I restarted the box :-) so it's well and truly set
 
 Sent from my iPhone
 On Nov 26, 2010, at 17:57, Brandon Williams dri...@gmail.com wrote:
 
 On Thu, Nov 25, 2010 at 10:02 PM, Jason Pell ja...@pellcorp.com
 wrote:
 
 Hi,
 
 I have set the memlock limit to unlimited in /etc/security/limits.conf
 
 [devel...@localhost apache-cassandra-0.7.0-rc1]$ ulimit -l
 0
 
  Running as a non-root user gets me an "Unknown mlockall error 1"
 
 Have you tried logging out and back in after changing limits.conf?
 -Brandon
 
 
 
 
 
 


Re: Cassandra 0.7 beta 3 outOfMemory (OOM)

2010-11-29 Thread Aaron Morton
Sounds like you need to increase the heap size and/or reduce the
memtable_throughput_in_mb and/or turn off the internal caches. Normally the
binary memtable thresholds only apply to bulk load operations, and it's the
per-CF memtable_* settings you want to change. I'm not familiar with Lucandra
though.

See the section on JVM Heap Size here:
http://wiki.apache.org/cassandra/MemtableThresholds

Bottom line is you will need more JVM heap memory.

Hope that helps.
Aaron

On 29 Nov, 2010, at 10:28 PM, cassan...@ajowa.de wrote:

Hi community,

during my tests i had several OOM crashes.
Getting some hints to find the problem would be nice.

The first Cassandra crash came after about 45 min of the insert test script.
During the following tests the time to OOM got shorter, until it started to crash
even in "idle" mode.

Here the facts:
- cassandra 0.7 beta 3
- using lucandra to index about 3 million files ~1kb data
- inserting with one client to one cassandra node with about 200 files/s
- cassandra data files for this keyspace grow up to about 20 GB
- the keyspace only contains the two lucandra specific CFs

Cluster:
- cassandra single node on Windows 32-bit, Xeon 2.5 GHz, 4 GB RAM
- java jre 1.6.0_22
- heap space first 1 GB, later increased to 1.3 GB

Cassandra.yaml:
default + reduced "binary_memtable_throughput_in_mb" to 128

CFs:
default + reduced
min_compaction_threshold: 4
max_compaction_threshold: 8


I think the problem always appears during compaction,
and perhaps it is a result of large rows (some about 170 MB).

Are there more options we could use to work with less memory?

Is it a problem of compaction?
And how to avoid?
Slower inserts? More memory?
Even lower memtable_throughput or in_memory_compaction_limit?
Continuous manual major compaction?

I've read  
http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors
- row size should be fixed since 0.7, and 200 MB is still far from 2 GB
- only the key cache is used, a little bit (3600/2)
- after a lot of writes cassandra crashes even in idle mode
- the memtable size was reduced and there are only 2 CFs

Several heapdumps in MAT show 60-99% heapusage of compaction thread.

Here some log extract:

  INFO [CompactionExecutor:1] 2010-11-26 14:18:18,593  
CompactionIterator.java (line 134) Compacting large row  
6650325572717566efbfbf44545241434b53efbfbf31 (172967291 bytes)  
incrementally
  INFO [ScheduledTasks:1] 2010-11-26 14:18:41,421 GCInspector.java  
(line 133) GC for ParNew: 365 ms, 54551328 reclaimed leaving 459496840  
used; max is 1450442752
  INFO [ScheduledTasks:1] 2010-11-26 14:18:42,437 GCInspector.java  
(line 133) GC for ParNew: 226 ms, 12469104 reclaimed leaving 554506776  
used; max is 1450442752
  INFO [ScheduledTasks:1] 2010-11-26 14:18:43,453 GCInspector.java  
(line 133) GC for ParNew: 224 ms, 12777840 reclaimed leaving 649207976  
used; max is 1450442752
  INFO [ScheduledTasks:1] 2010-11-26 14:18:44,468 GCInspector.java  
(line 133) GC for ParNew: 225 ms, 12564144 reclaimed leaving 744122872  
used; max is 1450442752
  INFO [ScheduledTasks:1] 2010-11-26 14:18:45,468 GCInspector.java  
(line 133) GC for ParNew: 222 ms, 16020328 reclaimed leaving 835581584  
used; max is 1450442752
  INFO [ScheduledTasks:1] 2010-11-26 14:18:46,468 GCInspector.java  
(line 133) GC for ParNew: 226 ms, 12697912 reclaimed leaving 930362712  
used; max is 1450442752
  INFO [ScheduledTasks:1] 2010-11-26 14:18:47,468 GCInspector.java  
(line 133) GC for ParNew: 227 ms, 15816872 reclaimed leaving  
1022026288 used; max is 1450442752
  INFO [ScheduledTasks:1] 2010-11-26 14:18:48,484 GCInspector.java  
(line 133) GC for ParNew: 258 ms, 12746584 reclaimed leaving  
1116758744 used; max is 1450442752
  INFO [ScheduledTasks:1] 2010-11-26 14:18:49,484 GCInspector.java  
(line 133) GC for ParNew: 257 ms, 12802608 reclaimed leaving  
1211435176 used; max is 1450442752
  INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java  
(line 133) GC for ConcurrentMarkSweep: 4188 ms, 271308512 reclaimed  
leaving 1047605704 used; max is 1450442752
  INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java  
(line 153) Pool Name                    Active   Pending
  INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java  
(line 160) ResponseStage                     0         0
  INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java  
(line 160) ReadStage                         0         0
  INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java  
(line 160) ReadRepair                        0         0
  INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java  
(line 160) MutationStage                     0         0
  INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java  
(line 160) GossipStage                       0         0
  INFO [ScheduledTasks:1] 2010-11-26 14:18:54,546 GCInspector.java  
(line 160) AntiEntropyStage                  0         0
  INFO [ScheduledTasks:1] 2010-11-26 14:18:54,562 GCInspector.java  
(line 160) 
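
Aaron's advice at the top of this thread maps to a handful of knobs. A hedged
sketch of where each lives in a 0.7-era setup (option names as used in this
thread, except keys_cached, which is my assumption for "turn off the internal
caches"; values are illustrative, not prescriptive):

```yaml
# cassandra.yaml (0.7-era; binary_memtable_* only affects bulk loads)
in_memory_compaction_limit_in_mb: 64   # rows above this compact incrementally

# Per-column-family settings (set on the CF definition, not cassandra.yaml):
#   memtable_throughput_in_mb: 64      # flush memtables sooner
#   keys_cached: 0                     # assumed name: disable key cache if memory-bound
```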

Introduction to Cassandra

2010-11-29 Thread Aaron Morton
I did a talk last week at the Wellington Rails User Group as a basic
introduction to Cassandra. The slides are here:
http://www.slideshare.net/aaronmorton/well-railedcassandra24112010-5901169
if anyone is interested.

Cheers
Aaron

Re: Introduction to Cassandra

2010-11-29 Thread Jonathan Ellis
That is a lot of slides. :)  Nice work!

On Mon, Nov 29, 2010 at 3:11 PM, Aaron Morton aa...@thelastpickle.com wrote:
 I did a talk last week at the Wellington Rails User Group as a basic
 introduction to Cassandra. The slides are
 here http://www.slideshare.net/aaronmorton/well-railedcassandra24112010-5901169 if
 anyone is interested.
 Cheers
 Aaron




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Issues getting JNA to work correctly under centos 5.5 using cassandra 0.7.0-rc1 and JNA 2.7.3

2010-11-29 Thread jasonmpell
Hi,

Thanks for that, your suggestions worked a treat.  I created a new
cassandra user, set the value to unlimited,
and I get the desired log:

INFO 08:49:50,204 JNA mlockall successful



On Tue, Nov 30, 2010 at 7:56 AM, Jason Pell jasonmp...@gmail.com wrote:
 Awesome thanks will make the changes

 So is the man page inaccurate? Or is jna doing something wrong?

 Sent from my iPhone

 On Nov 30, 2010, at 7:28, Nate McCall n...@riptano.com wrote:

 Ok, I was able to reproduce this with 0 as the value. Changing it to
 unlimited will make this go away. A closer reading of the
 limits.conf man page seems to leave some ambiguity when taken with the
 examples:
 All items support the values -1, unlimited or infinity indicating no
 limit, except for priority and nice.

 I would recommend tightening this to a specific user. The line I ended
 up with for the cassandra user was:

 cassandra        -       memlock       unlimited

 You probably want to add a line for nofile at ~16384 as well
 while you're there, as that can be an issue depending on load.



 On Mon, Nov 29, 2010 at 1:59 PM, Jason Pell ja...@pellcorp.com wrote:
 *               -       memlock         0


 On Tue, Nov 30, 2010 at 4:40 AM, Nate McCall n...@riptano.com wrote:
 What does the current line(s) in limits.conf look like?

 On Mon, Nov 29, 2010 at 2:01 AM,  jasonmp...@gmail.com wrote:
 I checked and /etc/security/limits.conf on redhat supports zero (0) to
 mean unlimited.  Here is the sample from the man page.  Notice the
 soft core entry.

 EXAMPLES
       These are some example lines which might be specified in
       /etc/security/limits.conf.

       *               soft    core            0
       *               hard    rss             1
       @student        hard    nproc           20
       @faculty        soft    nproc           20
       @faculty        hard    nproc           50
       ftp             hard    nproc           0
       @student        -       maxlogins       4



 On Mon, Nov 29, 2010 at 6:51 AM, Jason Pell jasonmp...@gmail.com wrote:
 Ok that's a good point i will check - I am not sure.

 Sent from my iPhone
 On Nov 29, 2010, at 5:53, Tyler Hobbs ty...@riptano.com wrote:

 I'm not familiar with ulimit on RedHat systems, but are you sure you
 have ulimit set correctly? Did you set it to '0' or 'unlimited'?  I ask
 because on a Debian system, I get this:

 tho...@~ $ ulimit -l
 unlimited

 Where you said that you got back '0'.

 - Tyler

 On Sun, Nov 28, 2010 at 1:15 AM, Jason Pell ja...@pellcorp.com wrote:

 Hi,

 I have selinux disabled via /etc/sysconfig/selinux already.  But I did
 as you suggested anyway, even restarted the whole machine again too
 and still no difference.  Do you know if there is a way to discover
 exactly what this error means?

 Thanks
 Jason

 On Sat, Nov 27, 2010 at 3:59 AM, Nate McCall n...@riptano.com wrote:
 This might be an issue with selinux. You can try this quickly to
 temporarily disable selinux enforcement:
 /usr/sbin/setenforce 0  (as root)

 and then start cassandra as your user.

 On Fri, Nov 26, 2010 at 1:00 AM, Jason Pell jasonmp...@gmail.com
 wrote:
 I restarted the box :-) so it's well and truly set

 Sent from my iPhone
 On Nov 26, 2010, at 17:57, Brandon Williams dri...@gmail.com wrote:

 On Thu, Nov 25, 2010 at 10:02 PM, Jason Pell ja...@pellcorp.com
 wrote:

 Hi,

 I have set the memlock limit to unlimited in 
 /etc/security/limits.conf

 [devel...@localhost apache-cassandra-0.7.0-rc1]$ ulimit -l
 0

  Running as a non-root user gets me an "Unknown mlockall error 1"

 Have you tried logging out and back in after changing limits.conf?
 -Brandon









batch_mutate vs number of write operations on CF

2010-11-29 Thread Narendra Sharma
Hi,

I am using Cassandra 0.7 beta3 and Hector.

I create a mutation map. The mutation involves adding a few columns for a
given row. After that I use batch_mutate API to send the changes to
Cassandra.

Question:
If there are multiple column writes on same row in a mutation_map, does
Cassandra show (on JMX write count stats for CF) that as 1 write operation
or as N write operations, where N is the number of entries in the mutation map
for that row?
Assume all the changes in mutation map are for one row.

Thanks,
Naren


Re: Solr DataImportHandler (DIH) and Cassandra

2010-11-29 Thread Mark
The DataSource subclass route is what I will probably be interested in. 
Are there any working examples of this already out there?


On 11/29/10 12:32 PM, Aaron Morton wrote:

AFAIK there is nothing pre-written to pull the data out for you.

You should be able to create your own DataSource subclass 
(http://lucene.apache.org/solr/api/org/apache/solr/handler/dataimport/DataSource.html), 
using the Hector Java library to pull data from Cassandra.


I'm guessing you will need to consider how to perform delta imports. 
Perhaps using the secondary indexes in 0.7*, or maintaining your own 
queues or indexes to know what has changed.


There is also the Lucandra project, not exactly what you're after, but it 
may be of interest anyway: https://github.com/tjake/Lucandra


Hope that helps.
Aaron


On 30 Nov, 2010,at 05:04 AM, Mark static.void@gmail.com wrote:


Is there any way to use DIH to import from Cassandra? Thanks


Re: batch_mutate vs number of write operations on CF

2010-11-29 Thread Tyler Hobbs
Using batch_mutate on a single row will count as 1 write operation, even if
you mutate multiple columns. Using batch_mutate on N rows will count as N
write operations.
- Tyler
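
Tyler's counting rule can be expressed as a toy model (illustrative Python,
not Cassandra's implementation): the JMX write count increments once per
mutated row, regardless of how many columns each row's mutation carries.

```python
# Toy model of the counting rule: one write op per row in the mutation map,
# no matter how many columns are mutated within that row.
def count_write_ops(mutation_map):
    """mutation_map: {row_key: {column_family: [column_names]}}"""
    return len(mutation_map)
```

So a single batch_mutate touching one row with ten columns registers one
write, while the same ten columns spread over ten rows register ten.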

On Mon, Nov 29, 2010 at 5:58 PM, Narendra Sharma
narendra.sha...@gmail.comwrote:

 Hi,

 I am using Cassandra 0.7 beta3 and Hector.

 I create a mutation map. The mutation involves adding few columns for a
 given row. After that I use batch_mutate API to send the changes to
 Cassandra.

 Question:
 If there are multiple column writes on same row in a mutation_map, does
 Cassandra show (on JMX write count stats for CF) that as 1 write operation
 or as N write operations where N is the number of entries in mutation map
 for that row.
 Assume all the changes in mutation map are for one row.

 Thanks,
 Naren



Re: word_count example fails in multi-node configuration

2010-11-29 Thread RS
It occurs in 0.7 beta 3 and 0.7.0 rc 1.

Thank you, Jeremy. I will follow the ticket.

-Roman


On Tue, Nov 30, 2010 at 2:50 AM, Jeremy Hanna
jeremy.hanna1...@gmail.com wrote:
 Roman:

 I commented on the ticket - would you mind answering on there?  
 https://issues.apache.org/jira/browse/CASSANDRA-1787

 Tx,

 Jeremy

 On Nov 29, 2010, at 3:14 AM, RS wrote:

 Hi guys,

 I am trying to run word_count example from contrib directory (0.7 beta
 3 and 0.7.0 rc 1).
 It works fine in a single-node configuration, but fails with 2+ nodes.

 It fails in the assert statement, which caused problems before
 (https://issues.apache.org/jira/browse/CASSANDRA-1700).

 Here's a simple ring I have and error messages.
 ---
 Address         Status State   Load            Owns    Token

 143797990709940316224804537595633718982
 127.0.0.2       Up     Normal  40.2 KB         51.38%
 61078635599166706937511052402724559481
 127.0.0.1       Up     Normal  36.01 KB        48.62%
 143797990709940316224804537595633718982
 ---
 [SERVER SIDE]

 ERROR 17:39:57,098 Fatal exception in thread Thread[ReadStage:4,5,main]
 java.lang.AssertionError:
 (143797990709940316224804537595633718982,61078635599166706937511052402724559481]
       at 
 org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1273)
       at 
 org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:48)
       at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
       at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
       at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
       at java.lang.Thread.run(Thread.java:619)
 ---
 [CLIENT_SIDE]
 java.lang.RuntimeException: org.apache.thrift.TApplicationException:
 Internal error processing get_range_slices
       at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:277)
       at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:292)
       at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:189)
       at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
       at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
       at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:148)
       at 
 org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
       at 
 org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
       at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
 Caused by: org.apache.thrift.TApplicationException: Internal error
 processing get_range_slices
       at 
 org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
       at 
 org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:724)
       at 
 org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:704)
       at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:255)
       ... 11 more
 ---

 Looks like the tokens used in ColumnFamilySplits
 (ColumnFamilyInputFormat.java) are on wrapping ranges (left_token >
 right_token).
 Any ideas how to fix this?

 --
 Regards,
 Roman
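A minimal sketch of the kind of fix Roman is hinting at: a range (left, right] "wraps" when left >= right, and one way to handle it is to split the range at the partitioner's minimum token into two non-wrapping pieces. The token values and the minimum-token convention below are illustrative stand-ins, not Cassandra's actual types.

```python
# Sketch of the wrapping-token-range issue: a range (left, right] wraps
# when left >= right; splitting it at the partitioner's minimum token
# yields two non-wrapping ranges that can be scanned separately.

MIN_TOKEN = 0  # stand-in for the partitioner's minimum token

def is_wrapping(left, right):
    return left >= right

def unwrap(left, right):
    """Split a wrapping range (left, right] into non-wrapping pieces."""
    if not is_wrapping(left, right):
        return [(left, right)]
    return [(left, MIN_TOKEN), (MIN_TOKEN, right)]

# The failing range from the assertion error, abbreviated:
left, right = 1437979907, 610786355
print(unwrap(left, right))  # [(1437979907, 0), (0, 610786355)]
```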




Re: Re: word_count example fails in multi-node configuration

2010-11-29 Thread Bingbing Liu
try the OrderPreservingPartitioner


2010-11-30 



Bingbing Liu 



From: RS 
Date: 2010-11-30 09:14:38 
To: user 
Cc: 
Subject: Re: word_count example fails in multi-node configuration 
 




Cassandra 0.7 - documentation on Secondary Indexes

2010-11-29 Thread Narendra Sharma
Is there any documentation available on what is possible with secondary
indexes? For example:
- Is it possible to define a secondary index on columns within a SuperColumn?
- If I define a secondary index at run time, does Cassandra index all the
existing data, or is only new data indexed?

Some documentation along with examples will be highly useful.

Thanks,
Naren


Re: Cassandra 0.7 - documentation on Secondary Indexes

2010-11-29 Thread Jonathan Ellis
On Mon, Nov 29, 2010 at 7:59 PM, Narendra Sharma
narendra.sha...@gmail.com wrote:
 Is there any documentation available on what is possible with secondary
 indexes?

Not yet.

 - Is it possible to define secondary index on columns within a SuperColumn?

No.

 - If I define a secondary index at run time, does Cassandra index all the
 existing data or only new data is indexed?

The former.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Cassandra 0.7 - documentation on Secondary Indexes

2010-11-29 Thread Narendra Sharma
Thanks Jonathan.

A couple more questions:
1. Is there any technical limit on the number of secondary indexes that can
be created?

2. Is it possible to execute join queries spanning multiple secondary
indexes?

Thanks,
Naren

On Mon, Nov 29, 2010 at 6:02 PM, Jonathan Ellis jbel...@gmail.com wrote:

 On Mon, Nov 29, 2010 at 7:59 PM, Narendra Sharma
 narendra.sha...@gmail.com wrote:
  Is there any documentation available on what is possible with secondary
  indexes?

 Not yet.

  - Is it possible to define secondary index on columns within a
 SuperColumn?

 No.

  - If I define a secondary index at run time, does Cassandra index all the
  existing data or only new data is indexed?

 The former.

 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of Riptano, the source for professional Cassandra support
 http://riptano.com



Re: Cassandra 0.7 - documentation on Secondary Indexes

2010-11-29 Thread Jonathan Ellis
On Mon, Nov 29, 2010 at 11:26 PM, Narendra Sharma
narendra.sha...@gmail.com wrote:
 Thanks Jonathan.

 Couple of more questions:
 1. Is there any technical limit on the number of secondary indexes that can
 be created?

Just as with traditional databases, the more indexes there are the
slower writes to that CF will be.

 2. Is it possible to execute join queries spanning multiple secondary
 indexes?

What do secondary indexes have to do with joins?

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Cassandra 0.7 - documentation on Secondary Indexes

2010-11-29 Thread Narendra Sharma
On Mon, Nov 29, 2010 at 9:32 PM, Jonathan Ellis jbel...@gmail.com wrote:

 On Mon, Nov 29, 2010 at 11:26 PM, Narendra Sharma
 narendra.sha...@gmail.com wrote:
  Thanks Jonathan.
 
  Couple of more questions:
  1. Is there any technical limit on the number of secondary indexes that
 can
  be created?

 Just as with traditional databases, the more indexes there are the
 slower writes to that CF will be.

  2. Is it possible to execute join queries spanning multiple secondary
  indexes?

 What do secondary indexes have to do with joins?


For example, if I want to get all employees that are male and have age = 35
years, how can secondary indexes be useful in such a scenario?


 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of Riptano, the source for professional Cassandra support
 http://riptano.com



Re: Cassandra 0.7 - documentation on Secondary Indexes

2010-11-29 Thread Tyler Hobbs
The 'employees with age = 35' scenario is exactly what they are useful for.

There's a quick section in the pycassa documentation that might be useful:

http://pycassa.github.com/pycassa/tutorial.html#indexes
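On the "join" question above: expressions within a single index clause are ANDed together, which covers the male-and-age case without any join across column families. Below is a plain-Python model of that semantics only, not pycassa's API; the employee rows and operator set are invented for illustration.

```python
# Pure-Python model of a secondary-index query clause: a row matches the
# clause only if ALL of its (column, op, value) expressions hold, i.e. the
# expressions are ANDed -- no join is involved.

from operator import eq, ge

def get_indexed_slices(rows, expressions):
    """Return keys of rows matching ALL (column, op, value) expressions."""
    return [key for key, cols in rows.items()
            if all(op(cols.get(col), val) for col, op, val in expressions)]

employees = {
    "emp1": {"gender": "male",   "age": 40},
    "emp2": {"gender": "female", "age": 50},
    "emp3": {"gender": "male",   "age": 30},
}

# "male AND age >= 35" -- one clause, two expressions
matches = get_indexed_slices(employees,
                             [("gender", eq, "male"), ("age", ge, 35)])
print(sorted(matches))  # ['emp1']
```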





Re: Achieving isolation on single row modifications with batch_mutate

2010-11-29 Thread Tyler Hobbs
In this case, it sounds like you should combine columns A and B if you
are writing them both at the same time, reading them both at the same
time, and need them to be consistent.

Obviously, you're probably dealing with more than two columns here, but
there's generally not any value in splitting something into multiple columns
if you're always writing and reading all of them at the same time.

Or are you talking about chunking huge blobs across a row?

- Tyler
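Tyler's combine-the-columns suggestion can be sketched like this: pack the fields that must stay mutually consistent into a single column value, so a reader can never observe one half without the other. JSON is just one possible encoding, and the field names are illustrative.

```python
# Sketch: serialize two logical fields into one column value so that a
# single-column write/read is atomic from the reader's point of view.

import json

def pack(a, b):
    """Serialize the two logical fields into one column value."""
    return json.dumps({"a": a, "b": b})

def unpack(value):
    """Recover both fields from the stored column value."""
    d = json.loads(value)
    return d["a"], d["b"]

# One column write and one column read -> no half-visible state
stored = pack("new-a", "new-b")
print(unpack(stored))  # ('new-a', 'new-b')
```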

On Sat, Nov 27, 2010 at 10:12 AM, E S tr1skl...@yahoo.com wrote:

 I'm trying to figure out the best way to achieve single row modification
 isolation for readers.

 As an example, I have 2 rows (1,2) with 2 columns (a,b). If I modify both
 rows, I don't care if the user sees the write operations completed on 1 and
 not on 2 for a short time period (seconds). I also don't care if, when
 reading row 1, the user gets the new value and then on a re-read gets the
 old value (within a few seconds). Because of this, I have been planning on
 using a consistency level of one.

 However, if I modify both columns A,B on a single row, I need both changes
 on the row to be visible/invisible atomically. It doesn't matter if they
 both become visible and then both invisible as the data propagates across
 nodes, but a half-completed state on an initial read will basically be
 returning corrupt data given my app's consistency requirements. My
 understanding from the FAQ is that this single-row multicolumn change
 provides no read isolation, so I will have this problem. Is this correct?
 If so:

 Question 1:  Is there a way to get this type of isolation without using a
 distributed locking mechanism like cages?

 Question 2:  Are there any plans to implement this type of isolation within
 Cassandra?

 Question 3:  If I went with a distributed locking mechanism, what
 consistency
 level would I need to use with Cassandra?  Could I still get away with a
 consistency level of one?  It seems that if the initial write is done in a
 non-isolated way, but if cross-node row synchronizations are done all or
 nothing, I could still use one.

 Question 4:  Does anyone know of a good c# alternative to cages/zookeeper?

 Thanks for any help with this!







Re: get_count - cassandra 0.7.x predicate limit bug?

2010-11-29 Thread Tyler Hobbs
What error are you getting?

Remember, get_count() is still just about as much work for Cassandra as
getting the whole row; the only advantage is that it doesn't have to send
the whole row back to the client.

If you're counting 3+ million columns frequently, it's time to take a look
at counters.

- Tyler
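A common workaround for counting very wide rows is to page through the columns in fixed-size slices instead of issuing one huge count. Below is a pure-Python model of the paging logic only: a sorted list stands in for a row's columns (which Cassandra keeps sorted), whereas a real client would issue repeated slice queries with a start column.

```python
# Sketch of chunked counting: read `page_size` column names at a time,
# starting each page just after the last column seen, and sum the pages.

import bisect

def count_in_pages(columns, page_size=1000):
    """Count entries of a sorted column-name list in fixed-size pages."""
    total, start = 0, ""
    while True:
        # emulate a slice: first `page_size` column names > start
        i = bisect.bisect_right(columns, start)
        page = columns[i:i + page_size]
        if not page:
            return total
        total += len(page)
        start = page[-1]  # next page starts after the last column seen

cols = sorted("col%05d" % n for n in range(2500))
print(count_in_pages(cols, page_size=1000))  # 2500
```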

On Fri, Nov 26, 2010 at 10:33 AM, Marcin mar...@33concept.com wrote:

 Hi guys,

 I have a key with 3 million+ columns, but when I try to run get_count on
 it, I get an error if I set the limit higher than about 46000. Any ideas?

 In the previous API there was no predicate at all, so it simply counted the
 number of columns; now it's not so simple any more.

 Please let me know if that is a bug or if I am doing something wrong.


 cheers,
 /Marcin



Re: Updating Cascal

2010-11-29 Thread Tyler Hobbs
Are you sure you're using the same key for batch_mutate() and get_slice()?
They appear different in the logs.

- Tyler
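As a quick sanity check on the logs above: the get_slice key '5374616e64617264' is hex for the ASCII string "Standard", which suggests the column family name ended up where the row key should be. That decoding can be verified directly:

```python
# Decode the hex key from the get_slice debug log entry: each byte pair is
# an ASCII character (53='S', 74='t', 61='a', ...).

key_hex = "5374616e64617264"
print(bytes.fromhex(key_hex).decode("ascii"))  # Standard
```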

On Thu, Nov 25, 2010 at 10:14 AM, Michael Fortin mi...@m410.us wrote:

 Hello,
 Hello,
 I forked Cascal (a Scala-based client for Cassandra) and I'm attempting to
 update it to Cassandra 0.7. I have it partially working, but I'm getting
 stuck in a few areas.

 I have most of the unit tests working from the original code, but I'm
 having an issue with batch_mutate(keyToFamilyMutations, consistency). Does
 the log output mean anything?  I can't figure out why the columns are not
 getting inserted.  If I change the code from a batch_mutate to an
 insert(family, parent, column, consistency), it works.

 ### keyToFamilyMutations: {java.nio.HeapByteBuffer[pos=0 lim=16
 cap=16]={Standard=[Mutation(column_or_supercolumn:ColumnOrSuperColumn(column:Column(name:43
 6F 6C 75 6D 6E 2D 61 2D 31, value:56 61 6C 75 65 2D 31,
 timestamp:1290662894466035))),
 Mutation(column_or_supercolumn:ColumnOrSuperColumn(column:Column(name:43 6F
 6C 75 6D 6E 2D 61 2D 33, value:56 61 6C 75 65 2D 33,
 timestamp:1290662894467942))),
 Mutation(column_or_supercolumn:ColumnOrSuperColumn(column:Column(name:43 6F
 6C 75 6D 6E 2D 61 2D 32, value:56 61 6C 75 65 2D 32,
 timestamp:1290662894467915)))]}}
 DEBUG 2010-11-25 00:28:14,534 [org.apache.cassandra.thrift.CassandraServer
 pool-1-thread-2] batch_mutate
 DEBUG 2010-11-25 00:28:14,583 [org.apache.cassandra.service.StorageProxy
 pool-1-thread-2] insert writing local RowMutation(keyspace='Test',
 key='ccfd5520f85411df858a001c4209', modifications=[Standard])

 DEBUG 2010-11-25 00:28:14,599 [org.apache.cassandra.thrift.CassandraServer
 pool-1-thread-2] get_slice
 DEBUG 2010-11-25 00:28:14,605 [org.apache.cassandra.service.StorageProxy
 pool-1-thread-2] weakread reading SliceFromReadCommand(table='Test',
 key='5374616e64617264',
 column_parent='QueryPath(columnFamilyName='Standard',
 superColumnName='null', columnName='null')', start='', finish='',
 reversed=false, count=2147483647) locally
 DEBUG 2010-11-25 00:28:14,608 [org.apache.cassandra.service.StorageProxy
 ReadStage:2] weakreadlocal reading SliceFromReadCommand(table='Test',
 key='5374616e64617264',
 column_parent='QueryPath(columnFamilyName='Standard',
 superColumnName='null', columnName='null')', start='', finish='',
 reversed=false, count=2147483647)
 ### get_slice: []


 The code looks like:
  println("keyToFamilyMutations: %s".format(keyToFamilyMutations))
  client.batch_mutate(keyToFamilyMutations, consistency)
  …
  client.client.get_slice(…)

 keyspaces:
- name: Test
  replica_placement_strategy:
 org.apache.cassandra.locator.SimpleStrategy
  replication_factor: 1
  column_families:
- {name: Standard, compare_with: BytesType}



 Thanks,
 Mike


partial matching of keys

2010-11-29 Thread Arijit Mukherjee
Hi All

I was wondering if it is possible to match keys partially while
searching in Cassandra.

I have a requirement where I'm storing a large number of records, the
key being something like A|B|T, where A and B are mobile numbers and
T is the time-stamp (the time when A called B). Such a format ensures the
uniqueness of the keys. Now if I want to search for all records where
A called B, I would like to do a partial match with A|B. Is this
possible?

I've another small question - where can I find some complete examples
of creating a cluster and communicating with it (for
insertion/deletion of records) using Hector or Pelops? So far, I've
been doing this via the Thrift interface, but it's becoming illegible
now...

Thanks in advance...

Regards
Arijit

-- 
And when the night is cloudy,
There is still a light that shines on me,
Shine on until tomorrow, let it be.


Re: Introduction to Cassandra

2010-11-29 Thread Jim Morrison
Really great introduction, thanks Aaron. Bookmarked for the team. 

J. 

Sent from my iPhone

On 29 Nov 2010, at 21:11, Aaron Morton aa...@thelastpickle.com wrote:

 I did a talk last week at the Wellington Rails User Group as a basic 
 introduction to Cassandra. The slides are here 
 http://www.slideshare.net/aaronmorton/well-railedcassandra24112010-5901169 if 
 anyone is interested. 
 
 Cheers
 Aaron
 


Re: partial matching of keys

2010-11-29 Thread Tyler Hobbs
Yes, you can basically do this two ways:

First, you can use an OrderPreservingPartitioner.  This stores your keys in
order, so you can grab the range of keys that begin with 'A|B'.  Because of
the drawbacks of OPP (unbalanced ring, hotspots), you almost certainly don't
want to do this.

Second, you take advantage of column name sorting. For example, you can
have a row for all of the calls that A has made; each column name can be
something like 'B|T'. This allows you to quickly get all of the times when
A called B in chronological order. (You can have a second row or column
family and swap B and T's positions if you're more interested in time
slices.) This is very much like the Twitter clone, Twissandra:

https://github.com/ericflo/twissandra
http://twissandra.com/
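The second approach can be modeled with a sorted list standing in for one row's column names: the prefix slice below mimics a column slice from 'B|' to just past 'B|'. The numbers, timestamps, and separator are illustrative.

```python
# Model of prefix slicing over sorted column names: with one row per
# caller A and column names "callee|timestamp", all calls from A to B are
# a contiguous run of column names beginning with "B|".

import bisect

def slice_by_prefix(sorted_columns, prefix):
    """Return the contiguous run of column names starting with `prefix`."""
    lo = bisect.bisect_left(sorted_columns, prefix)
    hi = bisect.bisect_left(sorted_columns, prefix + "\xff")  # just past prefix
    return sorted_columns[lo:hi]

# Row for caller A: columns sorted by the comparator
row_a = sorted([
    "B|2010-11-28T10:00", "B|2010-11-29T09:30",
    "C|2010-11-29T11:15", "D|2010-11-27T08:00",
])

print(slice_by_prefix(row_a, "B|"))
# ['B|2010-11-28T10:00', 'B|2010-11-29T09:30']
```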

As for examples, there are Hector examples here:

https://github.com/zznate/hector-examples

- Tyler
