Re: Basic question on a write operation immediately followed by a read

2011-01-25 Thread Roshan Dawrani
2011/1/25 Wangpei (Peter) peter.wang...@huawei.com

  for your 1-node cluster, ANY is the only consistency level at which the client
 may return BEFORE the node writes to the memtable.

 And a read op on the node reads both the memtable and the SSTables.



 It really puzzles me. :(


Please don't be puzzled just yet. :-)

As I said from the beginning, I wasn't confirming yet that reads were in
fact missing the writes. I have just observed that kind of behavior at my
app level and I wanted to understand whether it could be
happening on the Cassandra side.

If reads were sure to read what was written (with QUORUM level, let's say),
then I can look at other causes inside the app.
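For illustration only, a minimal Hector-style sketch of this kind of write-then-immediately-read check. The cluster, host, keyspace and CF names are placeholders, and it assumes the client's default consistency policy (QUORUM for both reads and writes unless you have changed it):

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;
import me.prettyprint.hector.api.query.ColumnQuery;
import me.prettyprint.hector.api.query.QueryResult;

public class ReadAfterWriteCheck {
    public static void main(String[] args) {
        Cluster cluster = HFactory.getOrCreateCluster("TestCluster", "localhost:9160");
        Keyspace ks = HFactory.createKeyspace("Keyspace1", cluster);

        // write one column
        Mutator<String> mutator = HFactory.createMutator(ks, StringSerializer.get());
        mutator.insert("row1", "Standard1", HFactory.createStringColumn("col", "value"));

        // immediately read the same column back
        ColumnQuery<String, String, String> q = HFactory.createStringColumnQuery(ks);
        q.setColumnFamily("Standard1").setKey("row1").setName("col");
        QueryResult<HColumn<String, String>> r = q.execute();

        System.out.println("read back: " + (r.get() == null ? "MISSING" : r.get().getValue()));
        cluster.getConnectionManager().shutdown();
    }
}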


Re: Schema Question

2011-01-25 Thread Andy Burgess


  
  
Aaron,

A question about one of your general points, "do not create CF's on
the fly" - what, exactly, does this mean? Do you mean named column
families, like "BlogEntries" from Sam's example, or do you mean
column family keys, like "i-got-a-new-guitar"? If it's the latter,
then could you please explain why not to do this? My application is
based around creating row keys on the fly, so I'd like to know ahead
of time if I'm creating potential trouble for myself.

To be honest, if you do mean specifically column families and not
column family keys, then I don't even understand how you would go
about creating those on-the-fly anyway. Don't they have to be
pre-configured in storage-conf.xml?

Thanks,
Andy.

On 25/01/11 00:39, Aaron Morton wrote:

  Sam,
  The best advice is to jump in and try any schema. If you are
just starting out, start simple; you're going to re-write it
several times. Worry about scale later, in most cases it's going
to work.
  
  
  Some general points:
  
  
  - do not create CF's on the fly.
  - work out your common read requests and denormalise to
support these, the writes will be fast enough.
  - try to get each read request to be resolved by reading from
a single CF (not a rule, just a guideline)
  - avoid big super columns.
  - this may also be interesting: http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example/
  
  
  

  
  If you are happy with the one in the article start with that
and see how it works with your app. See how it works for your
read activities.
  
  
  Hope that helps.
  Aaron
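  To make the denormalisation point concrete, here is a hypothetical Hector-style sketch (the TagIndex CF, the column layout and all names are illustrative assumptions, not part of Sam's schema or the advice above): write the entry once, and at the same time write its slug into one row per tag, so that "all posts for a tag" becomes a single-row read from a single CF.

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;

public class DenormalisedBlogWrite {
    public static void main(String[] args) {
        Cluster cluster = HFactory.getOrCreateCluster("TestCluster", "localhost:9160");
        Keyspace ks = HFactory.createKeyspace("Keyspace1", cluster);
        StringSerializer se = StringSerializer.get();

        String slug = "i-got-a-new-guitar";
        String[] tags = { "life", "guitar", "music" };

        Mutator<String> m = HFactory.createMutator(ks, se);
        // the entry itself, one row keyed by slug
        m.addInsertion(slug, "BlogEntries", HFactory.createStringColumn("title", "..."))
         .addInsertion(slug, "BlogEntries", HFactory.createStringColumn("pubDate", "1250558004"));
        // denormalised index: one row per tag, one column per post slug
        for (String tag : tags) {
            m.addInsertion(tag, "TagIndex", HFactory.createStringColumn(slug, ""));
        }
        m.execute();   // writes are cheap; "posts tagged guitar" now reads one row from one CF

        cluster.getConnectionManager().shutdown();
    }
}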
  
  
  
On 25 Jan, 2011,at 12:47 PM, Sam Hodgson
hodgson_...@hotmail.com wrote:

  
  

  
Hi all,


I'm brand new to Cassandra - I'm migrating from MySQL for a
large forum site and would be grateful if anyone can give me
some basic pointers on schema design, or any recommended
documentation.

The example used in
http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model
is very close if not exactly what I need for my main CF:

<!--
ColumnFamily: BlogEntries
This is where all the blog entries will go:

Row Key + post's slug (the seo friendly portion of the uri)
Column Name: an attribute for the entry (title, body, etc)
Column Value: value of the associated attribute

Access: grab an entry by slug (always fetch all Columns for Row)

fyi: tags is a denormalization... its a comma separated list of tags.
im not using json in order to not interfere with our
notation but obviously you could use anything as long as your app
knows how to deal w/ it

BlogEntries : { // CF
i-got-a-new-guitar : { // row key - the unique "slug" of the entry.
title: This is a blog entry about my new, awesome guitar,
body: this is a cool entry. etc etc yada yada
author: Arin Sarkissian  // a row key into the Authors CF
tags: life,guitar,music  // comma sep list of tags (basic denormalization)
pubDate: 1250558004  // unixtime for publish date
slug: i-got-a-new-guitar
},
// all other entries
another-cool-guitar : {
...
tags: guitar,
slug: another-cool-guitar
},
scream-is-the-best-movie-ever : {
..
tags: movie,horror,
slug: scream-is-the-best-movie-ever
}
}
-->
<ColumnFamily CompareWith="BytesType" Name="BlogEntries"/>

How well would this scale? Say you are storing 5 million posts and looking to scale that up -
would it be better to segment them into several column families, and if so to what extent?

I could create column families to store posts for each category, however I'd end up with thousands of CF's.
That said, the data would then be stored in a very sorted manner for querying/presenting.

My db is very write heavy and growing fast, Cassandra sounds like the best solution.
Any advice is greatly appreciated!! 

Thanks

Sam





  


-- 
Andy Burgess
Principal Development Engineer
Application Delivery
WorldPay Ltd.
270-289 Science Park, Milton Road
Cambridge, CB4 0WE, United Kingdom (Depot Code: 024)
Office: +44 (0)1223 706 779| Mobile: +44 (0)7909 534 940
andy.burg...@worldpay.com

  


Re: Upgrading from 0.6 to 0.7.0

2011-01-25 Thread Daniel Josefsson
Yes, it should be possible to try.

We have not yet quite decided which way to go; I think operations won't be
happy with upgrading both server and client at the same time.

Either we upgrade to 0.7.0 (currently does not look very likely), or we go
to 0.6.9 and patch with TTL. I'm not too sure what a possible future upgrade
would look like if we use the TTL patch, though.

/Daniel

2011/1/21 Aaron Morton aa...@thelastpickle.com

 Yup, you can use diff ports and you can give them different cluster names
 and different seed lists.

 After you upgrade the second cluster partition, the data should repair
 across, either via read repair (RR) or the hinted handoffs (HHs) that were stored while the first partition
 was down. The easiest thing would be to run nodetool repair, then a cleanup to
 remove any leftover data.

 AFAIK the file formats are compatible. But drain the nodes before upgrading to
 clear the commit log.

 Can you test this on a non production system?

 Aaron
 (we really need to write some upgrade docs:))

 On 21/01/2011, at 10:42 PM, Dave Gardner dave.gard...@imagini.net wrote:

 What about executing writes against both clusters during the changeover?
 Interested in this topic because we're currently thinking about the same
 thing - how to upgrade to 0.7 without any interruption.

 Dave

 On 21 January 2011 09:20, Daniel Josefsson  jid...@gmail.com
 jid...@gmail.com wrote:

 No, what I'm thinking of is having two clusters (0.6 and 0.7) running on
 different ports so they can't find each other. Or isn't that configurable?

 Then, when I have the two clusters, I could upgrade all of the clients to
 run against the new cluster, and finally upgrade the rest of the Cassandra
 nodes.

 I don't know how the new cluster would cope with having new data in the
 old cluster when they are upgraded though.

 /Daniel

 2011/1/20 Aaron Morton  aa...@thelastpickle.comaa...@thelastpickle.com
 

 I'm not sure if you're suggesting running a mixed-mode cluster there, but
 AFAIK the changes to the internode protocol prohibit this. The nodes will
 probably see each other via gossip, but the way the messages define their
 purpose (their verb handler) has been changed.

 Out of interest which is more painful, stopping the cluster and upgrading
 it or upgrading your client code?

 Aaron

 On 21/01/2011, at 12:35 AM, Daniel Josefsson  jid...@gmail.com
 jid...@gmail.com wrote:

 In our case our replication factor is more than half the number of nodes
 in the cluster.

 Would it be possible to do the following:

- Upgrade half of them
- Change Thrift Port and inter-server port (is this the
storage_port?)
- Start them up
- Upgrade clients one by one
 - Upgrade the rest of the servers

 Or might we get some kind of data collision when still writing to the old
 cluster as the new storage is being used?

 /Daniel






Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse

2011-01-25 Thread Mick Semb Wever
On Tue, 2011-01-25 at 09:37 +0100, Patrik Modesto wrote:
 While developing a really simple MR task, I've found that a
 combination of Hadoop optimization and the Cassandra
 ColumnFamilyRecordWriter queue creates wrong keys to send to
 batch_mutate().

I've seen similar behaviour (junk rows being written), although my keys
are always a result from
  LongSerializer.get().toByteBuffer(key)


i'm interested in looking into it - but can you provide a code example? 

  From what i can see TextOutputFormat.LineRecordWriter.write(..)
doesn't clone anything, but it does write it out immediately.
  While ColumnFamilyRecordWriter does batch the mutations up as you say,
it takes a ByteBuffer as a key - why/how are you re-using this
client-side (aren't you creating a new ByteBuffer each call to
write(..))?

~mck

-- 
Never let your sense of morals get in the way of doing what's right.
Isaac Asimov 
| http://semb.wever.org | http://sesat.no
| http://finn.no   | Java XSS Filter



signature.asc
Description: This is a digitally signed message part
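
(Aside for readers of the archive: a hypothetical, self-contained sketch - not from either message - of how a ByteBuffer wrapped around Hadoop's reused Text key can end up pointing at different bytes by the time a batching writer such as ColumnFamilyRecordWriter actually sends the mutation. All names here are made up.)

import java.nio.ByteBuffer;
import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.hadoop.io.Text;

public class KeyReuseSketch {
    public static void main(String[] args) throws Exception {
        Text key = new Text();                 // Hadoop reuses this instance across calls
        key.set("row-1");

        // Wrapping does NOT copy: the ByteBuffer still points at key's backing array.
        ByteBuffer wrapped = ByteBuffer.wrap(key.getBytes(), 0, key.getLength());

        // A batching writer would only queue the mutation here and send it later...
        key.set("row-2");                      // ...but the framework reuses the same byte[]

        System.out.println(ByteBufferUtil.string(wrapped));   // prints "row-2", not "row-1"

        // Cloning takes a defensive copy, so the queued key stays intact.
        key.set("row-1");
        ByteBuffer cloned = ByteBufferUtil.clone(
                ByteBuffer.wrap(key.getBytes(), 0, key.getLength()));
        key.set("row-2");
        System.out.println(ByteBufferUtil.string(cloned));    // still "row-1"
    }
}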


Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse

2011-01-25 Thread Patrik Modesto
Hi Mick,

attached is the very simple MR job that deletes expired URLs from my
test Cassandra DB. The keyspace looks like this:

Keyspace: Test:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
Replication Factor: 2
  Column Families:
ColumnFamily: Url2
  Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
  Row cache size / save period: 0.0/0
  Key cache size / save period: 20.0/3600
  Memtable thresholds: 4.7015625/1003/60
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  Built indexes: []

In the CF the key is a URL and inside there is some data. My MR job
needs just expire_date, which is an int64 timestamp. For now I store it
as a string because I use Python and C++ to manipulate the data as
well.

For the MR Job to run you need a patch I did. You can find it here:
https://issues.apache.org/jira/browse/CASSANDRA-2014

The attached file contains the working version with a cloned key in the
reduce() method. My other approach was:
[code]
context.write(ByteBuffer.wrap(key.getBytes(), 0, key.getLength()),
Collections.singletonList(getMutation(key)));
[/code]
Which produces junk keys.

Best regards,
Patrik

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;
import java.util.*;

import org.apache.cassandra.avro.Mutation;
import org.apache.cassandra.avro.Deletion;
import org.apache.cassandra.avro.SliceRange;
import org.apache.cassandra.hadoop.ColumnFamilyOutputFormat;

import org.apache.cassandra.db.IColumn;
import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class ContextExpirator extends Configured implements Tool
{
    static final String KEYSPACE = "Test";
    static final String COLUMN_FAMILY = "Url2";
    static final String OUTPUT_COLUMN_FAMILY = "Url2";
    static final String COLUMN_VALUE = "expire_date";

    public static void main(String[] args) throws Exception
    {
        // Let ToolRunner handle generic command-line options
        ToolRunner.run(new Configuration(), new ContextExpirator(), args);
        System.exit(0);
    }

    public static class UrlFilterMapper
        extends Mapper<ByteBuffer, SortedMap<ByteBuffer, IColumn>, Text, NullWritable>
    {
        private final static NullWritable nic = NullWritable.get();
        private ByteBuffer sourceColumn;
        private static long now;

        protected void setup(org.apache.hadoop.mapreduce.Mapper.Context context)
        throws IOException, InterruptedException
        {
            sourceColumn = ByteBuffer.wrap(COLUMN_VALUE.getBytes());
            now = System.currentTimeMillis() / 1000; // convert from ms
        }

        public void map(ByteBuffer key, SortedMap<ByteBuffer, IColumn> columns, Context context)
        throws IOException, InterruptedException
        {
            IColumn column = columns.get(sourceColumn);
            if (column == null) {
                return;
            }

            Text tKey = new Text(ByteBufferUtil.string(key));
            Long value = Long.decode(ByteBufferUtil.string(column.value()));

            if (now > value) { // expire_date lies in the past, so the URL has expired
                context.write(tKey, nic);
            }
        }
    }

    public static class RemoveUrlReducer
        extends Reducer<Text, NullWritable, ByteBuffer, List<Mutation>>
    {
        public void reduce(Text key, Iterable<NullWritable> values, Context context)
        throws IOException, InterruptedException
        {
            // clone so the queued mutation does not reference Hadoop's reused Text backing array
            ByteBuffer bbKey = ByteBufferUtil.clone(ByteBuffer.wrap(key.getBytes(), 0, key.getLength()));
            context.write(bbKey, Collections.singletonList(getMutation()));
        }

        private static Mutation getMutation()
        {
            // a Deletion with no column or super column set deletes the whole row
            Deletion d = new Deletion();
            d.timestamp = System.currentTimeMillis();

            Mutation m = new Mutation();
            m.deletion = d;

            return m;
        }
    }

    public int run(String[] args) throws Exception
    {
        Job job = new Job(getConf(), "context_expitator");
        job.setJarByClass(ContextExpirator.class);

        job.setInputFormatClass(ColumnFamilyInputFormat.class);
        ConfigHelper.setInputColumnFamily(job.getConfiguration(), KEYSPACE, COLUMN_FAMILY);

        job.setMapperClass(UrlFilterMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(NullWritable.class);


Re: Schema Question

2011-01-25 Thread David McNelis
I'm fairly certain Aaron is referring to named families like BlogEntries,
not named columns (i-got-a-new-guitar).

On Tue, Jan 25, 2011 at 4:37 AM, Andy Burgess
andy.burg...@rbsworldpay.comwrote:

  Aaron,

 A question about one of your general points, do not create CF's on the
 fly - what, exactly, does this mean? Do you mean named column families,
 like BlogEntries from Sam's example, or do you mean column family keys,
 like i-got-a-new-guitar? If it's the latter, then could you please explain
 why not to do this? My application is based around creating row keys on the
 fly, so I'd like to know ahead of time if I'm creating potential trouble for
 myself.

 To be honest, if you do mean specifically column families and not column
 family keys, then I don't even understand how you would go about creating
 those on-the-fly anyway. Don't they have to be pre-configured in
 storage-conf.xml?

 Thanks,
 Andy.


 On 25/01/11 00:39, Aaron Morton wrote:

 Sam,
 The best advice is to jump in and try any schema If you are just starting
 out, start simple you're going to re-write it several times. Worry about
 scale later, in most cases it's going to work.

  Some general points:

  - do not create CF's on the fly.
 - work out your common read requests and denormalise to support these, the
 writes will be fast enough.
 - try to get each read request to be resolved by reading from a single CF
 (not a rule, just a guideline)
 - avoid big super columns.
 - this may also be interesting
 http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example/

   If you are happy with the one in the article start with that and see how
 it works with you app. See how it works for your read activities.

  Hope that helps.
 Aaron


 On 25 Jan, 2011, at 12:47 PM, Sam Hodgson
 hodgson_...@hotmail.com wrote:

   Hi all,

 Im brand new to Cassandra - im migrating from MySql for a large forum site
 and would be grateful if anyone can give me some basic pointers on schema
 design, or any recommended documentation.

 The example used in
 http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model is very
 close if not exactly what I need for my main CF:

 <!--
 ColumnFamily: BlogEntries
 This is where all the blog entries will go:

 Row Key + post's slug (the seo friendly portion of the uri)
 Column Name: an attribute for the entry (title, body, etc)
 Column Value: value of the associated attribute

 Access: grab an entry by slug (always fetch all Columns for Row)

 fyi: tags is a denormalization... its a comma separated list of tags.
 im not using json in order to not interfere with our notation but obviously
 you could use anything as long as your app knows how to deal w/ it

 BlogEntries : { // CF
     i-got-a-new-guitar : { // row key - the unique "slug" of the entry.
         title: This is a blog entry about my new, awesome guitar,
         body: this is a cool entry. etc etc yada yada
         author: Arin Sarkissian  // a row key into the Authors CF
         tags: life,guitar,music  // comma sep list of tags (basic denormalization)
         pubDate: 1250558004  // unixtime for publish date
         slug: i-got-a-new-guitar
     },
     // all other entries
     another-cool-guitar : {
         ...
         tags: guitar,
         slug: another-cool-guitar
     },
     scream-is-the-best-movie-ever : {
         ..
         tags: movie,horror,
         slug: scream-is-the-best-movie-ever
     }
 }
 -->
 <ColumnFamily CompareWith="BytesType" Name="BlogEntries"/>
 How well would this scale? Say you are storing 5 million posts and looking to 
 scale that up
 would it be better to segment them into several column families and if so to 
 what extent?

 I could create column families to store posts for each category however i'd 
 end up with thousands of CF's.
 Saying that the data would then be stored in a very sorted manner for 
 querying/presenting.

 My db is very write heavy and growing fast, Cassandra sounds like the best
 solution. Any advice is greatly appreciated!!

 Thanks

 Sam



 --
 Andy Burgess
 Principal Development Engineer
 Application Delivery
 WorldPay Ltd.
 270-289 Science Park, Milton Road
 Cambridge, CB4 0WE, United Kingdom (Depot Code: 024)
 Office: +44 (0)1223 706 779 | Mobile: +44 (0)7909 534 940
 andy.burg...@worldpay.com



Re: client threads locked up - JIRA ISSUE 1594

2011-01-25 Thread Nate McCall
What version of the Thrift API are you using?

(In general, you should use an existing client library rather than
rolling your own - I recommend Hector:
https://github.com/rantav/hector).

On Tue, Jan 25, 2011 at 12:38 AM, Arijit Mukherjee ariji...@gmail.com wrote:
 I'm using Cassandra 0.6.8. I'm not using Hector - it's just raw thrift APIs.

 Arijit

 On 21 January 2011 22:13, Nate McCall n...@riptano.com wrote:
 What versions of Cassandra and Hector? The versions mentioned on this
 ticket are both several releases behind.

 On Fri, Jan 21, 2011 at 3:53 AM, Arijit Mukherjee ariji...@gmail.com wrote:
 Hi All

 I'm facing the same issue as this one mentioned here -
 https://issues.apache.org/jira/browse/CASSANDRA-1594

 Is there any solution or work-around for this?

 Regards
 Arijit


 --
 And when the night is cloudy,
 There is still a light that shines on me,
 Shine on until tomorrow, let it be.





 --
 And when the night is cloudy,
 There is still a light that shines on me,
 Shine on until tomorrow, let it be.



Re: Stress test inconsistencies

2011-01-25 Thread Tyler Hobbs
Try using something higher than -t 1, like -t 100.

- Tyler

On Mon, Jan 24, 2011 at 9:38 PM, Oleg Proudnikov ol...@cloudorange.comwrote:

 Hi All,

 I am struggling to make sense of a simple stress test I ran against the
 latest
 Cassandra 0.7. My server performs very poorly compared to a desktop and
 even a
 notebook.

 Here is the command I execute - a single-threaded insert that runs on the same
 host as Cassandra does (I am using the new contrib/stress but the old py_stress
 produces similar results):

 ./stress -t 1 -o INSERT -c 30 -n 1 -i 1

 On a SUSE Linux server with a 4-core Intel XEON I get maximum 30 inserts a
 second with 40ms latency. But on a Windows desktop I get incredible 200-260
 inserts a second with a 4ms latency!!! Even on the smallest MacBook Pro I
 get
 bursts of high throughput - 100+ inserts a second.

 Could you please help me figure out what is wrong with my server? I tried
 several servers actually with the same results. I would appreciate any help
 in
 tracing down the bottleneck. Configuration is the same in all tests with
 the
 server having the advantage of separate physical disks for commitlog and
 data.

 Could you also share with me what numbers you get or what is reasonable to
 expect from this test?

 Thank you very much,
 Oleg


 Here is the output for the Linux server, Windows desktop and MacBook Pro,
 one
 line per second:

 Linux server - Intel XEON X3330 @ 2.666GHz, 4G RAM, 2G heap

 Created keyspaces. Sleeping 1s for propagation.
 total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
 19,19,19,0.05947368421052632,1
 46,27,27,0.04274074074074074,2
 70,24,24,0.04733,3
 95,25,25,0.04696,4
 119,24,24,0.048208333,5
 147,28,28,0.04189285714285714,7
 177,30,30,0.03904,8
 206,29,29,0.04006896551724138,9
 235,29,29,0.03903448275862069,10

 Windows desktop: Core2 Duo CPU E6550 @ 2.333GHz, 2G RAM, 1G heap

 Keyspace already exists.
 total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
 147,147,147,0.005292517006802721,1
 351,204,204,0.0042009803921568625,2
 527,176,176,0.006551136363636364,3
 718,191,191,0.005617801047120419,4
 980,262,262,0.00400763358778626,5
 1206,226,226,0.004150442477876107,6
 1416,210,210,0.005619047619047619,7
 1678,262,262,0.0040038167938931295,8

 MacBook Pro: Core2 Duo CPU @ 2.26GHz, 2G RAM, 1G heap

 Created keyspaces. Sleeping 1s for propagation.
 total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
 0,0,0,NaN,1
 7,7,7,0.21185714285714285,2
 47,40,40,0.026925,3
 171,124,124,0.007967741935483871,4
 258,87,87,0.01206896551724138,6
 294,36,36,0.022444,7
 303,9,9,0.14378,8
 307,4,4,0.2455,9
 313,6,6,0.128,10
 508,195,195,0.007938461538461538,11
 792,284,284,0.0035985915492957746,12
 882,90,90,0.01219,13






Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse

2011-01-25 Thread Mick Semb Wever
On Tue, 2011-01-25 at 14:16 +0100, Patrik Modesto wrote:
 The atttached file contains the working version with cloned key in
 reduce() method. My other aproache was:
 
  context.write(ByteBuffer.wrap(key.getBytes(), 0, key.getLength()),
  Collections.singletonList(getMutation(key)));
 
 Which produce junk keys. 

In fact i have another problem (trying to write an empty byte[], or
something, as a key, which put one whole row out of whack, ((one row in
25 million...))).

But i'm debugging along the same code.

I don't quite understand how the byte[] in 
ByteBuffer.wrap(key.getBytes(),...)
gets clobbered.
Well your key is a mutable Text object, so i can see some possibility
depending on how hadoop uses these objects.
Is there something to ByteBuffer.allocate(..) i'm missing...

btw.
 is d.timestamp = System.currentTimeMillis(); ok?
 shouldn't this be microseconds so that each mutation has a different
timestamp? http://wiki.apache.org/cassandra/DataModel


~mck


-- 
As you go the way of life, you will see a great chasm. Jump. It is not
as wide as you think. Native American Initiation Rite 
| http://semb.wever.org | http://sesat.no
| http://finn.no   | Java XSS Filter

-- 
Everything should be made as simple as possible, but not simpler.
Albert Einstein (William of Ockham) 
| http://semb.wever.org | http://sesat.no
| http://finn.no   | Java XSS Filter


signature.asc
Description: This is a digitally signed message part


Re: Forcing GC w/o jconsole

2011-01-25 Thread buddhasystem

Thanks! It doesn't seem to have any effect on GCing dropped CFs, though.

Maxim

-- 
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Forcing-GC-w-o-jconsole-tp5956747p5960100.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: Stress test inconsistencies

2011-01-25 Thread Oleg Proudnikov
Tyler Hobbs tyler at riptano.com writes:

 Try using something higher than -t 1, like -t 100. - Tyler



Thank you, Tyler!

When I run contrib/stress with a higher thread count, the server does scale to
200 inserts a second with latency of 200ms. At the same time Windows desktop
scales to 900 inserts a second and latency of 120ms. There is a huge difference
that I am trying to understand and eliminate.

In my real life bulk load I have to stay with a single threaded client for the
POC I am doing. The only option I have is to run several client processes... My
real life load is heavier than what contrib/stress does. It takes several days
to bulk load 4 million batch mutations !!! It is really painful :-( Something is
just not right...

Oleg






Re: Stress test inconsistencies

2011-01-25 Thread buddhasystem

Oleg,

I'm a novice at this, but for what it's worth I can't imagine you can have a
_sustained_ 1kHz insertion rate on a single machine which also does some
reads. If I'm wrong, I'll be glad to learn that I was. It just doesn't seem
to square with a typical seek time on a hard drive.

Maxim

-- 
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Stress-test-inconsistencies-tp5957467p5960182.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: Stress test inconsistencies

2011-01-25 Thread Brandon Williams
On Tue, Jan 25, 2011 at 1:23 PM, Oleg Proudnikov ol...@cloudorange.comwrote:

 When I run contrib/stress with a higher thread count, the server does scale
 to
 200 inserts a second with latency of 200ms. At the same time Windows
 desktop
 scales to 900 inserts a second and latency of 120ms. There is a huge
 difference
 that I am trying to understand and eliminate.


Those are really low numbers, are you still testing with 10k rows?  That's
not enough, try 1M to give both JVMs enough time to warm up.

-Brandon


Files not deleted after compaction and GCed

2011-01-25 Thread Ching-Cheng Chen
Using cassandra 0.7.0

The class org.apache.cassandra.io.sstable.SSTableDeletingReference only
removes the -Data.db file, but leaves the xxx-Compacted, xxx-Filter.db,
xxx-Index.db and xxx-Statistics.db files intact.

And that's the behavior I saw. I ran a manual compact then triggered a GC
from jconsole. The Data.db file got removed but not the others.

Is this the expected behavior?

Regards,

Chen


Re: Stress test inconsistencies

2011-01-25 Thread Oleg Proudnikov
Brandon Williams driftx at gmail.com writes:

 
 On Tue, Jan 25, 2011 at 1:23 PM, Oleg Proudnikov olegp at cloudorange.com
wrote:
 
 When I run contrib/stress with a higher thread count, the server does scale to
 200 inserts a second with latency of 200ms. At the same time Windows desktop
 scales to 900 inserts a second and latency of 120ms. There is a huge 
 difference
 that I am trying to understand and eliminate.
 
 
 Those are really low numbers, are you still testing with 10k rows?  That's not
enough, try 1M to give both JVMs enough time to warm up.
 
 
 -Brandon 
 

I agree, Brandon, the numbers are very low! The warm up does not seem to make
any difference though... There is something that is holding the server back
because the CPU is very low. I am trying to understand where this bottleneck is
on the Linux server. I do not think it is Cassandra's config as I use the same
config on Windows and get much higher numbers as I described.

Oleg




Re: Errors During Compaction

2011-01-25 Thread Aaron Morton
Dan how did you go with this? More joy, less joy or a continuation of the 
current level of joy?

Aaron


On 24/01/2011, at 9:38 AM, Dan Hendry dan.hendry.j...@gmail.com wrote:

 I have run into a strange problem and was hoping for suggestions on how to 
 fix it (0.7.0). When compaction occurs on one node for what appears to be one 
 specific column family, the following error pops up in the Cassandra log. 
 Compaction apparently fails and temp files don’t get cleaned up. After a 
 while and what seems to be multiple failed compactions on the CF, the node 
 runs out of disk space and crashes. Not sure if it is a related problem or a 
 function of this being a heavily used column family but after failing to 
 compact, compaction restarts on the same CF exacerbating the issue.
 
  
 
 Problems with this specific node started earlier this weekend when it crashed 
 with an OOM error. This is quite surprising since my memtable thresholds and 
 GC settings have been tuned to run with quite a bit of overhead during normal 
 operation (max heap usage usually = 10 GB on a 12 GB heap, average usage of 
 6-8 GB). I could not find anything abnormal in the logs which would prompt an 
 OOM.
 
  
 
 I will look things over tomorrow and try to provide a bit more information on 
 the problem but as a solution, I was going to wipe out all SSTables for this 
 CF on this node and then run a repair. Far from ideal, is this a reasonable 
 solution?
 
  
 
  
 
 ERROR [CompactionExecutor:1] 2011-01-23 14:10:29,855 
 AbstractCassandraDaemon.java (line 91) Fatal exception in thread 
 Thread[CompactionExecutor:1,1,RMI Runtime]
 
 java.io.IOError: java.io.EOFException: attempted to skip -1983579368 bytes 
 but only skipped 0
 
 at 
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:78)
 
 at 
 org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:178)
 
 at 
 org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:143)
 
 at 
 org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:135)
 
 at 
 org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)
 
 at 
 org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
 
 at 
 org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
 
 at 
 org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
 
 at 
 org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:68)
 
 at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
 
 at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
 
 at 
 org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
 
 at 
 org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
 
 at 
 org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:323)
 
 at 
 org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:122)
 
 at 
 org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:92)
 
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 
 at java.lang.Thread.run(Thread.java:662)
 
 Caused by: java.io.EOFException: attempted to skip -1983579368 bytes but only 
 skipped 0
 
 at 
 org.apache.cassandra.io.sstable.IndexHelper.skipBloomFilter(IndexHelper.java:52)
 
 at 
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
 
 ... 20 more
 
  
 
 Dan Hendry
 
 (403) 660-2297
 
  


Re: Files not deleted after compaction and GCed

2011-01-25 Thread Jonathan Ellis
No, that is not expected.  All the sstable components are removed in
the same method; did you check the log for exceptions?

On Tue, Jan 25, 2011 at 2:58 PM, Ching-Cheng Chen
cc...@evidentsoftware.com wrote:
 Using cassandra 0.7.0
 The class org.apache.cassandra.io.sstable.SSTableDeletingReference only
 remove the -Data.db file, but leave the xxx-Compacted, xxx-Filter.db,
 xxx-Index.db and xxx-Statistics.db intact.
 And that's the behavior I saw.    I ran manual compact then trigger a GC
 from jconsole.   The Data.db file got removed but not the others.
 Is this the expected behavior?
 Regards,
 Chen



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Does Major Compaction work on dropped CFs? Doesn't seem so.

2011-01-25 Thread Aaron Morton
You can run JConsole on your workstation and connect remotely to the nodes; it does not need to be run on the node itself. Connecting is discussed here
http://wiki.apache.org/cassandra/MemtableThresholds
and some help for connecting is here
http://wiki.apache.org/cassandra/JmxGotchas

There is also a web front end for the JMX service
http://wiki.apache.org/cassandra/Operations#Monitoring_with_MX4J

And a recent discussion on different ways to monitor a node
http://www.mail-archive.com/user@cassandra.apache.org/msg08100.html
If you dig through those, there is some talk about a JMX/REST bridge.

Hope that helps.
Aaron

On 25 Jan, 2011, at 04:17 PM, buddhasystem potek...@bnl.gov wrote:
Thanks Aaron. As I remarked earlier (and it seems it's not uncommon), none of
the nodes have X11 installed (I think I could arrange this, but it's a bit
of a hassle). So if I understand correctly, jconsole is an X11 app, and I'm
out of luck with that.

I would agree with you that having a proper nodetool command to zap the data
you know you don't need, would be quite ideal. The reason I'm so retentive
about it is that I plan to test scaling up to 250 million rows, and disk
space matters.
-- 
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Does-Major-Compaction-work-on-dropped-CFs-Doesn-t-seem-so-tp5946031p5957426.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.


Re: Files not deleted after compaction and GCed

2011-01-25 Thread Jonathan Ellis
the other component types are deleted by this line:

SSTable.delete(desc, components);

On Tue, Jan 25, 2011 at 3:11 PM, Ching-Cheng Chen
cc...@evidentsoftware.com wrote:
 Nope, no exception at all.
 But if the same class
 (org.apache.cassandra.io.sstable.SSTableDeletingReference) is responsible
 for deleting the other files, then that's not right.
 I checked the source code for SSTableDeletingReference; it doesn't look like
 it will delete the other file types.
 Regards,
 Chen

 On Tue, Jan 25, 2011 at 4:05 PM, Jonathan Ellis jbel...@gmail.com wrote:

 No, that is not expected.  All the sstable components are removed in
 the same method; did you check the log for exceptions?

 On Tue, Jan 25, 2011 at 2:58 PM, Ching-Cheng Chen
 cc...@evidentsoftware.com wrote:
  Using cassandra 0.7.0
  The class org.apache.cassandra.io.sstable.SSTableDeletingReference only
  remove the -Data.db file, but leave the xxx-Compacted,
  xxx-Filter.db,
  xxx-Index.db and xxx-Statistics.db intact.
  And that's the behavior I saw.    I ran manual compact then trigger a GC
  from jconsole.   The Data.db file got removed but not the others.
  Is this the expected behavior?
  Regards,
  Chen



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com





-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Schema Question

2011-01-25 Thread Aaron Morton
Yeah, I was talking about creating a ColumnFamily definition via the API, not inserting data into an already defined column family.

The recommended approach to creating your schema is via the built-in bin/cassandra-cli command line tool. It has loads of built-in help, and here is an example of how to create a keyspace:
http://www.mail-archive.com/user@cassandra.apache.org/msg09146.html

Let me know how you get on.
Aaron

On 26 Jan, 2011, at 02:28 AM, David McNelis dmcne...@agentisenergy.com wrote:

I'm fairly certain Aaron is referring to named families like BlogEntries, not named columns (i-got-a-new-guitar).

On Tue, Jan 25, 2011 at 4:37 AM, Andy Burgess andy.burg...@rbsworldpay.com wrote:


  

  
  
Aaron,

A question about one of your general points, "do not create CF's on
the fly" - what, exactly, does this mean? Do you mean named column
families, like "BlogEntries" from Sam's example, or do you mean
column family keys, like "i-got-a-new-guitar"? If it's the latter,
then could you please explain why not to do this? My application is
based around creating row keys on the fly, so I'd like to know ahead
of time if I'm creating potential trouble for myself.

To be honest, if you do mean specifically column families and not
column family keys, then I don't even understand how you would go
about creating those on-the-fly anyway. Don't they have to be
pre-configured in storage-conf.xml?

Thanks,
Andy.

On 25/01/11 00:39, Aaron Morton wrote:

  Sam,
  The best advice is to jump in and try any schema If you are
just starting out, start simple you're going to re-write it
several times. Worry about scale later, in most cases it's going
to work.
  
  
  Some general points:
  
  
  - do not create CF's on the fly.
  - work out your common read requests and denormalise to
support these, the writes will be fast enough.
  - try to get each read request to be resolved by reading from
a single CF (not a rule, just a guideline)
  - avoid big super columns.
  - this may also be interesting: http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example/
  
  
  
  If you are happy with the one in the article start with that
and see how it works with you app. See how it works for your
read activities.
  
  
  Hope that helps.
  Aaron
  
  
  
On 25 Jan, 2011,at 12:47 PM, Sam Hodgson
hodgson_...@hotmail.com wrote:

  
  

  
Hi all,


Im brand new to Cassandra - im migrating from MySql for a
large forum site and would be grateful if anyone can give me
some basic pointers on schema design, or any recommended
documentation. 

The example used in
http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model
is very close if not exactly what I need for my main CF:

<!--
ColumnFamily: BlogEntries
This is where all the blog entries will go:

Row Key + post's slug (the seo friendly portion of the uri)
Column Name: an attribute for the entry (title, body, etc)
Column Value: value of the associated attribute

Access: grab an entry by slug (always fetch all Columns for Row)

fyi: tags is a denormalization... its a comma separated list of tags.
im not using json in order to not interfere with our
notation but obviously you could use anything as long as your app
knows how to deal w/ it

BlogEntries : { // CF
i-got-a-new-guitar : { // row key - the unique "slug" of the entry.
title: This is a blog entry about my new, awesome guitar,
body: this is a cool entry. etc etc yada yada
author: Arin Sarkissian  // a row key into the Authors CF
tags: life,guitar,music  // comma sep list of tags (basic denormalization)
pubDate: 1250558004  // unixtime for publish date
slug: i-got-a-new-guitar
},
// all other entries
another-cool-guitar : {
...
tags: guitar,
slug: another-cool-guitar
},
scream-is-the-best-movie-ever : {
..
tags: movie,horror,
slug: scream-is-the-best-movie-ever
}
}
-->
<ColumnFamily CompareWith="BytesType" Name="BlogEntries"/>

How well would this scale? Say you are storing 5 million posts and looking to scale that up 
would it be better to segment them into several column families and if so to what extent? 

I could create column families to store posts for each category however i'd end up with thousands of CF's.  
Saying that the data would then be stored in a very sorted manner for querying/presenting.

My db is very write heavy and growing fast, Cassandra sounds like 

Re: Stress test inconsistencies

2011-01-25 Thread Anthony John
Look at iostat -x 10 10 when the active part of your test is running. There
should be a column called svc_t - that should be in the 10ms range, and
await should be low.

Will tell you if IO is slow, or if IO is not being issued.

Also, ensure that you ain't swapping with something like swapon -s

On Tue, Jan 25, 2011 at 3:04 PM, Oleg Proudnikov ol...@cloudorange.comwrote:

 buddhasystem potekhin at bnl.gov writes:

 
 
  Oleg,
 
  I'm a novice at this, but for what it's worth I can't imagine you can
 have a
  _sustained_ 1kHz insertion rate on a single machine which also does some
  reads. If I'm wrong, I'll be glad to learn that I was. It just doesn't
 seem
  to square with a typical seek time on a hard drive.
 
  Maxim
 

 Maxim,

 As I understand during inserts Cassandra should not be constrained by
 random
 seek time as it uses sequential writes. I do get high numbers on Windows
 but
 there is something that is holding back my Linux server. I am trying to
 understand what it is.

 Oleg






Re-partitioning the cluster with nodetool: what's happening?

2011-01-25 Thread buddhasystem

I'm trying to re-partition my 4-node cluster to make the load exactly 25% on
each node.
As per the recipes found in the documentation, I calculate:
 for x in xrange(4):
... print 2**127/4*x
...
0
42535295865117307932921825928971026432
85070591730234615865843651857942052864
127605887595351923798765477786913079296

And I need to move the first one to 0, then the second one to
42535295865117307932921825928971026432 etc.

Once I start the procedure, I see no progress when I look at nodetool
netstats. Nothing's happening. What am I doing wrong?

Thanks,

Maxim

-- 
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-partitioning-the-cluster-with-nodetool-what-s-happening-tp5960843p5960843.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: Re-partitioning the cluster with nodetool: what's happening?

2011-01-25 Thread buddhasystem

Correction -- what I meant to say is that I do see announcements about streaming
in the output, but these are stuck at 0%.

-- 
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-partitioning-the-cluster-with-nodetool-what-s-happening-tp5960843p5960851.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


get_range_slices getting deleted rows

2011-01-25 Thread Nick Santini
Hi,
I'm trying a test scenario where I create 100 rows in a CF, then
use get_range_slices to get all the rows, and I get 100 rows - so far so good.
Then after the test I delete the rows using remove, but without a column or
super column; this deletes the row, and I can confirm that because if I try to
get it with get_slice using the key I get nothing.

But then if I do get_range_slices again, where the range goes between new
byte[0] and new byte[0] (therefore returning everything), I still get the
100 row keys.

Is that expected?

thanks

Nicolas Santini


Re: get_range_slices getting deleted rows

2011-01-25 Thread Narendra Sharma
Yes. See this http://wiki.apache.org/cassandra/FAQ#range_ghosts

-Naren

On Tue, Jan 25, 2011 at 2:59 PM, Nick Santini nick.sant...@kaseya.comwrote:

 Hi,
 I'm trying a test scenario where I create 100 rows in a CF, then
 use get_range_slices to get all the rows, and I get 100 rows, so far so good
 then after the test I delete the rows using remove but without a column
 or super column, this deletes the row, I can confirm that cos if I try to
 get it with get_slice using the key I get nothing

 but then if I do get_range_slice again, where the range goes between new
 byte[0] and new byte[0] (therefore returning everything), I still get the
 100 row keys

 is that expected to be?

 thanks

 Nicolas Santini



Fwd: CFP - Berlin Buzzwords 2011 - Search, Score, Scale

2011-01-25 Thread David G. Boney
This might interest the Cassandra community.
-
Sincerely,
David G. Boney
dbon...@semanticartifacts.com
http://www.semanticartifacts.com




Begin forwarded message:

 From: Isabel Drost isa...@apache.org
 Date: January 25, 2011 2:53:28 PM CST
 To: u...@mahout.apache.org
 Cc: gene...@lucene.apache.org, gene...@hadoop.apache.org, 
 u...@hbase.apache.org, solr-u...@lucene.apache.org, 
 java-u...@lucene.apache.org, u...@nutch.apache.org
 Subject: CFP - Berlin Buzzwords 2011 - Search, Score, Scale
 Reply-To: u...@mahout.apache.org
 Reply-To: isa...@apache.org
 
 This is to announce the Berlin Buzzwords 2011. The second edition of the 
 successful conference on scalable and open search, data processing and data 
 storage in Germany, taking place in Berlin.
 
 Call for Presentations Berlin Buzzwords
http://berlinbuzzwords.de
   Berlin Buzzwords 2011 - Search, Store, Scale
 6/7 June 2011
 The event will comprise presentations on scalable data processing. We invite 
 you to submit talks on the topics:
* IR / Search - Lucene, Solr, katta or comparable solutions
* NoSQL - like CouchDB, MongoDB, Jackrabbit, HBase and others
* Hadoop - Hadoop itself, MapReduce, Cascading or Pig and relatives
* Closely related topics not explicitly listed above are welcome. We are
  looking for presentations on the implementation of the systems 
 themselves,
  real world applications and case studies.
 
 Important Dates (all dates in GMT +2)
* Submission deadline: March 1st 2011, 23:59 MEZ
* Notification of accepted speakers: March 22th, 2011, MEZ.
* Publication of final schedule: April 5th, 2011.
* Conference: June 6/7. 2011
 High quality, technical submissions are called for, ranging from principles 
 to practice. We are looking for real world use cases, background on the 
 architecture of specific projects and a deep dive into architectures built on 
 top of e.g. Hadoop clusters.
 
 Proposals should be submitted at http://berlinbuzzwords.de/content/cfp-0 no 
 later than March 1st, 2011. Acceptance notifications will be sent out soon 
 after the submission deadline. Please include your name, bio and email, the 
 title of the talk, a brief abstract in English language. Please indicate 
 whether you want to give a lightning (10min), short (20min) or long (40min) 
 presentation and indicate the level of experience with the topic your 
 audience should have (e.g. whether your talk will be suitable for newbies or 
 is targeted for experienced users.) If you'd like to pitch your brand new 
 product in your talk, please let us know as well - there will be extra space 
 for presenting new ideas, awesome products and great new projects.
 
 The presentation format is short. We will be enforcing the schedule 
 rigorously.
 
 If you are interested in sponsoring the event (e.g. we would be happy to 
 provide videos after the event, free drinks for attendees as well as an 
 after-show party), please contact us.
 
 Follow @hadoopberlin on Twitter for updates. Tickets, news on the conference, 
 and the final schedule are be published at http://berlinbuzzwords.de.
 
 Program Chairs: Isabel Drost, Jan Lehnardt, and Simon Willnauer.
 Please re-distribute this CfP to people who might be interested.
 If you are local and wish to meet us earlier, please note that this Thursday 
 evening there will be an Apache Hadoop Get Together (videos kindly sponsored 
 by Cloudera, venue kindly provided for free by Zanox) featuring talks on 
 Apache Hadoop in production as well as news on current Apache Lucene 
 developments.
 
 Contact us at:
 
 newthinking communications 
 GmbH Schönhauser Allee 6/7 
 10119 Berlin, 
 Germany 
 Julia Gemählich
 Isabel Drost 
 +49(0)30-9210 596



Re: get_range_slices getting deleted rows

2011-01-25 Thread Nick Santini
thanks,
so I need to check the returned slice for the key to verify that it is a valid
row and not a deleted one?

Nicolas Santini



On Wed, Jan 26, 2011 at 12:16 PM, Narendra Sharma narendra.sha...@gmail.com
 wrote:

 Yes. See this http://wiki.apache.org/cassandra/FAQ#range_ghosts

 -Naren


 On Tue, Jan 25, 2011 at 2:59 PM, Nick Santini nick.sant...@kaseya.comwrote:

 Hi,
 I'm trying a test scenario where I create 100 rows in a CF, then
 use get_range_slices to get all the rows, and I get 100 rows, so far so good
 then after the test I delete the rows using remove but without a column
 or super column, this deletes the row, I can confirm that cos if I try to
 get it with get_slice using the key I get nothing

 but then if I do get_range_slice again, where the range goes between new
 byte[0] and new byte[0] (therefore returning everything), I still get the
 100 row keys

 is that expected to be?

 thanks

 Nicolas Santini





RE: Errors During Compaction

2011-01-25 Thread Dan Hendry
Limited joy I would say :)  No long term damage at least.

 

I ended up deleting (moving to another disk) all the sstables, which fixed the 
problem. I ran into even more problems during repair (detailed in another 
recent email) but it seems to have worked regardless. Just to be safe, I am in 
the process of starting a ‘manual repair’ (copying SSTables from other nodes 
for this particular CF then restarting and running a cleanup + major 
compaction).

 

Any thoughts on what the root cause of this problem could be? It is somewhat 
worrying that a CF can randomly become corrupt, bringing down the whole node. 
Cassandra's handling of a corrupt CF (regardless of how rare an occurrence) is 
less than elegant. 

 

Dan

 

From: Aaron Morton [mailto:aa...@thelastpickle.com] 
Sent: January-25-11 16:03
To: user@cassandra.apache.org
Subject: Re: Errors During Compaction

 

Dan how did you go with this? More joy, less joy or a continuation of the 
current level of joy?

 

Aaron

 


On 24/01/2011, at 9:38 AM, Dan Hendry dan.hendry.j...@gmail.com wrote:

I have run into a strange problem and was hoping for suggestions on how to fix 
it (0.7.0). When compaction occurs on one node for what appears to be one 
specific column family, the following error pops up the Cassandra log. 
Compaction apparently fails and temp files don’t get cleaned up. After a while 
and what seems to be multiple failed compactions on the CF, the node runs out 
of disk space and crashes. Not sure if it is a related problem or a function of 
this being a heavily used column family but after failing to compact, 
compaction restarts on the same CF exacerbating the issue.

 

Problems with this specific node started earlier this weekend when it crashed 
with and OOM error. This is quite surprising since my memtable thresholds and 
GC settings have been tuned to run with quite a bit of overhead during normal 
operation (max heap usage usually = 10 GB on a 12 GB heap, average usage of 
6-8 GB). I could not find anything abnormal in the logs which would prompt an 
OOM.

 

I will look things over tomorrow and try to provide a bit more information on 
the problem but as a solution, I was going to wipe out all SSTables for this CF 
on this node and then run a repair. Far from ideal, is this a reasonable 
solution?

 

 

ERROR [CompactionExecutor:1] 2011-01-23 14:10:29,855 
AbstractCassandraDaemon.java (line 91) Fatal exception in thread 
Thread[CompactionExecutor:1,1,RMI Runtime]

java.io.IOError: java.io.EOFException: attempted to skip -1983579368 bytes but 
only skipped 0

at 
org.apache.cassandra.io.sstable.SSTableIdentityIterator.init(SSTableIdentityIterator.java:78)

at 
org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:178)

at 
org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:143)

at 
org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:135)

at 
org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)

at 
org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)

at 
org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)

at 
org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)

at 
org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:68)

at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)

at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)

at 
org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)

at 
org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)

at 
org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:323)

at 
org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:122)

at 
org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:92)

at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)

at java.util.concurrent.FutureTask.run(FutureTask.java:138)

at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)

at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

at java.lang.Thread.run(Thread.java:662)

Caused by: java.io.EOFException: attempted to skip -1983579368 bytes but only 
skipped 0

at 
org.apache.cassandra.io.sstable.IndexHelper.skipBloomFilter(IndexHelper.java:52)

at 
org.apache.cassandra.io.sstable.SSTableIdentityIterator.init(SSTableIdentityIterator.java:69)

... 20 more

 

Dan Hendry

(403) 660-2297

 


Re: get_range_slices getting deleted rows

2011-01-25 Thread Roshan Dawrani
No, checking the key will not do.

You will need to check if row.getColumnSlice().getColumns() is empty or not.
That's what I do and it works for me.
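
For what it's worth, a minimal Hector-style sketch of that check (illustrative only - the cluster, keyspace and CF names are placeholders), skipping the range ghosts whose column slice comes back empty:

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.OrderedRows;
import me.prettyprint.hector.api.beans.Row;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.RangeSlicesQuery;

public class SkipRangeGhosts {
    public static void main(String[] args) {
        StringSerializer se = StringSerializer.get();
        Cluster cluster = HFactory.getOrCreateCluster("TestCluster", "localhost:9160");
        Keyspace ks = HFactory.createKeyspace("Keyspace1", cluster);

        RangeSlicesQuery<String, String, String> query =
                HFactory.createRangeSlicesQuery(ks, se, se, se);
        query.setColumnFamily("Standard1")
             .setKeys("", "")              // open-ended range, like the new byte[0] bounds above
             .setRange("", "", false, 10)
             .setRowCount(1000);

        OrderedRows<String, String, String> rows = query.execute().get();
        for (Row<String, String, String> row : rows) {
            // a deleted row may still come back as a key with no columns (a "range ghost")
            if (row.getColumnSlice().getColumns().isEmpty()) {
                continue;
            }
            System.out.println("live row: " + row.getKey());
        }
        cluster.getConnectionManager().shutdown();
    }
}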

On Wed, Jan 26, 2011 at 4:53 AM, Nick Santini nick.sant...@kaseya.comwrote:

 thanks,
 so I need to check the returned slice for the key to verify that is a valid
 row and not a deleted one?

 Nicolas Santini



 On Wed, Jan 26, 2011 at 12:16 PM, Narendra Sharma 
 narendra.sha...@gmail.com wrote:

 Yes. See this http://wiki.apache.org/cassandra/FAQ#range_ghosts

 -Naren


 On Tue, Jan 25, 2011 at 2:59 PM, Nick Santini nick.sant...@kaseya.comwrote:

 Hi,
 I'm trying a test scenario where I create 100 rows in a CF, then
 use get_range_slices to get all the rows, and I get 100 rows, so far so good
 then after the test I delete the rows using remove but without a column
 or super column, this deletes the row, I can confirm that cos if I try to
 get it with get_slice using the key I get nothing

 but then if I do get_range_slice again, where the range goes between new
 byte[0] and new byte[0] (therefore returning everything), I still get the
 100 row keys

 is that expected to be?

 thanks

 Nicolas Santini






Re: Re-partitioning the cluster with nodetool: what's happening?

2011-01-25 Thread Aaron Morton
It can take a bit of thinking time for the nodes to work out what to stream. The bottom of this page
http://wiki.apache.org/cassandra/Streaming
talks about how to watch what's happening. If it does get stuck let us know.

Aaron

On 26 Jan, 2011, at 11:42 AM, buddhasystem potek...@bnl.gov wrote:
Correction -- what I meant to say that I do see announcements about streaming
in the output, but these are stuck at 0%.

-- 
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-partitioning-the-cluster-with-nodetool-what-s-happening-tp5960843p5960851.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.


RE: the java client problem

2011-01-25 Thread Raoyixuan (Shandy)
I found the loadSchemaFromYAML operation in jconsole. How do I load the schema?

From: Ashish [mailto:paliwalash...@gmail.com]
Sent: Friday, January 21, 2011 8:10 PM
To: user@cassandra.apache.org
Subject: Re: the java client problem

check cassandra-install-dir/conf/cassandra.yaml

start cassandra
connect via jconsole
find MBeans - org.apache.cassandra.db - StorageService
(http://wiki.apache.org/cassandra/StorageService) - Operations - loadSchemaFromYAML

load the schema
and then try the example again.

HTH
ashish

2011/1/21 raoyixuan (Shandy) raoyix...@huawei.com
Which schema is it?
From: Ashish [mailto:paliwalash...@gmail.com]
Sent: Friday, January 21, 2011 7:57 PM
To: user@cassandra.apache.org
Subject: Re: the java client problem

you are missing the column family in your keyspace.

If you are using the default definitions of schema shipped with cassandra, 
ensure to load the schema from JMX.

thanks
ashish
2011/1/21 raoyixuan (Shandy) raoyix...@huawei.com
I executed the code below with the Hector client:

package com.riptano.cassandra.hector.example;

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.exceptions.HectorException;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;
import me.prettyprint.hector.api.query.ColumnQuery;
import me.prettyprint.hector.api.query.QueryResult;

public class InsertSingleColumn {
    private static StringSerializer stringSerializer = StringSerializer.get();

    public static void main(String[] args) throws Exception {
        Cluster cluster = HFactory.getOrCreateCluster("TestCluster", "*.*.*.*:9160");

        Keyspace keyspaceOperator = HFactory.createKeyspace("Shandy", cluster);

        try {
            Mutator<String> mutator = HFactory.createMutator(keyspaceOperator, StringSerializer.get());
            mutator.insert("jsmith", "Standard1", HFactory.createStringColumn("first", "John"));

            ColumnQuery<String, String, String> columnQuery = HFactory.createStringColumnQuery(keyspaceOperator);
            columnQuery.setColumnFamily("Standard1").setKey("jsmith").setName("first");
            QueryResult<HColumn<String, String>> result = columnQuery.execute();

            System.out.println("Read HColumn from cassandra: " + result.get());
            System.out.println("Verify on CLI with:  get Keyspace1.Standard1['jsmith'] ");

        } catch (HectorException e) {
            e.printStackTrace();
        }
        cluster.getConnectionManager().shutdown();
    }
}

And it shows the error :

me.prettyprint.hector.api.exceptions.HInvalidRequestException: 
InvalidRequestException(why:unconfigured columnfamily Standard1)
  at 
me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:42)
  at 
me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:95)
  at 
me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:88)
  at 
me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:89)
  at 
me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:142)
  at 
me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129)
  at 
me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:100)
  at 
me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:106)
  at 
me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:149)
  at 
me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:146)
  at 
me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
 at 
me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:65)
  at 
me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:146)
  at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:55)
  at 
com.riptano.cassandra.hector.example.InsertSingleColumn.main(InsertSingleColumn.java:21)
Caused by: InvalidRequestException(why:unconfigured columnfamily Standard1)
  at 
org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:16477)
  at 
org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:916)
  at 
org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:890)
  at 
me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:93)
  ... 13 more


华为技术有限公司 Huawei Technologies Co., Ltd.



Phone: 28358610

Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse

2011-01-25 Thread Patrik Modesto
On Tue, Jan 25, 2011 at 19:09, Mick Semb Wever m...@apache.org wrote:

 In fact i have another problem (trying to write an empty byte[], or
 something, as a key, which put one whole row out of whack, ((one row in
 25 million...))).

 But i'm debugging along the same code.

 I don't quite understand how the byte[] in
 ByteBuffer.wrap(key.getBytes(),...)
 gets clobbered.

Code snippet would help here.

 Well your key is a mutable Text object, so i can see some possibility
 depending on how hadoop uses these objects.
 Is there something to ByteBuffer.allocate(..) i'm missing...

I don't know, I'm quite new to Java (but with long C++ history).

 btw.
  is d.timestamp = System.currentTimeMillis(); ok?
  shouldn't this be microseconds so that each mutation has a different
 timestamp? http://wiki.apache.org/cassandra/DataModel

You are correct that microseconds would be better but for the test it
doesn't matter that much.

Patrik
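
(For reference, a tiny illustration of the microsecond convention discussed here, assuming the avro Deletion used in the job above; multiplying milliseconds by 1000 is a common approximation rather than a true microsecond clock.)

import org.apache.cassandra.avro.Deletion;

public class MicrosecondTimestamp
{
    // Illustrative only: mutations issued within the same millisecond are less
    // likely to share a timestamp if the value carries microsecond resolution.
    static Deletion rowDeletion()
    {
        Deletion d = new Deletion();
        d.timestamp = System.currentTimeMillis() * 1000;
        return d;
    }
}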


Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse

2011-01-25 Thread Mck

   is d.timestamp = System.currentTimeMillis(); ok?
 
 You are correct that microseconds would be better but for the test it
 doesn't matter that much. 

Have you tried? I'm very new to Cassandra as well, and always uncertain
as to what to expect...


 ByteBuffer bbKey = ByteBufferUtil.clone(ByteBuffer.wrap(key.getBytes(), 0, 
 key.getLength())); 

An alternative approach to your client-side cloning is 

  ByteBuffer bbKey = ByteBuffer.wrap(key.toString().getBytes(UTF_8)); 

Here at least it is obvious you are passing in the bytes from an immutable 
object.

As for moving the clone(..) into ColumnFamilyRecordWriter.write(..),
won't this hurt performance? Normally i would _always_ agree that a
defensive copy of an array/collection argument be stored, but has this
intentionally not been done (or should it) because of large reduce jobs
(millions of records) and the performance impact here.

The key isn't the only potential live byte[]. You also have names and
values in all the columns (and supercolumns) for all the mutations.


~mck