RE: How to store large columns?

2013-01-21 Thread Jason Brown
The reason for multiple keys (and, by extension, multiple columns) is to better 
distribute the write/read load across the cluster as keys will (hopefully) be 
distributed on different nodes. This helps to avoid hot spots.
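
To make the key-per-chunk idea concrete, here is a rough pycassa sketch (not from the thread; the keyspace, column family, and key naming scheme are assumptions for illustration): each chunk gets its own row key, so the partitioner spreads the chunks, and their read/write load, across the cluster.

import pycassa

# Hypothetical keyspace/CF names; chunk keys are "<object_id>:<n>" so each
# chunk hashes to its own token and (hopefully) lands on a different node.
pool = pycassa.ConnectionPool('Blast', server_list=['127.0.0.1:9160'])
blobs = pycassa.ColumnFamily(pool, 'Blobs')

CHUNK_SIZE = 1024 * 1024  # 1 MB, comfortably under the Thrift frame limit

def store_object(object_id, data):
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    for n, chunk in enumerate(chunks):
        blobs.insert('%s:%d' % (object_id, n), {'data': chunk})
    # remember how many chunks there are so the object can be reassembled
    blobs.insert(object_id, {'chunk_count': str(len(chunks))})

def load_object(object_id):
    count = int(blobs.get(object_id, columns=['chunk_count'])['chunk_count'])
    keys = ['%s:%d' % (object_id, n) for n in range(count)]
    rows = blobs.multiget(keys)
    return b''.join(rows[k]['data'] for k in keys)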

Hope this helps,

-Jason Brown
Netflix

From: Sávio Teles [savio.te...@lupa.inf.ufg.br]
Sent: Monday, January 21, 2013 9:51 AM
To: user@cassandra.apache.org
Subject: Re: How to store large columns?

Astyanax splits large objects into multiple keys. Is it a good idea? Or is it
better to split into multiple columns?

Thanks

2013/1/21 Sávio Teles <savio.te...@lupa.inf.ufg.br>

Thanks Keith Wright.


2013/1/21 Keith Wright <kwri...@nanigans.com>
This may be helpful:  
https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store

From: Vegard Berget <p...@fantasista.no>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>, Vegard Berget <p...@fantasista.no>
Date: Monday, January 21, 2013 8:35 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: How to store large columns?



Hi,

You could split it into multiple columns on the client side:
RowKeyData: Part1: [1mb], Part2: [1mb], Part3: [1mb]...PartN[1mb]

Now you can use multiple get() in parallel to get the files back and then join
them back to one file.
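
As a rough illustration of that layout (not from the thread; the keyspace/CF names and thread count are assumptions), a pycassa sketch might write one column per 1 MB part and fetch the parts from several threads before rejoining them:

import pycassa
from multiprocessing.dummy import Pool as ThreadPool  # thread-based pool

# Assumed keyspace/CF names; pool_size sized to cover the reader threads.
pool = pycassa.ConnectionPool('Blast', server_list=['127.0.0.1:9160'], pool_size=8)
files = pycassa.ColumnFamily(pool, 'Files')

CHUNK = 1024 * 1024  # 1 MB per column keeps every Thrift message small

def write_file(row_key, data):
    for n, i in enumerate(range(0, len(data), CHUNK), start=1):
        # one column per insert so no single mutation exceeds the frame size
        files.insert(row_key, {'Part%d' % n: data[i:i + CHUNK]})

def read_file(row_key, num_parts):
    names = ['Part%d' % n for n in range(1, num_parts + 1)]
    tp = ThreadPool(8)
    # one get() per part, issued in parallel, then rejoined in order
    parts = tp.map(lambda name: files.get(row_key, columns=[name])[name], names)
    tp.close()
    return b''.join(parts)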

I _think_ maybe the new CQL3-protocol does not have the same limitation, but I 
have never tried large columns there, so someone with more experience than me 
will have to confirm this.

.vegard,

- Original Message -
From:
user@cassandra.apache.org

To:
<user@cassandra.apache.org>
Cc:

Sent:
Mon, 21 Jan 2013 11:16:40 -0200
Subject:
How to store large columns?


We wish to store a column in a row with a size larger than
thrift_framed_transport_size_in_mb. But Thrift has a maximum frame size
configured by thrift_framed_transport_size_in_mb in cassandra.yaml.
So how can we store columns larger than
thrift_framed_transport_size_in_mb? Increasing this value does not solve the
problem, since we have columns of varying sizes.

--
Atenciosamente,
Sávio S. Teles de Oliveira
voice: +55 62 9136 6996
http://br.linkedin.com/in/savioteles
Mestrando em Ciências da Computação - UFG
Arquiteto de Software
Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG



--
Atenciosamente,
Sávio S. Teles de Oliveira
voice: +55 62 9136 6996
http://br.linkedin.com/in/savioteles
Mestrando em Ciências da Computação - UFG
Arquiteto de Software
Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG



--
Atenciosamente,
Sávio S. Teles de Oliveira
voice: +55 62 9136 6996
http://br.linkedin.com/in/savioteles
Mestrando em Ciências da Computação - UFG
Arquiteto de Software
Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG


RE: CQL3 Frame Length

2013-01-21 Thread Pierre Chalamet
Hi,
 
That's not a good reason imho. It would have been better to have chunks of
data (like in the good old IFF file format).
If the client is not able to read a chunk, it can just skip it. And frankly,
a few more bytes would not have killed us.
 
As an example, request tracing was added pretty late, and the additional
data landed not at the end, as might have been anticipated, but before the
body of the frame. This could have been handled transparently with a
chunk format. And OK, this was in rc2 and not in 1.2, so no regression was
officially introduced.
 
But well, it's v1 - there are still 0x7E more versions to get it better.
- Pierre
 
From: Theo Hultberg [mailto:t...@iconara.net] 
Sent: Saturday, January 19, 2013 6:33 PM
To: user@cassandra.apache.org
Subject: Re: CQL3 Frame Length
 
Hi,

Another reason for keeping the frame length in the header is that newer
versions can add fields to frames without older clients breaking. For
example, a minor release can append some more content to an existing frame.
If clients didn't know the full frame length (and were required by the
specification to consume all the bytes) there would be trailing garbage,
which would most likely crash the client.
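
To make the buffering point concrete, here is a minimal Python sketch of length-prefixed framing (not from the thread). The header layout shown (version, flags, stream, opcode, 4-byte big-endian body length) reflects my reading of the v1 spec; verify against the official native protocol spec before relying on it.

import struct

def read_exactly(sock, n):
    buf = b''
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise EOFError('connection closed mid-frame')
        buf += chunk
    return buf

def read_frame(sock):
    # pull the whole frame off the wire first; decoding can happen later
    # (possibly in parallel), and unknown trailing fields can simply be skipped
    version, flags, stream, opcode, length = struct.unpack('>BBBBI', read_exactly(sock, 8))
    body = read_exactly(sock, length)
    return opcode, stream, body

def write_frame(sock, version, flags, stream, opcode, body):
    # the writer needs the body size up front, hence "serialize into a buffer"
    header = struct.pack('>BBBBI', version, flags, stream, opcode, len(body))
    sock.sendall(header + body)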
 
T#

> Hey Sylvain,

> Thanks for explaining the rationale. When you look at it from the perspective
> of the use cases you mention, it makes sense to be able to supply the
> reader with the frame size up front.

> I've opted to go for serializing the frame into a buffer. Although this
> could materialize an arbitrarily large amount of memory, ultimately the
> driving application has control of the degree to which this can occur, so
> in the grander scheme of things, you can still maintain streaming semantics.

> Thanks for the heads up.

> Cheers,

> Ben
>

> On Tue, Jan 8, 2013 at 4:08 PM, Sylvain Lebresne <sylv...@datastax.com> wrote:

>> Mostly this is because having the frame length is convenient to have in
>> practice.
>>
>> Without pretending that there is only one way to write a server, it is
>> common to separate the phase "read a frame from the network" from the
>> phase "decode the frame", which is often simpler if you can read the
>> frame upfront. Also, if you don't have the frame size, it means you need
>> to decode the whole frame before being able to decode the next one, and
>> so you can't parallelize the decoding.
>>
>> It is true however that it means for the write side that you need to
>> either be able to pre-compute the frame body size or to serialize it in
>> memory first. That's a trade-off for making it easier on the read side.
>> But if you want my opinion, on the write side too it's probably worth
>> parallelizing the message encoding (which requires you to encode it in
>> memory first) since it's an asynchronous protocol and so there will
>> likely be multiple writers simultaneously.
>>
>> --
>> Sylvain
>>
>>
>>
>> On Tue, Jan 8, 2013 at 12:48 PM, Ben Hood <0x6e6...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I've read the CQL wire specification and naively, I can't see how the
>>> frame length header is used.
>>>
>>> To me, it looks like on the read side, you know which type of structures
>>> to expect based on the opcode and each structure is TLV encoded.
>>>
>>> On the write side, you need to encode TLV structures as well, but you
>>> don't know the overall frame length until you've encoded it. So it would
>>> seem that you either need to pre-calculate the cumulative TLV size before
>>> you serialize the frame body, or you serialize the frame body to a buffer
>>> which you can then get the size of and then write to the socket, after
>>> having first written the count out.
>>>
>>> Is there potentially an implicit assumption that the reader will want to
>>> pre-buffer the entire frame before decoding it?
>>>
>>> Cheers,
>>>
>>> Ben
>>>
>>
>>


Re: Cassandra Performance Benchmarking.

2013-01-21 Thread Pradeep Kumar Mantha
Hi,

Thanks for the information..

I upgraded my cassandra version to 1.2.0 and tried running the
experiment again to find the statistics.

My application took nearly 529 seconds for querying 76896 keys.

Please find the statistics below for 32 threads (where each thread queries
76896 keys), obtained just after the experiment.

(mypython_repo)-bash-3.2$ nodetool -host XX.XX.XX.XX -p 7199 proxyhistograms
proxy histograms
Offset  Read Latency Write Latency Range Latency
1                  0             0             0
2                  0             0             0
3                  0             0             0
4                  0             0             0
5                  0             0             0
6                  0             0             0
7                  0             0             0
8                  0             0             0
10                 0             0             0
12                 0             0             0
14                 0             0             0
17                 0             0             0
20                 0             0             0
24                 0             0             0
29                 0             0             0
35                 0             0             0
42                 0             0             0
50                 0             0             0
60                 0             0             0
72                 0             0             0
86                 0             0             0
103                0             0             0
124                0             0             0
149                0             0             0
179                0             0             0
215                0             0             0
258                0             0             0
310                2             0             0
372              233             0             0
446             7291             0             0
535             9669             0             0
642            34917             0             0
770            73709             0             0
924            45270             0             0
1109           18186             0             0
1331            6931             0             0
1597            2111             0             0
1916             661             0             0
2299             285             0             0
2759             123             0             0
3311              56             0             0
3973              47             0             0
4768              45             0             0
5722              42             0             0
6866              43             0             0
8239              60             0             0
9887              41             0             0
11864             42             0             0
14237             32             0             0
17084             50             0             0
20501             51             0             0
24601             55             0             0
29521             43             0             0
35425             26             0             0
42510             30             0             0
51012             37             0             0
61214             46             0             0
73457             60             0             0
88148            106             0             0
105778           127             0             0
126934           168             0             0
152321           110             0             0
182785            71             0             0
219342            22             0             0
263210            10             0             0
315852             2             0             0
379022             2             0             0
454826             5             0             0
545791             0             0             0
654949 

Re: sstable2json had random behavior

2013-01-21 Thread Binh Nguyen
Hi William,

I also saw this one before, but in my case it always happened when I had
only the Data and Index files. The problem goes away when I have all the other
files (Compression, Filter, ...).


On Mon, Jan 21, 2013 at 11:36 AM, William Oberman
wrote:

> I'm running 1.1.6 from the datastax repo.
>
> I ran sstable2json and got the following error:
> Exception in thread "main" java.io.IOError: java.io.IOException: dataSize
> of 7020023552240793698 starting at 993981393 would be larger than file
> /var/lib/cassandra/data/X-Data.db length 7502161255
> at
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:156)
> at
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:86)
> at
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:70)
> at
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:187)
> at
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:151)
> at
> org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:143)
> at
> org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:309)
> at
> org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:340)
> at
> org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:353)
> at
> org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:418)
> Caused by: java.io.IOException: dataSize of 7020023552240793698 starting
> at 993981393 would be larger than file
> /var/lib/cassandra/data/X-Data.db length 7502161255
> at
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:115)
> ... 9 more
>
>
> I ran it again, and didn't.  This makes me worried :-)  Does anyone else
> ever see this class of error, and does it ever disappear for them?
>


sstable2json had random behavior

2013-01-21 Thread William Oberman
I'm running 1.1.6 from the datastax repo.

I ran sstable2json and got the following error:
Exception in thread "main" java.io.IOError: java.io.IOException: dataSize
of 7020023552240793698 starting at 993981393 would be larger than file
/var/lib/cassandra/data/X-Data.db length 7502161255
at
org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:156)
at
org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:86)
at
org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:70)
at
org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:187)
at
org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:151)
at
org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:143)
at
org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:309)
at
org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:340)
at
org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:353)
at
org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:418)
Caused by: java.io.IOException: dataSize of 7020023552240793698 starting at
993981393 would be larger than file /var/lib/cassandra/data/X-Data.db
length 7502161255
at
org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:115)
... 9 more


I ran it again, and didn't.  This makes me worried :-)  Does anyone else
ever see this class of error, and does it ever disappear for them?


Re: How to store large columns?

2013-01-21 Thread Vegard Berget
I think the main difference is that by splitting across multiple rows, you
will get the data evenly distributed over multiple nodes. For large data this is
probably better.

.vegard,

Sávio Teles :

>Astyanax splits large objects into multiple keys. Is it a good idea? Or is it
>better to split into multiple columns?
>
>Thanks
>
>2013/1/21 Sávio Teles 
>
>>
>> Thanks Keith Wright.
>>
>>
>> 2013/1/21 Keith Wright 
>>
>>> This may be helpful:
>>> https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store
>>>
>>> From: Vegard Berget 
>>> Reply-To: "user@cassandra.apache.org" ,
>>> Vegard Berget 
>>> Date: Monday, January 21, 2013 8:35 AM
>>> To: "user@cassandra.apache.org" 
>>> Subject: Re: How to store large columns?
>>>
>>>
>>>
>>> Hi,
>>>
>>> You could split it into multiple columns on the client side:
>>> RowKeyData: Part1: [1mb], Part2: [1mb], Part3: [1mb]...PartN[1mb]
>>>
>>> Now you can use multiple get() in parallel to get the files back and
>>> then join them back to one file.
>>>
>>> I _think_ maybe the new CQL3-protocol does not have the same limitation,
>>> but I have never tried large columns there, so someone with more experience
>>> than me will have to confirm this.
>>>
>>> .vegard,
>>>
>>>
>>> - Original Message -
>>> From:
>>> user@cassandra.apache.org
>>>
>>> To:
>>> 
>>> Cc:
>>>
>>> Sent:
>>> Mon, 21 Jan 2013 11:16:40 -0200
>>> Subject:
>>> How to store large columns?
>>>
>>>
>>> We wish to store a column in a row with a size larger than
>>> thrift_framed_transport_size_in_mb. But Thrift has a maximum frame size
>>> configured by thrift_framed_transport_size_in_mb in cassandra.yaml.
>>> So how can we store columns larger than
>>> thrift_framed_transport_size_in_mb? Increasing this value does not solve the
>>> problem, since we have columns of varying sizes.
>>>
>>> --
>>> Atenciosamente,
>>> Sávio S. Teles de Oliveira
>>> voice: +55 62 9136 6996
>>> http://br.linkedin.com/in/savioteles
>>>  Mestrando em Ciências da Computação - UFG
>>> Arquiteto de Software
>>> Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG
>>>
>>>
>>
>>
>> --
>> Atenciosamente,
>> Sávio S. Teles de Oliveira
>> voice: +55 62 9136 6996
>> http://br.linkedin.com/in/savioteles
>>  Mestrando em Ciências da Computação - UFG
>> Arquiteto de Software
>> Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG
>>
>
>
>
>-- 
>Atenciosamente,
>Sávio S. Teles de Oliveira
>voice: +55 62 9136 6996
>http://br.linkedin.com/in/savioteles
>Mestrando em Ciências da Computação - UFG
>Arquiteto de Software
>Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG


Re: How to store large columns?

2013-01-21 Thread Sávio Teles
Astyanax splits large objects into multiple keys. Is it a good idea? Or is it
better to split into multiple columns?

Thanks

2013/1/21 Sávio Teles 

>
> Thanks Keith Wright.
>
>
> 2013/1/21 Keith Wright 
>
>> This may be helpful:
>> https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store
>>
>> From: Vegard Berget 
>> Reply-To: "user@cassandra.apache.org" ,
>> Vegard Berget 
>> Date: Monday, January 21, 2013 8:35 AM
>> To: "user@cassandra.apache.org" 
>> Subject: Re: How to store large columns?
>>
>>
>>
>> Hi,
>>
>> You could split it into multiple columns on the client side:
>> RowKeyData: Part1: [1mb], Part2: [1mb], Part3: [1mb]...PartN[1mb]
>>
>> Now you can use multiple get() in parallel to get the files back and
>> then join them back to one file.
>>
>> I _think_ maybe the new CQL3-protocol does not have the same limitation,
>> but I have never tried large columns there, so someone with more experience
>> than me will have to confirm this.
>>
>> .vegard,
>>
>>
>> - Original Message -
>> From:
>> user@cassandra.apache.org
>>
>> To:
>> 
>> Cc:
>>
>> Sent:
>> Mon, 21 Jan 2013 11:16:40 -0200
>> Subject:
>> How to store large columns?
>>
>>
>> We wish to store a column in a row with a size larger than
>> thrift_framed_transport_size_in_mb. But Thrift has a maximum frame size
>> configured by thrift_framed_transport_size_in_mb in cassandra.yaml.
>> So how can we store columns larger than
>> thrift_framed_transport_size_in_mb? Increasing this value does not solve the
>> problem, since we have columns of varying sizes.
>>
>> --
>> Atenciosamente,
>> Sávio S. Teles de Oliveira
>> voice: +55 62 9136 6996
>> http://br.linkedin.com/in/savioteles
>>  Mestrando em Ciências da Computação - UFG
>> Arquiteto de Software
>> Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG
>>
>>
>
>
> --
> Atenciosamente,
> Sávio S. Teles de Oliveira
> voice: +55 62 9136 6996
> http://br.linkedin.com/in/savioteles
>  Mestrando em Ciências da Computação - UFG
> Arquiteto de Software
> Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG
>



-- 
Atenciosamente,
Sávio S. Teles de Oliveira
voice: +55 62 9136 6996
http://br.linkedin.com/in/savioteles
Mestrando em Ciências da Computação - UFG
Arquiteto de Software
Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG


RE: High read and write throughput

2013-01-21 Thread Viktor Jevdokimov
For such a generic question without technical details or requirements, the
answer is: use the defaults.


Best regards / Pagarbiai
Viktor Jevdokimov
Senior Developer

Email: viktor.jevdoki...@adform.com
Phone: +370 5 212 3063, Fax +370 5 261 0453
J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
Follow us on Twitter: @adforminsider
Take a ride with Adform's Rich Media Suite





From: Jay Svc [mailto:jaytechg...@gmail.com]
Sent: Monday, January 21, 2013 17:31
To: user@cassandra.apache.org
Subject: High read and write throughput


Folks,



For the given situation I am expecting multiple read and write requests to the same
cluster. What primary design or configuration considerations should we make?



Any thoughts or links to such documentation are appreciated.



Thanks,

Jay


RE: Concurrent write performance

2013-01-21 Thread Viktor Jevdokimov
Do you experience any performance problems?

This will be the last thing to look at.


Best regards / Pagarbiai
Viktor Jevdokimov
Senior Developer

Email: viktor.jevdoki...@adform.com
Phone: +370 5 212 3063, Fax +370 5 261 0453
J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
Follow us on Twitter: @adforminsider
Take a ride with Adform's Rich Media Suite





From: Jay Svc [mailto:jaytechg...@gmail.com]
Sent: Monday, January 21, 2013 17:28
To: user@cassandra.apache.org
Subject: Concurrent write performance


Folks,



I would like to write (insert or update) to a single row in a column family. I
have concurrent requests which will write to a single row. Do we see any
performance implications because of concurrent writes to a single row, where the
comparator has to sort the columns at the same time?



Please share your thoughts.



Thanks,

Jay

High read and write throughput

2013-01-21 Thread Jay Svc
Folks,


For the given situation I am expecting multiple read and write requests to the
same cluster. What primary design or configuration considerations should we
make?


Any thoughts or links to such documentation are appreciated.


Thanks,

Jay


Concurrent write performance

2013-01-21 Thread Jay Svc
Folks,


I would like to write (insert or update) to a single row in a column family.
I have concurrent requests which will write to a single row. Do we see any
performance implications because of concurrent writes to a single row, where the
comparator has to sort the columns at the same time?


Please share your thoughts.


Thanks,

Jay


Re: How to store large columns?

2013-01-21 Thread Sávio Teles
Thanks Keith Wright.

2013/1/21 Keith Wright 

> This may be helpful:
> https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store
>
> From: Vegard Berget 
> Reply-To: "user@cassandra.apache.org" , Vegard
> Berget 
> Date: Monday, January 21, 2013 8:35 AM
> To: "user@cassandra.apache.org" 
> Subject: Re: How to store large columns?
>
>
>
> Hi,
>
> You could split it into multiple columns on the client side:
> RowKeyData: Part1: [1mb], Part2: [1mb], Part3: [1mb]...PartN[1mb]
>
> Now you can use multiple get() in parallel to get the files back and then
> join them back to one file.
>
> I _think_ maybe the new CQL3-protocol does not have the same limitation,
> but I have never tried large columns there, so someone with more experience
> than me will have to confirm this.
>
> .vegard,
>
>
> - Original Message -
> From:
> user@cassandra.apache.org
>
> To:
> 
> Cc:
>
> Sent:
> Mon, 21 Jan 2013 11:16:40 -0200
> Subject:
> How to store large columns?
>
>
> We wish to store a column in a row with a size larger than
> thrift_framed_transport_size_in_mb. But Thrift has a maximum frame size
> configured by thrift_framed_transport_size_in_mb in cassandra.yaml.
> So how can we store columns larger than
> thrift_framed_transport_size_in_mb? Increasing this value does not solve the
> problem, since we have columns of varying sizes.
>
> --
> Atenciosamente,
> Sávio S. Teles de Oliveira
> voice: +55 62 9136 6996
> http://br.linkedin.com/in/savioteles
> Mestrando em Ciências da Computação - UFG
> Arquiteto de Software
> Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG
>
>


-- 
Atenciosamente,
Sávio S. Teles de Oliveira
voice: +55 62 9136 6996
http://br.linkedin.com/in/savioteles
Mestrando em Ciências da Computação - UFG
Arquiteto de Software
Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG


Re: How to store large columns?

2013-01-21 Thread Keith Wright
This may be helpful:  
https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store

From: Vegard Berget <p...@fantasista.no>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>, Vegard Berget <p...@fantasista.no>
Date: Monday, January 21, 2013 8:35 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: How to store large columns?



Hi,

You could split it into multiple columns on the client side:
RowKeyData: Part1: [1mb], Part2: [1mb], Part3: [1mb]...PartN[1mb]

Now you can use multiple get() in parallel to get the files back and then join
them back to one file.

I _think_ maybe the new CQL3-protocol does not have the same limitation, but I 
have never tried large columns there, so someone with more experience than me 
will have to confirm this.

.vegard,

- Original Message -
From:
user@cassandra.apache.org

To:
<user@cassandra.apache.org>
Cc:

Sent:
Mon, 21 Jan 2013 11:16:40 -0200
Subject:
How to store large columns?


We wish to store a column in a row with a size larger than
thrift_framed_transport_size_in_mb. But Thrift has a maximum frame size
configured by thrift_framed_transport_size_in_mb in cassandra.yaml.
So how can we store columns larger than
thrift_framed_transport_size_in_mb? Increasing this value does not solve the
problem, since we have columns of varying sizes.

--
Atenciosamente,
Sávio S. Teles de Oliveira
voice: +55 62 9136 6996
http://br.linkedin.com/in/savioteles
Mestrando em Ciências da Computação - UFG
Arquiteto de Software
Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG


Re: Cassandra timeout whereas it is not much busy

2013-01-21 Thread Nicolas Lalevée
On 17 Jan 2013, at 05:00, aaron morton wrote:

> Check the disk utilisation using iostat -x 5
> If you are on a VM / in the cloud check for CPU steal. 
> Check the logs for messages from the GCInspector, the ParNew events are times 
> the JVM is paused. 

I have seen logs about that. I didn't worry much, since the GC of the jvm was
not under pressure. As far as I understand, unless a CF is "continuously"
flushed, it should not be a major issue, should it?
I don't know for sure whether there were a lot of flushes, though, since my nodes
were not properly monitored.

> Look at the times dropped messages are logged and try to correlate them with 
> other server events.

I tried that without much success. I do have graphs in cacti, but it is quite
hard to visualize when things happen simultaneously across several graphs.

> If you have a lot secondary indexes, or a lot of memtables flushing at the 
> some time you may be blocking behind the global Switch Lock. If you use 
> secondary indexes make sure the memtable_flush_queue_size is set correctly, 
> see the comments in the yaml file.

I have no secondary indexes.

> If you have a lot of CF's flushing at the same time, and there are not 
> messages from the "MeteredFlusher", it may be the log segment is too big for 
> the number of CF's you have. When the segment needs to be recycled all dirty 
> CF's are flushed, if you have a lot of cf's this can result in blocking 
> around the switch lock. Trying reducing the commitlog_segment_size_in_mb so 
> that less CF's are flushed.

What is "a lot" ? We have 26 CF. 9 are barely used. 15 contains time series 
data (cassandra rocks with them) in which only 3 of them have from 1 to 10 read 
or writes per sec. 1 quite hot (200read/s) which is mainly used for its bloom 
filter (which "disksize" is about 1G). And 1 also hot used only for writes 
(which has the same big bloom filter, which I am about to remove since it is 
useless).

BTW, thanks for the pointers. I have not yet tried to put our nodes under
pressure, but when I do, I'll look at those pointers closely.

Nicolas

> 
> Hope that helps
>  
> -
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 17/01/2013, at 10:30 AM, Nicolas Lalevée  
> wrote:
> 
>> Hi,
>> 
>> I have a strange behavior I am not able to understand.
>> 
>> I have 6 nodes with cassandra-1.0.12. Each nodes have 8G of RAM. I have a 
>> replication factor of 3.
>> 
>> ---
>> my story is maybe too long, so I'm trying a shorter version here, while keeping
>> what I wrote in case someone has the patience to read my bad english ;)
>> 
>> I got into a situation where my cluster was generating a lot of timeouts on
>> our frontend, whereas I could not see any major trouble in the internal
>> stats. Actually cpu, read & write counts on the column families were quite
>> low. It was a mess until I switched from java7 to java6 and forced the use of
>> jamm. After the switch, cpu and read & write counts went up again and the
>> timeouts were gone. I have also seen this behavior while reducing the xmx.
>> 
>> What could be blocking cassandra from utilizing the full resources of the
>> machine? Are there metrics I didn't see which could explain this?
>> 
>> ---
>> Here is the long story.
>> 
>> When I first set my cluster up, I blindly gave 6G of heap to the cassandra
>> nodes, thinking that the more a java process has, the smoother it runs, while
>> keeping some RAM for the disk cache. We got a new feature deployed, and
>> things went to hell, with some machines up to 60% wa. I give credit to
>> cassandra because there were not that many timeouts received on the web
>> frontend; it was kind of slow, but it was kind of working. With some
>> optimizations, we reduced the pressure of the new feature, but it was still
>> at 40% wa.
>> 
>> At that time I didn't have much monitoring, just heap and cpu. I read some
>> articles on how to tune, and I learned that the disk cache is quite important
>> because cassandra relies on it as the read cache. So I tried many xmx values,
>> and 3G seems to be about the lowest possible. So on 2 of the 6 nodes, I set
>> the xmx to 3.3G. Amazingly, I saw the wa go down to 10%. Quite happy with that,
>> I changed the xmx to 3.3G on every node. But then things really went to hell,
>> with a lot of timeouts on the frontend. It was not working at all. So I rolled back.
>> 
>> After some time, probably because the data of the new feature grew to its
>> nominal size, things went again to very high %wa, and cassandra was not
>> able to keep up. So we kind of reverted the feature; the column family is
>> still used, but only by one thread on the frontend. The wa was reduced to
>> 20%, but things continued to not work properly: from time to time, a
>> bunch of timeouts are raised on our frontend.
>> 
>> In the mean time, I took time to do some proper monitoring of cassandra: 
>> column family read & wr

Re: How to store large columns?

2013-01-21 Thread Vegard Berget
 

Hi,

You could split it into multiple columns on the client side:  
RowKeyData: Part1: [1mb], Part2: [1mb], Part3: [1mb]...PartN[1mb]

Now you can use multiple get() in parallel to get the files back and
then join them back to one file.

I _think_ maybe the new CQL3-protocol does not have the same
limitation, but I have never tried large columns there, so someone
with more experience than me will have to confirm this.

.vegard,
- Original Message -
From: user@cassandra.apache.org
To:
Cc:
Sent: Mon, 21 Jan 2013 11:16:40 -0200
Subject: How to store large columns?

We wish to store a column in a row with a size larger than
thrift_framed_transport_size_in_mb. But Thrift has a maximum frame
size configured by thrift_framed_transport_size_in_mb in
cassandra.yaml.
So how can we store columns larger than
thrift_framed_transport_size_in_mb? Increasing this value does not
solve the problem, since we have columns of varying sizes.

-- 
Atenciosamente,
Sávio S. Teles de Oliveira
voice: +55 62 9136 6996
http://br.linkedin.com/in/savioteles
 Mestrando em Ciências da Computação - UFG 
Arquiteto de Software
 Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG  




How to store large columns?

2013-01-21 Thread Sávio Teles
We wish to store a column in a row with a size larger than
thrift_framed_transport_size_in_mb. But Thrift has a maximum frame size
configured by thrift_framed_transport_size_in_mb in cassandra.yaml.
So how can we store columns larger than
thrift_framed_transport_size_in_mb? Increasing this value does not solve the
problem, since we have columns of varying sizes.

-- 
Atenciosamente,
Sávio S. Teles de Oliveira
voice: +55 62 9136 6996
http://br.linkedin.com/in/savioteles
Mestrando em Ciências da Computação - UFG
Arquiteto de Software
Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG


Re: Efficiency between SimpleStrategy and NetworkTopologyStrategy

2013-01-21 Thread Francisco Sobral
Thanks!

Francisco Sobral.

On Jan 21, 2013, at 5:55 AM, aaron morton  wrote:

> Use the NetworkTopologyStrategy, it's the default and it saves a lot of 
> trouble later. 
> 
> There is no real performance difference between NTS and SS. The NTS uses the 
> information provided by the snitch, it does not perform any network access. 
> 
> Cheers
> 
> -
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 19/01/2013, at 7:25 AM, shashwat shriparv  
> wrote:
> 
>> Network topology comes into the picture when you have huge amounts of data to
>> move... you need to consider the network aspects too...
>> 
>> ∞
>> Shashwat Shriparv
>> 
>> 
>> 
>> On Fri, Jan 18, 2013 at 11:48 PM, Francisco Sobral  
>> wrote:
>> Dear friends,
>> 
>> We have only one datacenter with 4 nodes and no plans to have more 
>> datacenters.
>> With respect to the replication strategy, in this case, will SimpleStrategy
>> be more efficient than NetworkTopologyStrategy, since the latter
>> performs an additional search for different racks?
>> 
>> Best regards,
>> Francisco Sobral
>> 
> 



AW: Cassandra at Amazon AWS

2013-01-21 Thread Roland Gude
On a side note:
If you are going for priam AND you are using LeveledCompaction think carefully 
whether you need incremental backups. The s3 upload cost can be very high 
because Leveled Compaction tends to create a lot of files and each put request 
to s3 costs money. We had this setup in relatively small cluster of 4 nodes 
where the switch to leveledcompaction increased backup cost by 800 Euro a month.

Greetings
Roland

From: Roland Gude [mailto:roland.g...@ez.no]
Sent: Friday, 18 January 2013 09:23
To: user@cassandra.apache.org
Subject: AW: Cassandra at Amazon AWS

Priam is good for backups, but it is another complex (though very good) part of a
software stack.
A simple solution is to take regular snapshots (via cron), compress them, and put
them into S3.
On S3 you can simply choose how many days the files are kept.

This can be done with a couple of lines of shell script and a simple crontab
entry.
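
A rough Python sketch of that cron job follows; the keyspace name, paths, bucket, boto-based upload, and the 1.1+ per-CF snapshot directory layout are all assumptions for illustration, not a tested production script.

#!/usr/bin/env python
# Rough sketch: snapshot the keyspace, archive the snapshot dirs, upload to S3.
import datetime
import glob
import subprocess
import boto  # assumed; any S3 upload tool would do

KEYSPACE = 'Blast'                      # assumed keyspace name
DATA_DIR = '/var/lib/cassandra/data'    # default data directory
BUCKET = 'my-cassandra-backups'         # assumed bucket with a lifecycle rule

tag = datetime.date.today().strftime('backup-%Y%m%d')

# 1. take a snapshot (cheap hard links) for the keyspace
subprocess.check_call(['nodetool', 'snapshot', '-t', tag, KEYSPACE])

# 2. tar+gzip the snapshot directories (one per column family)
snap_dirs = glob.glob('%s/%s/*/snapshots/%s' % (DATA_DIR, KEYSPACE, tag))
archive = '/tmp/%s.tar.gz' % tag
subprocess.check_call(['tar', 'czf', archive] + snap_dirs)

# 3. ship the archive to S3; the bucket's lifecycle rule expires old backups
key = boto.connect_s3().get_bucket(BUCKET).new_key('%s.tar.gz' % tag)
key.set_contents_from_filename(archive)

# 4. clear snapshots so the hard links do not pin old SSTables on disk
#    (this removes all snapshots for the keyspace)
subprocess.check_call(['nodetool', 'clearsnapshot', KEYSPACE])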

From: Marcelo Elias Del Valle [mailto:mvall...@gmail.com]
Sent: Friday, 18 January 2013 04:53
To: user@cassandra.apache.org
Subject: Re: Cassandra at Amazon AWS

Everyone, thanks a lot for the answer, they helped me a lot.

2013/1/17 Andrey Ilinykh <ailin...@gmail.com>
I'd recommend Priam.

http://techblog.netflix.com/2012/02/announcing-priam.html

Andrey

On Thu, Jan 17, 2013 at 5:44 AM, Adam Venturella <aventure...@gmail.com> wrote:
Jared, how do you guys handle data backups for your ephemeral based cluster?

I'm trying to move to ephemeral drives myself, and that was my last sticking 
point; asking how others in the community deal with backup in case the VM 
explodes.


On Wed, Jan 16, 2013 at 1:21 PM, Jared Biel <jared.b...@bolderthinking.com> wrote:
We're currently using Cassandra on EC2 at very low scale (a 2 node
cluster on m1.large instances in two regions.) I don't believe that
EBS is recommended for performance reasons. Also, it's proven to be
very unreliable in the past (most of the big/notable AWS outages were
due to EBS issues.) We've moved 99% of our instances off of EBS.

As other have said, if you require more space in the future it's easy
to add more nodes to the cluster. I've found this page
(http://www.ec2instances.info/) very useful in determining the amount
of space each instance type has. Note that by default only one
ephemeral drive is attached and you must specify all ephemeral drives
that you want to use at launch time. Also, you can create a RAID 0 of
all local disks to provide maximum speed and space.


On 16 January 2013 20:42, Marcelo Elias Del Valle <mvall...@gmail.com> wrote:
> Hello,
>
>I am currently using hadoop + cassandra at amazon AWS. Cassandra runs on
> EC2 and my hadoop process runs at EMR. For cassandra storage, I am using
> local EC2 EBS disks.
>My system is running fine for my tests, but to me it's not a good setup
> for production. I need my system to perform well, especially for writes on
> cassandra, but the amount of data could grow really big, taking several TB
> of total storage.
> My first guess was using S3 as storage, and I saw this can be done by
> using the Cloudian package, but I wouldn't like to become dependent on a
> pre-packaged solution, and I found it's kind of expensive for more than 100TB:
> http://www.cloudian.com/pricing.html
> I saw some discussion at internet about using EBS or ephemeral disks for
> storage at Amazon too.
>
> My question is: does someone on this list have the same problem as me?
> What are you using as solution to Cassandra's storage when running it at
> Amazon AWS?
>
> Any thoughts would be highly appreciatted.
>
> Best regards,
> --
> Marcelo Elias Del Valle
> http://mvalle.com - @mvallebr





--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr


Re: Cassandra pending compaction tasks keeps increasing

2013-01-21 Thread aaron morton
The main guarantee LCS gives you is that most reads will only touch 1 SSTable:
http://www.datastax.com/dev/blog/when-to-use-leveled-compaction

If compaction is falling behind this may not hold.

nodetool cfhistograms tells you how many SSTables were read from for reads.  
It's a recent histogram that resets each time you read from it. 

Also, parallel levelled compaction in 1.2 
http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 20/01/2013, at 7:49 AM, Jim Cistaro  wrote:

> 1) In addition to iostat, dstat is a good tool to see what kind of disk
> throughput you are getting.  That would be one thing to monitor.
> 2) For LCS, we also see pending compactions skyrocket.  During load, LCS will 
> create a lot of small sstables which will queue up for compaction.
> 3) For us the biggest concern is not how high the pending count gets, but how 
> often it gets back down near zero.  If your load is something you can do in 
> segments or pause, then you can see how fast the cluster recovers on the 
> compactions.
> 4) One thing which we tune per cluster is the size of the files.  Increasing 
> this from 5MB can sometimes improve things.  But I forget if we have ever 
> changed this after starting data load.
> 
> Is your cluster receiving read traffic during this data migration? If so, I 
> would say that read latency is your best measure.  If the high number of 
> SSTables waiting to compact is not hurting your reads, then you are probably 
> ok.  Since you are on SSD, there is a good chance the compactions are not 
> hurting you.  As for compactionthroughput, we set ours high for SSD.  You 
> usually won't use it all because the compactions are usually single threaded.
> Dstat will help you measure this.
> 
> I hope this helps,
> jc
> 
> From: Wei Zhu 
> Reply-To: "user@cassandra.apache.org" , Wei Zhu 
> 
> Date: Friday, January 18, 2013 12:10 PM
> To: Cassandr usergroup 
> Subject: Cassandra pending compaction tasks keeps increasing
> 
> Hi,
> When I run nodetool compactionstats
> 
> I see the number of pending tasks keep going up steadily. 
> 
> I tried to increase the  compactionthroughput, by using
> 
> nodetool setcompactionthroughput
> 
> I even tried the extreme to set it to 0 to disable the throttling. 
> 
> I checked iostats and we have SSD for data, the disk util is less than 5% 
> which means it's not I/O bound, CPU is also less than 10%
> 
> We are using levelcompaction and in the process of migrating data. We have 
> 4500 writes per second and very few reads. We have about 70G data now and 
> will grow to 150G when the migration finishes. We only have one CF and right 
> now the number of  SSTable is around 15000, write latency is still under 
> 0.1ms. 
> 
> Anything needs to be concerned? Or anything I can do to reduce the number of 
> pending compaction?
> 
> Thanks.
> -Wei
> 
> 



Re: Cassandra Performance Benchmarking.

2013-01-21 Thread aaron morton
You can also see what it looks like from the server side. 

nodetool proxyhistograms will show you full request latency recorded by the 
coordinator. 
nodetool cfhistograms will show you the local read latency, this is just the 
time it takes to read data on a replica and does not include network or wait 
times. 

If proxyhistograms shows most requests running faster than your app
reports, then the bottleneck is in your app.

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 19/01/2013, at 8:16 AM, Tyler Hobbs  wrote:

> The fact that it's still exactly 521 seconds is very suspicious.  I can't 
> debug your script over the mailing list, but do some sanity checks to make 
> sure there's not a bottleneck somewhere you don't expect.
> 
> 
> On Fri, Jan 18, 2013 at 12:44 PM, Pradeep Kumar Mantha  
> wrote:
> Hi,
> 
> Thanks Tyler.
> 
> Below is the *global* connection pool I am trying to use, where the
> server_list contains the IPs of all 12 data nodes I am using,
> pool_size is the number of threads, and I just set the timeout to 60 to
> avoid connection retry errors.
> 
> pool = pycassa.ConnectionPool('Blast',
> server_list=server_list,pool_size=32,timeout=60)
> 
> 
> It seems the performance is still stuck at 521 seconds.. which is 177
> seconds for cassandra-cli.
> 
> Am I still missing something?
> 
> thanks
> Pradeep
> 
> 
> 
> On Fri, Jan 18, 2013 at 7:12 AM, Tyler Hobbs  wrote:
> > You just need to increase the ConnectionPool size to handle the number of
> > threads you have using it concurrently.  Set the pool_size kwarg to at least
> > the number of threads you're using.
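
[A minimal sketch of that advice, not from the thread; the keyspace/CF names, hosts, and key layout are assumptions: one global pool shared by every thread, with pool_size at least equal to the number of querying threads.]

import threading
import pycassa

NUM_THREADS = 32
# one pool for the whole process, sized to cover all querying threads
pool = pycassa.ConnectionPool('Blast',
                              server_list=['node1:9160', 'node2:9160'],
                              pool_size=NUM_THREADS, timeout=60)
cf = pycassa.ColumnFamily(pool, 'Blast_CF')  # shared; pycassa CFs are thread-safe

keys = ['key%d' % i for i in range(76896)]              # placeholder keys
chunks = [keys[i::NUM_THREADS] for i in range(NUM_THREADS)]

def worker(my_keys):
    for key in my_keys:
        cf.get(key)

threads = [threading.Thread(target=worker, args=(chunk,)) for chunk in chunks]
for t in threads:
    t.start()
for t in threads:
    t.join()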
> >
> >
> > On Thu, Jan 17, 2013 at 6:46 PM, Pradeep Kumar Mantha 
> > wrote:
> >>
> >> Thanks Tyler.
> >>
> >> I just moved the pool and cf which store the connection pool and CF
> >> information to have global scope.
> >>
> >> Increased the server_list values from 1 to 4. ( i think i can increase
> >> them max to 12 since I have 12 data nodes )
> >>
> >> when I created 8 threads  using python threading package , I see the
> >> below error.
> >>
> >> Exception in thread Thread-3:
> >> Traceback (most recent call last):
> >>   File
> >> "/usr/common/usg/python/2.7.1-20110310/lib64/python2.7/threading.py",
> >> line 530, in __bootstrap_inner
> >> self.run()
> >>   File "my_cc.py", line 20, in run
> >> start_cassandra_client(self.name)
> >>   File "my_cc.py", line 33, in start_cassandra_client
> >> cf.get(key)
> >>   File
> >> "/global/homes/p/pmantha/mypython_repo/lib/python2.7/site-packages/pycassa/columnfamily.py",
> >> line 652, in get
> >> read_consistency_level or self.read_consistency_level)
> >>   File
> >> "/global/homes/p/pmantha/mypython_repo/lib/python2.7/site-packages/pycassa/pool.py",
> >> line 553, in execute
> >> conn = self.get()
> >>   File
> >> "/global/homes/p/pmantha/mypython_repo/lib/python2.7/site-packages/pycassa/pool.py",
> >> line 536, in get
> >> raise NoConnectionAvailable(message)
> >> NoConnectionAvailable: ConnectionPool limit of size 5 reached, unable
> >> to obtain connection after 30 seconds
> >>
> >>
> >> Please have a look at the script attached.. and let me know if I need
> >> to change something.. Please bear with me, if I do something terribly
> >> wrong..
> >>
> >> I am running the script on a 8 processor node.
> >>
> >> thanks
> >> pradeep
> >>
> >> On Thu, Jan 17, 2013 at 4:18 PM, Tyler Hobbs  wrote:
> >> > ConnectionPools and ColumnFamilies are thread-safe in pycassa, and it's
> >> > best
> >> > to share them across multiple threads.  Of course, when you do that,
> >> > make
> >> > sure to make the ConnectionPool large enough to support all of the
> >> > threads
> >> > making queries concurrently.  I'm also not sure if you're just omitting
> >> > this, but pycassa's ConnectionPool will only open connections to servers
> >> > you
> >> > explicitly include in server_list; there's no autodiscovery of other
> >> > nodes
> >> > going on.
> >> >
> >> > Depending on your network latency, you'll top out on python performance
> >> > with
> >> > a fairly low number of threads due to the GIL.  It's best to use
> >> > multiple
> >> > processes if you really want to benchmark something.
> >> >
> >> >
> >> > On Thu, Jan 17, 2013 at 6:05 PM, Pradeep Kumar Mantha
> >> > 
> >> > wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> Thanks. I would like to benchmark cassandra with our application so
> >> >> that we understand the details of how the actual benchmarking is done.
> >> >> Not sure, how easy it would be to integrate YCSB with our application.
> >> >>
> >> >> So, i am trying different client interfaces to cassandra.
> >> >>
> >> >> I found
> >> >>
> >> >> for 12 Data Nodes Cassandra cluster and 1 Client Node which run 32
> >> >> threads ( each querying X number of queries ).
> >> >>
> >> >> cassandra-cli took 133 seconds
> >> >> pycassa took 521 seconds.
> >> >>
> >> >> Here is the python pycassa code used to query and