Re: Cassandra (2.0.4) pagination and total records?

2014-03-18 Thread DuyHai Doan
With Cassandra 2.0.x and Java driver 2.0.0 you can set the fetch size on
the query and then use ResultSet.iterator(). It will iterate over your data
set by loading batches of size = fetch size.
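
For anyone looking for the concrete calls, a minimal sketch with the Java driver 2.0 might look like the following (the contact point, keyspace, table and column names are placeholders, not something from this thread):

  import com.datastax.driver.core.*;

  Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
  Session session = cluster.connect();

  // The driver transparently fetches the next "page" of fetchSize rows
  // as the iterator advances past the rows already received.
  Statement stmt = new SimpleStatement(
      "SELECT cid, company FROM myks.mytable WHERE visit_dt = '2014-01-17'");
  stmt.setFetchSize(100);

  for (Row row : session.execute(stmt)) {
      System.out.println(row.getString("cid"));
  }
  cluster.close();
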
On 18 Mar 2014 at 01:39, Philip G g...@gpcentre.net wrote:

 Thanks for the links.

 As I'm messing around with CQL, I'm realizing Cassandra isn't going to do
 what I need. Quite simply, here's a basic layout of my table:

 myTable (
 visit_dt timestamp,
 cid ascii,
 company text,
 // ... other stuff
primary key (visit_dt, cid)
 );
 index on (company)

 My query starts off with visit_dt IN ('2014-01-17'). In Cassandra, I
 essentially get back just 1 wide row (though it shows as many rows within CQL3). I can
 filter that via AND company='my company' due to the index. However, if I
 LIMIT 10, there isn't a way to get the next 10 records, as token() only
 works on the partition key, and each row has the same partition key.

 Or am I missing something? Is there a way I've not discovered to get the
 next 10 on a single wide row?
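
For what it's worth, one manual alternative (separate from the driver-side automatic paging mentioned at the top of this thread) is to page on the clustering column yourself, remembering the last cid returned; a rough sketch with the Java driver, ignoring the company filter and assuming an existing Session named session and the last cid seen in lastCid:

  // Fetch the next "page" of 10 rows within the 2014-01-17 partition,
  // resuming after the last clustering key (cid) seen on the previous page.
  ResultSet next = session.execute(
      "SELECT cid, company FROM myks.mytable" +
      " WHERE visit_dt = '2014-01-17' AND cid > ? LIMIT 10",
      lastCid);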


 ---
 Philip
 g...@gpcentre.net
 http://www.gpcentre.net/


 On Mon, Mar 17, 2014 at 5:12 PM, Tupshin Harper tups...@tupshin.com wrote:

 Read the automatic paging portion of this post :
 http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0
 On Mar 17, 2014 8:09 PM, Philip G g...@gpcentre.net wrote:

 On Mon, Mar 17, 2014 at 4:54 PM, Robert Coli rc...@eventbrite.com wrote:

 The form of your question suggests you are Doing It Wrong, FWIW.



 Okay, let me ask a different question: how do you go about data browsing
 in a CQL3 table? Especially in situations where a single query could return
 a couple thousand records, and we want to limit it to 100 at a time.

 Please feel free to point me in the right direction, if necessary. I
 admit I'm still figuring out Cassandra/CQL, but my knowledge has been
 expanding exponentially on a daily basis. I want to understand this more,
 and possible solutions to the problems I'm running into migrating from an
 RDBMS (MSSQL) to Cassandra. I've figured out a lot of stuff, but have not
 quite resolved this use case.

 Thanks,

 ---
 Philip
 g...@gpcentre.net
 http://www.gpcentre.net/





Re: Cassandra DSC 2.0.5 not starting - * could not access pidfile for Cassandra

2014-03-18 Thread user 01
For others who get stuck with this in the future, here is the fix that worked
for me:

http://askubuntu.com/questions/435749/cant-start-an-application-as-service-but-running-as-standalone-process-simply


On Tue, Mar 11, 2014 at 9:24 PM, Michael Shuler mich...@pbandjelly.org wrote:

 On 03/11/2014 08:47 AM, Ken Hancock wrote:

 See http://www.rudder-project.org/redmine/issues/2941 for the mess that
 has been created regarding java JRE dependencies.


 That's a good example of the cluster.. so many thanks to the Oracle legal
 department for disallowing redistribution..

 FWIW, I maintain my own apt repository for oracle-java7-{jre,jdk,etc}
 packages. If you want to build proper deb packages for Debian/Ubuntu (I'm
 sure there's a similar project for RPM-based distros; there is also some
 sort of Ubuntu PPA that does a similar build, but I don't do random PPAs):

  git clone https://github.com/rraptorr/oracle-java7.git
  cd oracle-java7
  sh ./prepare.sh
  dpkg-buildpackage -uc -us

 and install the packages that you want (just the JRE, or the JDK if you
 are building/testing, etc.)

 Throw 'em in an apt repo for installation on many machines  :)

 --
 Michael



Re: Relation between Atomic Batches and Consistency Level

2014-03-18 Thread Jonathan Lacefield
Okay, your question is clear to me now.

My understanding, after talking this through with some of the engineers
here, is that we have 2 levels of success with batches:

1)  Did the batch make it to the batch log table? [yes or no]
- yes = success
- no = not success
2)  Did each statement in the batch succeed? [yes or no]
  - yes = success
  - no = not success
  - the case you are interested in.

If 1 and 2 are both successful - you will receive a success message
if 1 is successful but 2 is not successful (your case) - you will receive a
message stating the batch succeeded but not all replicas are live yet
  - in this case, the batch will be retried by Cassandra.  This is the
target scenario for atomic batches (to take the burden off of the client
app to monitor, maintain, and retry batches)
  - I am going to test this (was shooting for last night but didn't get
to it) to see what actually happens inside the batch
  - you could test this scenario with a trace to see what occurs (i.e.
if statement 1 fails is statement 2 tried)
if 1 is not successful then the batch fails
 - this is because it couldn't make it to the batchlog table for
execution
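
For what it's worth, here is roughly how I would expect those two levels to surface in a client, sketched with the DataStax Java driver against placeholder tables (the WriteType check is my assumption about how the two cases are told apart, not something verified here; session is an existing Session):

  import com.datastax.driver.core.*;
  import com.datastax.driver.core.exceptions.WriteTimeoutException;

  BatchStatement batch = new BatchStatement();   // LOGGED by default
  batch.add(new SimpleStatement("INSERT INTO ks.t1 (k, v) VALUES ('a', 1)"));
  batch.add(new SimpleStatement("INSERT INTO ks.t2 (k, v) VALUES ('b', 2)"));
  batch.setConsistencyLevel(ConsistencyLevel.QUORUM);
  try {
      session.execute(batch);                    // case: 1 and 2 both succeeded
  } catch (WriteTimeoutException e) {
      if (e.getWriteType() == WriteType.BATCH_LOG) {
          // case: the batch never made it into the batchlog, so it failed outright
      } else {
          // case: the batchlog write succeeded but the statements timed out;
          // Cassandra will replay the batch from the batchlog
      }
  }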

Hope this helps.  I believe this is the best I can do for you at the
moment.

Thanks,

Jonathan Lacefield
Solutions Architect, DataStax
(404) 822 3487
http://www.linkedin.com/in/jlacefield


http://www.datastax.com/what-we-offer/products-services/training/virtual-training


On Mon, Mar 17, 2014 at 4:05 PM, Drew Kutcharian d...@venarc.com wrote:

 I have read that blog post which actually was the source of the initial
 confusion ;)

 If I write normally (no batch) at Quorum, then a hinted write wouldn't
 count as a valid write so the write wouldn't succeed, which means I would
 have to retry. That's a pretty well defined outcome.

 Now if I write a logged batch at Quorum, then, by definition, a hinted
 write shouldn't be considered a valid response, no?

 - Drew


 On Mar 17, 2014, at 11:23 AM, Jonathan Lacefield jlacefi...@datastax.com
 wrote:

 Hello,

   Have you seen this blog post? It's old but still relevant.  I think it
 will answer your questions.
 http://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2.

   I think the answer lies in how Cassandra defines a batch: "In the context
 of a Cassandra batch operation, atomic means that if any of the batch
 succeeds, all of it will."

   My understanding is that in your scenario if either statement succeeded,
 your batch would succeed.  So #1 would get hinted and #2 would be applied,
 assuming no other failure events occur, like the coordinator fails, the
 client fails, etc.

   Hope that helps.

 Thanks,

 Jonathan

 Jonathan Lacefield
 Solutions Architect, DataStax
 (404) 822 3487
 http://www.linkedin.com/in/jlacefield


 http://www.datastax.com/what-we-offer/products-services/training/virtual-training


 On Mon, Mar 17, 2014 at 1:38 PM, Drew Kutcharian d...@venarc.com wrote:

 Hi Jonathan,

 I'm still a bit unclear on this. Say I have two CQL3 tables:
 - user (replication of 3)
 - user_email_index (replication of 3)

 Now I create a new logged batch at quorum consistency level and put two
 inserts in there:
 #1 Insert into the user table with partition key of a timeuuid of the
 user
 #2 Insert into the user_email_index with partition key of user's email
 address

 As you can see, there is a chance that these two insert statements will
 be executed on two different nodes because they are keyed by different
 partition keys. So based on the docs for Logged Batches, a batch will be
 applied eventually in an all or nothing fashion. So my question is,
 what happens if insert #1 fails (say replicas are unavailable), would
 insert #2 get applied? Would the whole thing be rejected and return an
 error to the client?
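
For concreteness, the batch being described would be sent as a single logged batch, something along the lines of the sketch below (the column names are made up, session is a connected driver Session, and the consistency level would be set on the statement as usual):

  session.execute(
      "BEGIN BATCH " +
      "  INSERT INTO user (id, email) VALUES (62c36092-82a1-11e3-a829-a1b2c3d4e501, 'a@b.com'); " +
      "  INSERT INTO user_email_index (email, user_id) VALUES ('a@b.com', 62c36092-82a1-11e3-a829-a1b2c3d4e501); " +
      "APPLY BATCH");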

 PS. I'm aware of the isolation guarantees and that's not an issue. All I
 need to make sure of is that if the first statement fails, the whole
 batch fails.

 Thanks,

 Drew

 On Mar 17, 2014, at 5:33 AM, Jonathan Lacefield jlacefi...@datastax.com
 wrote:

 Hello,

   Consistency is declared at the statement level, i.e. batch level when
 writing, but enforced at each batch row level.  My understanding is that
 each batch (and all of its contents) will be controlled through a specific
 CL declaration.  So batch A could use a CL of QUORUM while batch B could
 use a CL of ONE.

   The detail that may help sort this out for you is that batch statements
 do not provide isolation guarantees:
 www.datastax.com/documentation/cql/3.0/cql/cql_reference/batch_r.html.
  This means that you write the batch as a batch but the reads are per row.
  If you are reading records contained in the batch, you will read results
 of partially updated batches.  Taking this into account for your second
 question, you should expect that your read CL will perform as it would for
 any individual row mutation.

   Hope this helps.

 Jonathan

 Jonathan Lacefield
 Solutions 

Re: Problems with adding datacenter and schema version disagreement

2014-03-18 Thread olek.stas...@gmail.com
OK, I've dropped all system keyspaces, rebuilt the cluster and recovered the
schema; now everything looks OK.
But the main goal of the operation was to add a new datacenter to the cluster.
After starting the node in the new datacenter, two schema versions appear:
one version is held by the 6 nodes of the first datacenter, the second one by
the newly added node in the new datacenter. Something like this:
nodetool status
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  AddressLoad   Tokens  Owns   Host ID
Rack
UN  192.168.1.1  50.19 GB   1   0,5%
c9323f38-d9c4-4a69-96e3-76cd4e1a204e  rack1
UN  192.168.1.2  54.83 GB   1   0,3%
ad1de2a9-2149-4f4a-aec6-5087d9d3acbb  rack1
UN  192.168.1.3  51.14 GB   1   0,6%
0ceef523-93fe-4684-ba4b-4383106fe3d1  rack1
UN  192.168.1.4  54.31 GB   1   0,7%
39d15471-456d-44da-bdc8-221f3c212c78  rack1
UN  192.168.1.5  53.36 GB   1   0,3%
7fed25a5-e018-43df-b234-47c2f118879b  rack1
UN  192.168.1.6  39.89 GB   1   0,1%
9f54fad6-949a-4fa9-80da-87efd62f3260  rack1
Datacenter: DC1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  AddressLoad   Tokens  Owns   Host ID
Rack
UN  192.168.1.7  100.77 KB  256 97,4%
ddb1f913-d075-4840-9665-3ba64eda0558  RAC1

describe cluster;
Cluster Information:
   Name: Metadata Cluster
   Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
8fe34841-4f2a-3c05-97f2-15dd413d71dc: [192.168.1.7]

4ad381b6-df5a-3cbc-ba5a-0234b74d2383: [192.168.1.1, 192.168.1.2,
192.168.1.3, 192.168.1.4, 192.168.1.5, 192.168.1.6]

All keyspaces are now configured to keep data in datacenter1.
I assume that this is not correct behaviour, is it?
Could you help me: how can I safely add the new DC to the cluster?

Regards
Aleksander


2014-03-14 18:28 GMT+01:00 olek.stas...@gmail.com olek.stas...@gmail.com:
 OK, I'll do this during the weekend; I'll give you feedback on Monday.
 Regards
 Aleksander

  On 14 Mar 2014 at 18:15, Robert Coli rc...@eventbrite.com wrote:

 On Fri, Mar 14, 2014 at 12:40 AM, olek.stas...@gmail.com
 olek.stas...@gmail.com wrote:

 OK, I see, so the data files stay in place, I just have to stop
 cassandra on the whole cluster, remove the system schema and then start the
 cluster and recreate all keyspaces with all column families? Data will
 then be loaded automatically from the existing sstables, right?


 Right. If you have clients reading while loading the schema, they may get
 exceptions.


 So one more question: what about the KS system_traces? Should it be
 removed and recreated? What data is it holding?


 It's holding data about tracing, a profiling feature. It's safe to nuke.

 =Rob



Re: Problems with adding datacenter and schema version disagreement

2014-03-18 Thread olek.stas...@gmail.com
Oh, one more question: what should the configuration be for storing the
system_traces keyspace? Should it be replicated or stored locally?
Regards
Olek

2014-03-18 16:47 GMT+01:00 olek.stas...@gmail.com olek.stas...@gmail.com:
 Ok, i've dropped all system keyspaces, rebuild cluster and recover
 schema, now everything looks ok.
 But main goal of operations was to add new datacenter to cluster.
 After starting node in new cluster two schema versions appear, one
 version is held by 6 nodes of first datacenter, second one is in newly
 added node in new datacenter. Sth like this:
 nodetool status
 Datacenter: datacenter1
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  AddressLoad   Tokens  Owns   Host ID
 Rack
 UN  192.168.1.1  50.19 GB   1   0,5%
 c9323f38-d9c4-4a69-96e3-76cd4e1a204e  rack1
 UN  192.168.1.2  54.83 GB   1   0,3%
 ad1de2a9-2149-4f4a-aec6-5087d9d3acbb  rack1
 UN  192.168.1.3  51.14 GB   1   0,6%
 0ceef523-93fe-4684-ba4b-4383106fe3d1  rack1
 UN  192.168.1.4  54.31 GB   1   0,7%
 39d15471-456d-44da-bdc8-221f3c212c78  rack1
 UN  192.168.1.5  53.36 GB   1   0,3%
 7fed25a5-e018-43df-b234-47c2f118879b  rack1
 UN  192.168.1.6  39.89 GB   1   0,1%
 9f54fad6-949a-4fa9-80da-87efd62f3260  rack1
 Datacenter: DC1
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  AddressLoad   Tokens  Owns   Host ID
 Rack
 UN  192.168.1.7  100.77 KB  256 97,4%
 ddb1f913-d075-4840-9665-3ba64eda0558  RAC1

 describe cluster;
 Cluster Information:
Name: Metadata Cluster
Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
 8fe34841-4f2a-3c05-97f2-15dd413d71dc: [192.168.1.7]

 4ad381b6-df5a-3cbc-ba5a-0234b74d2383: [192.168.1.1, 192.168.1.2,
 192.168.1.3, 192.168.1.4, 192.168.1.5, 192.168.1.6]

 All keyspaces are now configured to keep data in datacenter1.
 I assume, that It's not correct behaviour, is it true?
 Could you help me, how can I safely add new DC to the cluster?

 Regards
 Aleksander


 2014-03-14 18:28 GMT+01:00 olek.stas...@gmail.com olek.stas...@gmail.com:
  OK, I'll do this during the weekend; I'll give you feedback on Monday.
 Regards
 Aleksander

  On 14 Mar 2014 at 18:15, Robert Coli rc...@eventbrite.com wrote:

 On Fri, Mar 14, 2014 at 12:40 AM, olek.stas...@gmail.com
 olek.stas...@gmail.com wrote:

  OK, I see, so the data files stay in place, I just have to stop
  cassandra on the whole cluster, remove the system schema and then start the
  cluster and recreate all keyspaces with all column families? Data will
  then be loaded automatically from the existing sstables, right?


 Right. If you have clients reading while loading the schema, they may get
 exceptions.


  So one more question: what about the KS system_traces? Should it be
  removed and recreated? What data is it holding?


 It's holding data about tracing, a profiling feature. It's safe to nuke.

 =Rob



Re: Relation between Atomic Batches and Consistency Level

2014-03-18 Thread Drew Kutcharian
Alright, this is much better. The main thing I’m trying to figure out is whether
there is a way to stop the batch if the first statement fails, or whether there is a
better pattern/construct in Cassandra for handling that scenario.

- Drew

On Mar 18, 2014, at 4:46 AM, Jonathan Lacefield jlacefi...@datastax.com wrote:

 Okay, your question is clear to me now.
 
 My understanding, after talking this through with some of the engineers here, 
 is that we have 2 levels of success with batches:
 
 1)  Did the batch make it to the batch log table? [yes or no]
 - yes = success
 - no = not success
 2)  Did each statement in the batch succeed? [yes or no]
   - yes = success
   - no = not success
   - the case you are interested in.
 
 If 1 and 2 are both successful - you will receive a success message
 if 1 is successful but 2 is not successful (your case) - you will receive a 
 message stating the batch succeeded but not all replicas are live yet
   - in this case, the batch will be retried by Cassandra.  This is the 
 target scenario for atomic batches (to take the burden off of the client app 
 to monitor, maintain, and retry batches)
   - I am going to test this (was shooting for last night but didn't get
 to it) to see what actually happens inside the batch
   - you could test this scenario with a trace to see what occurs (i.e. if 
 statement 1 fails is statement 2 tried)
 if 1 is not successful then the batch fails
  - this is because it couldn't make it to the batchlog table for execution
 
 Hope this helps.  I believe this is the best I can do for you at the moment.
 
 Thanks,
 
 Jonathan Lacefield
 Solutions Architect, DataStax
 (404) 822 3487
 
 
 
 
 
 
 On Mon, Mar 17, 2014 at 4:05 PM, Drew Kutcharian d...@venarc.com wrote:
 I have read that blog post which actually was the source of the initial 
 confusion ;)
 
 If I write normally (no batch) at Quorum, then a hinted write wouldn’t count 
 as a valid write so the write wouldn’t succeed, which means I would have to 
 retry. That’s a pretty well defined outcome.
 
 Now if I write a logged batch at Quorum, then, by definition, a hinted write
 shouldn’t be considered a valid response, no?
 
 - Drew
 
 
 On Mar 17, 2014, at 11:23 AM, Jonathan Lacefield jlacefi...@datastax.com 
 wrote:
 
 Hello,
 
   Have you seen this blog post, it's old but still relevant.  I think it 
 will answer your questions.  
 http://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2.
 
   I think the answer lies in how Cassandra defines a batch: "In the context
  of a Cassandra batch operation, atomic means that if any of the batch
  succeeds, all of it will."
 
   My understanding is that in your scenario if either statement succeeded,
  your batch would succeed.  So #1 would get hinted and #2 would be applied,
 assuming no other failure events occur, like the coordinator fails, the 
 client fails, etc.
 
   Hope that helps.
 
 Thanks,
 
 Jonathan
 
 Jonathan Lacefield
 Solutions Architect, DataStax
 (404) 822 3487
 
 
 
 
 
 
 On Mon, Mar 17, 2014 at 1:38 PM, Drew Kutcharian d...@venarc.com wrote:
 Hi Jonathan,
 
 I’m still a bit unclear on this. Say I have two CQL3 tables:
 - user (replication of 3)
 - user_email_index (replication of 3)
 
 Now I create a new logged batch at quorum consistency level and put two 
 inserts in there:
  #1 Insert into the “user” table with partition key of a timeuuid of the user
  #2 Insert into the “user_email_index” table with partition key of user’s email
  address
 
 As you can see, there is a chance that these two insert statements will be 
 executed on two different nodes because they are keyed by different 
 partition keys. So based on the docs for Logged Batches, a batch will be 
  applied “eventually” in an “all or nothing” fashion. So my question is, what
 happens if insert #1 fails (say replicas are unavailable), would insert #2 
 get applied? Would the whole thing be rejected and return an error to the 
 client? 
 
  PS. I’m aware of the isolation guarantees and that’s not an issue. All I
  need to make sure of is that if the first statement fails, the whole batch
  fails.
 
 Thanks,
 
 Drew
 
 On Mar 17, 2014, at 5:33 AM, Jonathan Lacefield jlacefi...@datastax.com 
 wrote:
 
 Hello,
 
   Consistency is declared at the statement level, i.e. batch level when 
 writing, but enforced at each batch row level.  My understanding is that 
  each batch (and all of its contents) will be controlled through a specific
 CL declaration.  So batch A could use a CL of QUORUM while batch B could 
 use a CL of ONE.   
 
   The detail that may help sort this out for you is that batch statements 
 do not provide isolation guarantees: 
 www.datastax.com/documentation/cql/3.0/cql/cql_reference/batch_r.html.  
 This means that you write the batch as a batch but the reads are per row.  
 If you are reading records contained in the batch, you will read results of 
 partially updated batches.  Taking this into 

Cassandra blob storage

2014-03-18 Thread prem yadav
Hi,
I have been spending some time looking into whether large files (100 MB) can
be stored in Cassandra. As per the Cassandra FAQ:


*Currently Cassandra isn't optimized specifically for large file or BLOB
storage. However, files of around 64Mb and smaller can be easily stored in
the database without splitting them into smaller chunks. This is primarily
due to the fact that Cassandra's public API is based on Thrift, which
offers no streaming abilities; any value written or fetched has to fit in
to memory.*

Does the above statement still hold? Thrift supports framed data transport;
does that change the above statement? If not, why does Cassandra not adopt
Thrift's framed data transfer support?

Thanks


Re: Cassandra blob storage

2014-03-18 Thread Brian O'Neill
You may want to look at:
https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store

-brian

---
Brian O'Neill
Chief Technology Officer


Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42   •
healthmarketscience.com


This information transmitted in this email message is for the intended
recipient only and may contain confidential and/or privileged material. If
you received this email in error and are not the intended recipient, or the
person responsible to deliver it to the intended recipient, please contact
the sender at the email above and delete this email and any attachments and
destroy any copies thereof. Any review, retransmission, dissemination,
copying or other use of, or taking any action in reliance upon, this
information by persons or entities other than the intended recipient is
strictly prohibited.
 


From:  prem yadav ipremya...@gmail.com
Reply-To:  user@cassandra.apache.org
Date:  Tuesday, March 18, 2014 at 1:41 PM
To:  user@cassandra.apache.org
Subject:  Cassandra blob storage

Hi,
I have been spending some time looking into whether large files(100mb) can
be stores in Cassandra. As per Cassandra faq:

Currently Cassandra isn't optimized specifically for large file or BLOB
storage. However, files of around 64Mb and smaller can be easily stored in
the database without splitting them into smaller chunks. This is primarily
due to the fact that Cassandra's public API is based on Thrift, which offers
no streaming abilities; any value written or fetched has to fit in to
memory.

Does the above statement still hold? Thrift supports framed data transport,
does that change the above statement. If not, why does casssandra not adopt
the Thrift framed data transfer support?

Thanks





Re: Cassandra blob storage

2014-03-18 Thread prem yadav
Thanks Brian,
I have seen that. It's more of a workaround and a hack, though of course a
great solution.
But my question is more about why Cassandra itself can't support that, given
that Thrift supports frames.

Thanks.


On Tue, Mar 18, 2014 at 5:55 PM, Brian O'Neill b...@alumni.brown.edu wrote:

 You may want to look at:
 https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store

 -brian

 ---

 Brian O'Neill

 Chief Technology Officer


 *Health Market Science*

 *The Science of Better Results*

 2700 Horizon Drive * King of Prussia, PA * 19406

 M: 215.588.6024 * @boneill42 http://www.twitter.com/boneill42  *

 healthmarketscience.com


 This information transmitted in this email message is for the intended
 recipient only and may contain confidential and/or privileged material. If
 you received this email in error and are not the intended recipient, or the
 person responsible to deliver it to the intended recipient, please contact
 the sender at the email above and delete this email and any attachments and
 destroy any copies thereof. Any review, retransmission, dissemination,
 copying or other use of, or taking any action in reliance upon, this
 information by persons or entities other than the intended recipient is
 strictly prohibited.




 From: prem yadav ipremya...@gmail.com
 Reply-To: user@cassandra.apache.org
 Date: Tuesday, March 18, 2014 at 1:41 PM
 To: user@cassandra.apache.org
 Subject: Cassandra blob storage

 Hi,
 I have been spending some time looking into whether large files(100mb)
 can be stores in Cassandra. As per Cassandra faq:


 *Currently Cassandra isn't optimized specifically for large file or BLOB
 storage. However, files of around 64Mb and smaller can be easily stored in
 the database without splitting them into smaller chunks. This is primarily
 due to the fact that Cassandra's public API is based on Thrift, which
 offers no streaming abilities; any value written or fetched has to fit in
 to memory.*

 Does the above statement still hold? Thrift supports framed data
 transport, does that change the above statement. If not, why does
 casssandra not adopt the Thrift framed data transfer support?

 Thanks




Re: Cassandra blob storage

2014-03-18 Thread Mohit Anchlia
For large-volume big data scenarios we don't recommend using Cassandra as a
blob store, simply because of the intensive IO involved during compaction,
repair, etc. Cassandra is then only well suited for metadata-type storage.
However, if you are fairly low volume then it's a different story, but if
you have low volume, why use Cassandra :)

On Tue, Mar 18, 2014 at 10:55 AM, Brian O'Neill b...@alumni.brown.edu wrote:

 You may want to look at:
 https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store

 -brian

 ---

 Brian O'Neill

 Chief Technology Officer


 *Health Market Science*

 *The Science of Better Results*

 2700 Horizon Drive * King of Prussia, PA * 19406

 M: 215.588.6024 * @boneill42 http://www.twitter.com/boneill42  *

 healthmarketscience.com


 This information transmitted in this email message is for the intended
 recipient only and may contain confidential and/or privileged material. If
 you received this email in error and are not the intended recipient, or the
 person responsible to deliver it to the intended recipient, please contact
 the sender at the email above and delete this email and any attachments and
 destroy any copies thereof. Any review, retransmission, dissemination,
 copying or other use of, or taking any action in reliance upon, this
 information by persons or entities other than the intended recipient is
 strictly prohibited.




 From: prem yadav ipremya...@gmail.com
 Reply-To: user@cassandra.apache.org
 Date: Tuesday, March 18, 2014 at 1:41 PM
 To: user@cassandra.apache.org
 Subject: Cassandra blob storage

 Hi,
 I have been spending some time looking into whether large files(100mb)
 can be stores in Cassandra. As per Cassandra faq:


 *Currently Cassandra isn't optimized specifically for large file or BLOB
 storage. However, files of around 64Mb and smaller can be easily stored in
 the database without splitting them into smaller chunks. This is primarily
 due to the fact that Cassandra's public API is based on Thrift, which
 offers no streaming abilities; any value written or fetched has to fit in
 to memory.*

 Does the above statement still hold? Thrift supports framed data
 transport, does that change the above statement. If not, why does
 casssandra not adopt the Thrift framed data transfer support?

 Thanks




How to extract information from commit log?

2014-03-18 Thread Han,Meng

Hi Cassandra hackers!

I have a question regarding extracting useful information from the commit
log.


Since it's a binary log, how should I extract information such as
timestamps and values from it? Does anyone know of a binary log reader that I
can use directly to read the commit log?
If there is no such reader, could someone give me some advice on how I could
write one?


Particularly, I want to know the order in which write operations happen at
each replica (Cassandra server node), along with their timestamps. Does
anyone know other methods for getting this information without
instrumenting the Cassandra code?


Any help is appreciated!

Cheers,
Meng


Re: Cassandra blob storage

2014-03-18 Thread Vivek Mishra
@Mohit
I'm a bit confused by your reply. For what use cases do you find Cassandra
useful, then?

-Vivek


On Tue, Mar 18, 2014 at 11:41 PM, Mohit Anchlia mohitanch...@gmail.com wrote:

 For large volume big data scenarios we don't recommend using Cassandra as
 a blob storage simply because of intensive IO involved during compaction,
 repair etc. Cassandra store is only well suited for metadata type storage.
 However, if you are fairly low volume then it's a different story, but if
 you have low volume why use Cassandra :)


 On Tue, Mar 18, 2014 at 10:55 AM, Brian O'Neill b...@alumni.brown.edu wrote:

 You may want to look at:
 https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store

 -brian

 ---

 Brian O'Neill

 Chief Technology Officer


 *Health Market Science*

 *The Science of Better Results*

 2700 Horizon Drive * King of Prussia, PA * 19406

 M: 215.588.6024 * @boneill42 http://www.twitter.com/boneill42  *

 healthmarketscience.com


 This information transmitted in this email message is for the intended
 recipient only and may contain confidential and/or privileged material. If
 you received this email in error and are not the intended recipient, or the
 person responsible to deliver it to the intended recipient, please contact
 the sender at the email above and delete this email and any attachments and
 destroy any copies thereof. Any review, retransmission, dissemination,
 copying or other use of, or taking any action in reliance upon, this
 information by persons or entities other than the intended recipient is
 strictly prohibited.




 From: prem yadav ipremya...@gmail.com
 Reply-To: user@cassandra.apache.org
 Date: Tuesday, March 18, 2014 at 1:41 PM
 To: user@cassandra.apache.org
 Subject: Cassandra blob storage

 Hi,
 I have been spending some time looking into whether large files(100mb)
 can be stores in Cassandra. As per Cassandra faq:


 *Currently Cassandra isn't optimized specifically for large file or BLOB
 storage. However, files of around 64Mb and smaller can be easily stored in
 the database without splitting them into smaller chunks. This is primarily
 due to the fact that Cassandra's public API is based on Thrift, which
 offers no streaming abilities; any value written or fetched has to fit in
 to memory. *

 Does the above statement still hold? Thrift supports framed data
 transport, does that change the above statement. If not, why does
 casssandra not adopt the Thrift framed data transfer support?

 Thanks





Re: Cassandra blob storage

2014-03-18 Thread Robert Coli
On Tue, Mar 18, 2014 at 10:41 AM, prem yadav ipremya...@gmail.com wrote:

 I have been spending some time looking into whether large files(100mb)
 can be stores in Cassandra. As per Cassandra faq:


https://code.google.com/p/mogilefs/

Cassandra is not optimized for single values of this size. Leaving aside
Thrift, trying to store 100mb in a single cell is not Cassandra's sweet
spot.
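
If someone does want to keep payloads of that size in Cassandra anyway, the usual workaround is to chunk them client-side, along these lines (a sketch only; the table, chunk size and names are made up):

  import com.datastax.driver.core.Session;
  import java.nio.ByteBuffer;
  import java.util.Arrays;

  // Assumes a table like:
  //   CREATE TABLE blobs.chunks (object_id text, chunk_no int, data blob,
  //                              PRIMARY KEY (object_id, chunk_no));
  static final int CHUNK_SIZE = 1024 * 1024;   // 1 MB per chunk

  void store(Session session, String objectId, byte[] payload) {
      int chunkNo = 0;
      for (int off = 0; off < payload.length; off += CHUNK_SIZE, chunkNo++) {
          byte[] chunk = Arrays.copyOfRange(payload, off,
              Math.min(off + CHUNK_SIZE, payload.length));
          session.execute(
              "INSERT INTO blobs.chunks (object_id, chunk_no, data) VALUES (?, ?, ?)",
              objectId, chunkNo, ByteBuffer.wrap(chunk));
      }
  }

Reading back is just the reverse: select the chunks for the object_id in clustering order and reassemble them.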

=Rob


Re: How to extract information from commit log?

2014-03-18 Thread Jonathan Lacefield
Hello,

  Is this a one-time investigative item or are you looking to set something
up to do this continuously?  I don't recommend trying to read the commit log.

  You can always use the WRITETIME function in CQL or look within SSTables
via the sstable2json utility to see write times for particular versions of
partitions.
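
As a quick illustration of the WRITETIME approach (keyspace, table and column names are placeholders; v is assumed to be an int column and session an existing driver Session):

  // WRITETIME(column) returns the microsecond timestamp of the cell's last write.
  ResultSet rs = session.execute(
      "SELECT v, WRITETIME(v) FROM ks.t WHERE k = 'some-key'");
  for (Row row : rs) {
      System.out.println(row.getInt("v") + " written at " + row.getLong("writetime(v)"));
  }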

Jonathan



Jonathan Lacefield
Solutions Architect, DataStax
(404) 822 3487
http://www.linkedin.com/in/jlacefield


http://www.datastax.com/what-we-offer/products-services/training/virtual-training


On Tue, Mar 18, 2014 at 2:25 PM, Han,Meng meng...@ufl.edu wrote:

 Hi Cassandra hackers!

 I have a question regarding extracting useful information from commit log.

 Since it's a binary log, how should I extract information such as
 timestamps and values from it? Does anyone know of a binary log reader that I
 can use directly to read the commit log?
 If there is no such reader, could someone give me some advice on how I could
 write one?

 Particularly, I want to know the order in which write operations happen at
 each replica (Cassandra server node), along with their timestamps. Does
 anyone know other methods for getting this information without
 instrumenting the Cassandra code?

 Any help is appreciated!

 Cheers,
 Meng



Re: Cassandra data migration from 1.9.8 to 2.0.2

2014-03-18 Thread Robert Coli
On Mon, Mar 17, 2014 at 6:19 PM, Lakshmi Kanth lk.c...@gmail.com wrote:

 Cassandra 1.9.8


No? This version does not exist.


 Cassandra 2.0.2


Cassandra 2.0.x versions up to and including 2.0.5 have serious issues,
including this one, which randomly tombstones your data when you use
sstableloader.

https://issues.apache.org/jira/browse/CASSANDRA-6527 (fixed in 2.0.4)

This is the reason why I do not currently recommend running Cassandra 2.0.x
in production.

https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/

If I were you, with such small cluster sizes, I would consider the
copy-all-data-to-all-nodes-and-run-cleanup method.

http://www.palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra

=Rob


Re: How to extract information from commit log?

2014-03-18 Thread Han,Meng
  

Hi Jonathan,

Thank you for the timely reply. I am doing this experiment on a continuous
basis. To be more specific, I will issue a large number of read and write
operations to a particular key in a short time interval. I'd like to know
the order in which the write operations happen at each replica. Timestamps
definitely help to determine order, but WRITETIME and sstable2json both
look to me like they only return the timestamp of the most recent update to
that key, as of the moment WRITETIME/sstable2json is issued. It looks like a
one-time thing to me. Or, put another way, if I want to get the write time
for all write operations in that short interval, to determine a total order
for writes on that replica, do I have to constantly issue WRITETIME against
this replica? Correct me if I am wrong here.

Light me up please!

On Tue, 18 Mar 2014 15:05:07 -0400, Jonathan Lacefield wrote:

 Hello,

   Is this a one-time investigative item or are you looking to set something
 up to do this continuously?  I don't recommend trying to read the commit log.

   You can always use the WRITETIME function in CQL or look within SSTables
 via the sstable2json utility to see write times for particular versions of
 partitions.

 Jonathan

 Jonathan Lacefield
 Solutions Architect, DataStax
 (404) 822 3487
 http://www.linkedin.com/in/jlacefield
 http://www.datastax.com/what-we-offer/products-services/training/virtual-training

 On Tue, Mar 18, 2014 at 2:25 PM, Han,Meng meng...@ufl.edu wrote:

  Hi Cassandra hackers!

  I have a question regarding extracting useful information from the commit log.

  Since it's a binary log, how should I extract information such as
  timestamps and values from it? Does anyone know of a binary log reader
  that I can use directly to read the commit log? If there is no such
  reader, could someone give me some advice on how I could write one?

  Particularly, I want to know the order in which write operations happen
  at each replica (Cassandra server node), along with their timestamps.
  Does anyone know other methods for getting this information without
  instrumenting the Cassandra code?

  Any help is appreciated!

  Cheers,
  Meng


Re: Cassandra data migration from 1.9.8 to 2.0.2

2014-03-18 Thread Lakshmi Kanth
Thanks Robert.

I mentioned the wrong source Cassandra version.  The actual source version is
1.2.9.


Regards,
LakshmiKanth


On Tue, Mar 18, 2014 at 12:06 PM, Robert Coli rc...@eventbrite.com wrote:

 On Mon, Mar 17, 2014 at 6:19 PM, Lakshmi Kanth lk.c...@gmail.com wrote:

 Cassandra 1.9.8


 No? This version does not exist.


 Cassandra 2.0.2


 Cassandra 2.0.x versions up to and including 2.0.5 have serious issues.
 Including this one, which randomly tombstones your data when you use
 SSTableloader.

 https://issues.apache.org/jira/browse/CASSANDRA-6527 (fixed in 2.0.4)

 This is the reason why I do not currently recommend running Cassandra
 2.0.x in production.

 https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/

 If I were you, with such small cluster sizes, I would consider the
 copy-all-data-to-all-nodes-and-run-cleanup method.

 http://www.palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra

 =Rob




Re: Multi-site Active-Active replication - Preparing Sites - Cluster Name and Snitch

2014-03-18 Thread Matthew Allen
Thanks Jonathan, points taken onboard.

I'll be testing the GossipingPropertyFileSnitch.  I just need to assess what
the impact is on a running active cluster of restarting 1 node at a time.
Given that I'm using SimpleStrategy initially, I'm guessing there will be
no impact.

With regards to the non-application keyspaces,

  RowKey: system_auth
  = (column=durable_writes, value=true, timestamp=1394601216807000)
  = (column=strategy_class,
value=org.apache.cassandra.locator.SimpleStrategy,
timestamp=1394601216807000)
  = (column=strategy_options, value={replication_factor:1},
timestamp=1394601216807000)
  ---
  RowKey: system
  = (column=durable_writes, value=true, timestamp=1394601462264001)
  = (column=strategy_class,
value=org.apache.cassandra.locator.LocalStrategy,
timestamp=1394601462264001)
  = (column=strategy_options, value={}, timestamp=1394601462264001)
  ---
  RowKey: system_traces
  = (column=durable_writes, value=true, timestamp=1394601462327001)
  = (column=strategy_class,
value=org.apache.cassandra.locator.SimpleStrategy,
timestamp=1394601462327001)
  = (column=strategy_options, value={replication_factor:1},
timestamp=1394601462327001)

From what I can gather,
  - system_auth only needs to be replicated if not using the
AllowAllAuthenticator.
 - system_traces replication factor cannot be changed under 1.2.x (
https://issues.apache.org/jira/browse/CASSANDRA-6016)
 - system  keyspace is set to LocalStrategy.  This should not be replicated
as it contains data relevant to the local node.

So I think if I just update the Application Keyspace, I should be okay ?
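
For reference, the keyspace change itself is a single ALTER KEYSPACE per step, roughly as sketched below (the keyspace name is a placeholder, the factors just mirror the RF of 5 mentioned above, and the same statements can be run from cqlsh instead of the Java driver):

  // Step 2 of the plan: switch the app keyspace to NetworkTopologyStrategy,
  // initially with replicas only in the existing DC.
  session.execute(
      "ALTER KEYSPACE my_app_ks WITH replication = " +
      "{'class': 'NetworkTopologyStrategy', 'DC_NSW': 5}");

  // Later, once DC_VIC is up, add replicas there and rebuild/repair.
  session.execute(
      "ALTER KEYSPACE my_app_ks WITH replication = " +
      "{'class': 'NetworkTopologyStrategy', 'DC_NSW': 5, 'DC_VIC': 5}");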

Thanks for your help.

Matt


On Tue, Mar 18, 2014 at 12:34 AM, Jonathan Lacefield 
jlacefi...@datastax.com wrote:

 Hello,

   Please see comments under your

   1) Use GossipingPropertyFileSnitch:
 http://www.datastax.com/documentation/cassandra/1.2/cassandra/architecture/architectureSnitchGossipPF_c.html
 - much easier to manage
   2) All nodes in the same cluster must have the same cluster name:
 http://www.datastax.com/documentation/cassandra/1.2/cassandra/configuration/configCassandra_yaml_r.html
   3)  Run repair at the very end if you would like, rebuild should take
 care of this for you.  No need to do it when you are going from Simple
 (with 1 DC) to Network (with 1 dc).  Not sure you need to do step 2
 actually.
   4)  Yes, all Keyspaces should be updated as a part of this process.

   Hope that helps.

 Jonathan Lacefield
 Solutions Architect, DataStax
 (404) 822 3487
 http://www.linkedin.com/in/jlacefield


 http://www.datastax.com/what-we-offer/products-services/training/virtual-training


 On Sun, Mar 16, 2014 at 10:39 PM, Matthew Allen matthew.j.al...@gmail.com
  wrote:

 Hi all,

 New to this list, so apologies in advance if I inadvertently break
 some of the guidelines.

 We currently have 2 geographically separate Cassandra/Application
 clusters (running in active/warm-standby mode), that I am looking to enable
 replication between so that we can have an active/active configuration.

 I've got the process working in our Labs, using
 http://www.datastax.com/documentation/cassandra/1.2/cassandra/operations/ops_add_dc_to_cluster_t.html
 as a guide, but still have many questions (to verify that what I have done
 is correct), so I'm trying to break down my questions into various emails.

 Our Setup
 ---
 - Our replication factor is currently set to 5 in both sites (NSW and
 VIC).  Each site has 9 nodes.
 - We use a read/write quorum of ONE
 - We have autoNodeDiscovery set to off in our app ( in anticipation of
 multi-site replication), so that it only points to its local Cassandra
 cluster
 - The 2 sites have a 16-20ms latency

 The Plan
 -
 1. Update and restart each node in active Cluster (NSW) 1 at a time to
 get it to use NetworkTopologySnitch in preparation of addition of standby
 cluster.
  - update cassandra-topologies.yaml file with settings as below so NSW
 Cluster is aware of NSW only
  - update cassandra.yaml to use PropertyFileSnitch
  - restart node

   # Cassandra Node IP=Data Center:Rack
 xxx.yy.zzz.144=DC_NSW:rack1
 xxx.yy.zzz.145=DC_NSW:rack1
 xxx.yy.zzz.146=DC_NSW:rack1
 xxx.yy.zzz.147=DC_NSW:rack1
 xxx.yy.zzz.148=DC_NSW:rack1
 ... and so forth for 9 nodes

 2. Update App Keyspace to use NetworkTopologyStrategy with {'DC_NSW':5}

 3. Stop and blow away the standby cluster (VIC) and start afresh,
  - assign new tokens NSW+100
  - set auto_bootstrap: false
  - update seeds to point to mixture of VIC and NSW nodes.
  - update cassandra-topologies.yaml file with below so VIC Cluster is
 aware of VIC and NSW.
  - Leave cassandra cluster down

   # Cassandra Node IP=Data Center:Rack
 xxx.yy.zzz.144=DC_NSW:rack1
 xxx.yy.zzz.145=DC_NSW:rack1
 xxx.yy.zzz.146=DC_NSW:rack1
 xxx.yy.zzz.147=DC_NSW:rack1
 xxx.yy.zzz.148=DC_NSW:rack1
 ... and so forth for 9 nodes

 aaa.bb.ccc.144=DC_VIC:rack1
 

Multi-site Active-Active replication - Maintenance Commands

2014-03-18 Thread Matthew Allen
Hi all,

We currently run a script that runs the following on our separate clusters
each night.  The script operates on 3 nodes (out of 9) a night to minimize
impact to the cluster.

  repair -pr
  compact
  cleanup

These separate clusters are going to start replicating between each other
within a few weeks.

I (may) have (mis)read that when using multiple data centers, a
repair, instead of a repair -pr, should be used.  I'm reluctant to do
this, as when we initially used to run a repair, we would suffer hung
repairs and an increased load on the hosts due to multiple ranges being
repaired at the same time.

So going forward to a multi-dc cluster, we intend to run the following

  -local repair -pr
  compact
  cleanup

Is there a definitive answer for this ?

Thanks

Matt


Setting up Cassandra across private network

2014-03-18 Thread Le Xu
I'm currently using Cassandra 1.23 and I'm trying to set up a cluster on
the private network, so that nodes can communicate through the eth4 inet
address instead of the eth0 inet address.
My current yaml file specifies the eth4 address in the seeds field. However,
the rpc_address is set to 0.0.0.0. Neither listen_address nor
broadcast_address is set.

Then I got error message:

ERROR 22:52:21,936 Exception encountered during startup
java.lang.RuntimeException: No other nodes seen!  Unable to bootstrap.If
you intended to start a single-node cluster, you should make sure your
broadcast_address (or listen_address) is listed as a seed.  Otherwise, you
need to determine why the seed being contacted has no knowledge of the rest
of the cluster.  Usually, this can be solved by giving all nodes the same
seed list.

My guess is the nodes were not able to communicate with each other because
some unspecified field resolved to the eth0 address. Could it be a problem
with the broadcast/listen address? My nodes share a filesystem through NFS,
so if the eth4 address needs to be specified separately on each node then
separate config files will be necessary.


Is it safe to remove (pid file folder) /var/run/cassandra

2014-03-18 Thread user 01
Is it safe to remove the pid file folder /var/run/cassandra if I think it
has got the wrong permissions? Will it be recreated by the process as before?
The folder permissions seem to be incorrect immediately after I installed
dsc20; due to this, sudo service cassandra start fails saying
"could not access pidfile for Cassandra".

So is it safe to delete the pid file folder (/var/run/cassandra) when
Cassandra is not running?


Re: Is it safe to remove (pid file folder) /var/run/cassandra

2014-03-18 Thread user 01
I've verified that the problem is solvable by deleting the /var/run/cassandra
folder on my test server. Should I remove the pid folder from my new
production server?


On Wed, Mar 19, 2014 at 10:54 AM, user 01 user...@gmail.com wrote:

 Is it safe to remove the pid file folder /var/run/cassandra if I think it
 has got the wrong permissions? Will it be recreated by the process as before?
 The folder permissions seem to be incorrect immediately after I installed
 dsc20; due to this, sudo service cassandra start fails saying
 "could not access pidfile for Cassandra".

 So is it safe to delete the pid file folder (/var/run/cassandra) when
 Cassandra is not running?