Re: How to include all dependencies on cassandra driver jar?

2016-06-03 Thread James Carman
You could shade it into another jar.
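For illustration, a minimal pom.xml sketch of that approach: a small module that
depends on the driver and uses the maven-shade-plugin to roll the driver and its
transitive dependencies into one jar. The driver coordinates below are the usual
ones for the 3.0.0 Java driver; the plugin version is only an example. Running
"mvn package" then produces the shaded (uber) jar.

<dependencies>
  <dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>3.0.0</version>
  </dependency>
</dependencies>

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>2.4.3</version>
      <executions>
        <execution>
          <!-- Bind the shade goal to the package phase so the uber jar
               replaces the plain jar as the module's main artifact. -->
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>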

On Fri, Jun 3, 2016 at 9:08 PM Carolina Simoes Gomes <carolina.go...@huawei.com> wrote:

> Hello,
>
>
>
> I am using Cassandra 3.2.1 and the 3.0.0 driver. I need to build the
> driver but with all its dependencies included in the jar (uber jar), using
> maven, such that there are no external dependencies. How should I go about
> doing that?
>
>
>
> Thanks,
>
>
>
> Carolina Gomes, M.Sc.
> Systems Researcher
> Huawei Canada Research Center
> 19 Allstate Parkway, Suite 200
> Markham ON L3R 5A
> carolina.go...@huawei.com
> www.huawei.ca
>


How to include all dependencies on cassandra driver jar?

2016-06-03 Thread Carolina Simoes Gomes
Hello,



I am using Cassandra 3.2.1 and the 3.0.0 driver. I need to build the driver
with all its dependencies included in the jar (an uber jar), using Maven, so
that there are no external dependencies. How should I go about doing that?



Thanks,



Carolina Gomes, M.Sc.
Systems Researcher
Huawei Canada Research Center
19 Allstate Parkway, Suite 200
Markham ON   L3R 5A
carolina.go...@huawei.com
www.huawei.ca


RE: Token Ring Question

2016-06-03 Thread Anubhav Kale
Thank you, I was just curious about how this works.

From: Tyler Hobbs [mailto:ty...@datastax.com]
Sent: Friday, June 3, 2016 3:02 PM
To: user@cassandra.apache.org
Subject: Re: Token Ring Question

There really is only one token ring, but conceptually it's easiest to think of
it like multiple rings, as OpsCenter shows it.  The only difference is that
every token has to be unique across the whole cluster.

> Now, if the token for a particular write falls in the “primary range” of a node
> living in DC2, does the code check for such conditions and instead put it on
> some node in DC1?

Yes.  It will continue searching around the token ring until it hits a token
that belongs to a node in the correct datacenter.

> What is the true meaning of “primary” token range in such scenarios?

There's not really any such thing as a "primary token range", it's just a
convenient idea for some tools.  In reality, it's just the replica that owns
the first (clockwise) token.  I'm not sure what you're really asking, though --
what are you concerned about?


On Wed, Jun 1, 2016 at 2:40 PM, Anubhav Kale wrote:
Hello,

I recently learnt that regardless of number of Data Centers, there is really 
only one token ring across all nodes. (I was under the impression that there is 
one per DC like how Datastax Ops Center would show it).

Suppose we have 4 v-nodes, and 2 DCs (2 nodes in each DC) and a key space is 
set to replicate in only one DC – say DC1.

Now, if the token for a particular write falls in the “primary range” of a node
living in DC2, does the code check for such conditions and instead put it on
some node in DC1? What is the true meaning of “primary” token range in such
scenarios?

Is this how things work, roughly speaking, or am I missing something?

Thanks!



--
Tyler Hobbs
DataStax


Re: Token Ring Question

2016-06-03 Thread Tyler Hobbs
There really is only one token ring, but conceptually it's easiest to think
of it like multiple rings, as OpsCenter shows it.  The only difference is
that every token has to be unique across the whole cluster.

> Now, if the token for a particular write falls in the “primary range” of a
> node living in DC2, does the code check for such conditions and instead put
> it on some node in DC1?
>

Yes.  It will continue searching around the token ring until it hits a
token that belongs to a node in the correct datacenter.

> What is the true meaning of “primary” token range in such scenarios?
>

There's not really any such thing as a "primary token range", it's just a
convenient idea for some tools.  In reality, it's just the replica that
owns the first (clockwise) token.  I'm not sure what you're really asking,
though -- what are you concerned about?
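(For reference, the single-DC replication scenario discussed above is expressed
through the keyspace's replication options. A minimal sketch with the 3.0 Java
driver follows; the contact point, keyspace name, and replication factor are
invented for illustration, and the DC name must match what your snitch reports.)

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class SingleDcKeyspaceSketch {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            // Replicas are placed only in DC1; DC2 holds none for this keyspace,
            // so a write whose token falls "between" DC2-owned tokens is still
            // stored on the next DC1 owners found walking the ring clockwise.
            session.execute("CREATE KEYSPACE IF NOT EXISTS demo_ks WITH replication = "
                    + "{'class': 'NetworkTopologyStrategy', 'DC1': 2}");
        }
    }
}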


On Wed, Jun 1, 2016 at 2:40 PM, Anubhav Kale wrote:

> Hello,
>
>
>
> I recently learnt that regardless of number of Data Centers, there is
> really only one token ring across all nodes. (I was under the impression
> that there is one per DC like how Datastax Ops Center would show it).
>
>
>
> Suppose we have 4 v-nodes, and 2 DCs (2 nodes in each DC) and a key space
> is set to replicate in only one DC – say DC1.
>
>
>
> Now, if the token for a particular write falls in the “primary range” of a
> node living in DC2, does the code check for such conditions and instead put
> it on some node in DC1? What is the true meaning of “primary” token range
> in such scenarios?
>
>
>
> Is this how things work, roughly speaking, or am I missing something?
>
>
>
> Thanks!
>



-- 
Tyler Hobbs
DataStax 


Re: Blob or columns

2016-06-03 Thread Tyler Hobbs
On Fri, Jun 3, 2016 at 10:43 AM, Abhinav Solan wrote:

> Should we store these inconsequential data as blob or JSON in one column
> or create separate columns for them, which one should be the preferred way
> here?


A blob will be more compact and require less server and driver resources
for serialization and deserialization.  Since you don't need to update
anything in the blob individually, I recommend going with that.
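
To make that concrete, here is a small sketch with the 3.0 Java driver; the
keyspace, table, and column names are invented for illustration. The columns
that are actually queried get their own columns, and everything else is packed
into a single blob that is only unpacked when the row is archived.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Date;

public class BlobColumnSketch {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("demo_ks")) {
            // Queried fields are real columns; everything else goes into "extras".
            session.execute("CREATE TABLE IF NOT EXISTS readings ("
                    + "source text, metric text, ts timestamp, "
                    + "value double, status int, extras blob, "
                    + "PRIMARY KEY ((source, metric), ts))");

            PreparedStatement insert = session.prepare(
                    "INSERT INTO readings (source, metric, ts, value, status, extras) "
                    + "VALUES (?, ?, ?, ?, ?, ?)");

            // The blob binds as a ByteBuffer; how it is serialized (JSON bytes,
            // protobuf, etc.) is entirely up to the application.
            ByteBuffer extras = ByteBuffer.wrap(
                    "{\"unit\":\"C\"}".getBytes(StandardCharsets.UTF_8));
            session.execute(insert.bind("sensor-1", "temp", new Date(), 21.5, 0, extras));
        }
    }
}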


-- 
Tyler Hobbs
DataStax 


ANNOUNCE: Hecate 3.0.0.Beta1 Available...

2016-06-03 Thread James Carman
Fellow Cassandra Users,

We have been using a library we call "Hecate" to do Cassandra ORM-type
mapping for our clients for quite some time with tremendous success. We
have recently released a 3.0.0.Beta1 version for folks to try out. You can
find the source here:

https://github.com/savoirtech/hecate

with specific documentation about the POJO library here:

https://github.com/savoirtech/hecate/blob/hecate-3.0.x/pojo/README.md

The library is available in Maven Central:

<dependency>
  <groupId>com.savoirtech.hecate</groupId>
  <artifactId>hecate-pojo</artifactId>
  <version>3.0.0.Beta1</version>
</dependency>

We hope you enjoy using Hecate as much as we have and we welcome the
feedback.

Thanks,

James Carman


Blob or columns

2016-06-03 Thread Abhinav Solan
Hi Everyone,

We have a unique situation at my workplace around how we store data.
We are using Cassandra as a write-through cache: we keep real-time data in
Cassandra for around 10 - 20 days, and the rest we archive to another data
store as archived data.
The data we are going to store has around 20 columns, of which 3 are used in
the primary key and 2 more are read by the systems that query Cassandra. The
remaining columns are not used directly; their only use is to re-construct the
data to be archived in our archive store, which is accessed by our legacy
applications.
The question here is -
Should we store this inconsequential data as a blob or as JSON in one column,
or create separate columns for it? Which one should be the preferred way
here?
We are currently using Cassandra 3.x.

Thanks,
Abhinav