Re: Question about opscenter Lifecycle Manager:

2016-08-04 Thread Manikandan Srinivasan
Hi Yuan,

Hope the hangout helped you. Feel free to reach out to me if you have any
further questions.

Regards
Mani

On Thu, Aug 4, 2016 at 3:41 PM, Yuan Fang  wrote:

> Hi Mani,
>
> Thanks so much Mani!
>
> I did enable OpsCenter Auth with a username and password.
> I previously used Chrome.
>
> After I changed to Safari, it seems better; at least, it did not fail
> immediately.
> After a couple of seconds, it shows:
>
> Lifecycle Manager was unable to import Cassandra-Datastax-5 into Lifecycle
> Manager. For details about the error, see the Job Details.
>
>
> The job details show: job started.
>
>
> Yuan
>
>
>
>
>
> On Thu, Aug 4, 2016 at 3:33 PM, Manikandan Srinivasan <
> msriniva...@datastax.com> wrote:
>
>> Hi Yuan
>> Do you have OpsCenter Auth enabled? If yes, we are aware of this and have
>> fixed it internally. Please use Safari browser and this should work. On a
>> side note, I will DM you with more details.
>> Regards
>> Mani
>>
>> On Aug 4, 2016 3:26 PM, "Yuan Fang"  wrote:
>>
>>> Hello everyone,
>>>
>>> This is a problem that wasted many days of my time, more than the
>>> convenience OpsCenter has ever brought us.
>>>
>>> The problem is as follows:
>>>
>>> I created a cluster and manually added it to OpsCenter.
>>> I tried to use Lifecycle Manager to manage the cluster.
>>>
>>> From the Automatic Cluster Import page, I chose "SSH Credentials:".
>>>
>>> But it shows:
>>>
>>> "Lifecycle Manager was unable to import Cassandra-Datastax-5 due to the
>>> error shown below. For details about the error, see the OpsCenter logs.
>>> Unauthorized: You must be logged in for this operation".
>>>
>>> The credential has Login user: ubuntu
>>> SSH Login: private key
>>> Privileges Escalation: SUDO
>>>
>>> Does anyone know why?
>>> Thanks a lot!
>>>
>>> Yuan
>>>
>>>
>


-- 
Regards,

Manikandan Srinivasan

Director, Product Management| +1.408.887.3686 |
manikandan.sriniva...@datastax.com





Re: Question about opscenter Lifecycle Manager:

2016-08-04 Thread Yuan Fang
Hi Mani,

Thanks so much Mani!

I did enable OpsCenter Auth with a username and password.
I previously used Chrome.

After I changed to Safari, it seems better; at least, it did not fail
immediately.
After a couple of seconds, it shows:

Lifecycle Manager was unable to import Cassandra-Datastax-5 into Lifecycle
Manager. For details about the error, see the Job Details.


The job details show: job started.


Yuan





On Thu, Aug 4, 2016 at 3:33 PM, Manikandan Srinivasan <
msriniva...@datastax.com> wrote:

> Hi Yuan
> Do you have OpsCenter Auth enabled? If yes, we are aware of this and have
> fixed it internally. Please use Safari browser and this should work. On a
> side note, I will DM you with more details.
> Regards
> Mani
>
> On Aug 4, 2016 3:26 PM, "Yuan Fang"  wrote:
>
>> Hello everyone,
>>
>> This is a problem that wasted many days of my time, more than the
>> convenience OpsCenter has ever brought us.
>>
>> The problem is as follows:
>>
>> I created a cluster and manually added it to OpsCenter.
>> I tried to use Lifecycle Manager to manage the cluster.
>>
>> From the Automatic Cluster Import page, I chose "SSH Credentials:".
>>
>> But it shows:
>>
>> "Lifecycle Manager was unable to import Cassandra-Datastax-5 due to the
>> error shown below. For details about the error, see the OpsCenter logs.
>> Unauthorized: You must be logged in for this operation".
>>
>> The credential has Login user: ubuntu
>> SSH Login: private key
>> Privileges Escalation: SUDO
>>
>> Does anyone know why?
>> Thanks a lot!
>>
>> Yuan
>>
>>


Re: Question about opscenter Lifecycle Manager:

2016-08-04 Thread Manikandan Srinivasan
Hi Yuan
Do you have OpsCenter Auth enabled? If yes, we are aware of this and have
fixed it internally. Please use Safari browser and this should work. On a
side note, I will DM you with more details.
Regards
Mani

On Aug 4, 2016 3:26 PM, "Yuan Fang"  wrote:

> Hello everyone,
>
> This is a problem that wasted many days of my time, more than the
> convenience OpsCenter has ever brought us.
>
> The problem is as follows:
>
> I created a cluster and manually added it to OpsCenter.
> I tried to use Lifecycle Manager to manage the cluster.
>
> From the Automatic Cluster Import page, I chose "SSH Credentials:".
>
> But it shows:
>
> "Lifecycle Manager was unable to import Cassandra-Datastax-5 due to the
> error shown below. For details about the error, see the OpsCenter logs.
> Unauthorized: You must be logged in for this operation".
>
> The credential has Login user: ubuntu
> SSH Login: private key
> Privileges Escalation: SUDO
>
> Does anyone know why?
> Thanks a lot!
>
> Yuan
>
>


Re: Merging cells in compaction / compression?

2016-08-04 Thread DuyHai Doan
Looks like you're asking for some sort of ETL on your C* data. Why not use
Spark to compress those data into blobs and a User-Defined Function to
explode them when reading?
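
As a rough, untested sketch of the read side of that idea: a CQL user-defined
function (available in C* 2.2 and later, and only if enable_user_defined_functions
is turned on in cassandra.yaml) could unpack a blob of fixed-width
(timestamp, value) records back into a map. The 16-byte record layout and the
function and column names here are assumptions, not anything from this thread:

CREATE OR REPLACE FUNCTION explode_points(points blob)
    RETURNS NULL ON NULL INPUT
    RETURNS map<timestamp, double>
    LANGUAGE java
    AS '
        // assumes each record is 8 bytes of epoch-millis followed by an 8-byte double
        java.util.Map<java.util.Date, Double> out = new java.util.LinkedHashMap<java.util.Date, Double>();
        java.nio.ByteBuffer buf = points.duplicate();
        while (buf.remaining() >= 16) {
            out.put(new java.util.Date(buf.getLong()), buf.getDouble());
        }
        return out;
    ';

-- hypothetical usage, assuming a table with a blob column named "points":
-- SELECT metric, minute, explode_points(points) FROM data_by_minute WHERE metric = 'myMetric';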

On Thu, Aug 4, 2016 at 10:08 PM, Michael Burman  wrote:

> Hi,
>
> No, I don't want to lose precision (if that's what you meant), but it would
> be fine to store them in a larger bucket (which I could decompress either on
> the client side or the server side). To clarify, it could be like:
>
> 04082016T230215.1234, value
> 04082016T230225.4321, value
> 04082016T230235.2563, value
> 04082016T230245.1145, value
> 04082016T230255.0204, value
>
> ->
>
> 04082016T230200 -> blob (that has all the points for this minute stored -
> no data is lost to aggregated avgs or sums or anything).
>
> That's acceptable. Of course, the prettiest solution would be to keep this
> hidden from the client so it would see the original rows after decompression
> (like with byte[] compressors), but this is acceptable for my use case. If
> this is what you meant, then yes.
>
>   -  Micke
>
> - Original Message -
> From: "Eric Stevens" 
> To: user@cassandra.apache.org
> Sent: Thursday, August 4, 2016 10:26:30 PM
> Subject: Re: Merging cells in compaction / compression?
>
> When you say merge cells, do you mean re-aggregating the data into coarser
> time buckets?
>
> On Thu, Aug 4, 2016 at 5:59 AM Michael Burman  wrote:
>
> > Hi,
> >
> > Considering the following example structure:
> >
> > CREATE TABLE data (
> > metric text,
> > value double,
> > time timestamp,
> > PRIMARY KEY((metric), time)
> > ) WITH CLUSTERING ORDER BY (time DESC)
> >
> > The natural insertion order is metric, value, timestamp pairs, one
> > metric/value pair per second for example. That means creating more and
> more
> > cells to the same partition, which creates a large amount of overhead and
> > reduces the compression ratio of LZ4 & Deflate (LZ4 reaches ~0.26 and
> > Deflate ~0.10 ratios in some of the examples I've run). Now, to improve
> > compression ratio, how could I merge the cells on the actual Cassandra
> > node? I looked at ICompressor and it provides only byte-level compression.
> >
> > Could I do this on the compaction phase, by extending the
> > DateTieredCompaction for example? It has SSTableReader/Writer facilities
> > and it seems to be able to see the rows? I'm fine with the fact that
> repair
> > run might have to do some conflict resolution as the final merged rows
> > would be quite "small" (50kB) in size. The naive approach is of course to
> > fetch all the rows from Cassandra - merge them on the client and send
> back
> > to the Cassandra, but this seems very wasteful and has its own problems.
> > Compared to table-LZ4 I was able to reduce the required size to 1/20th
> > (context-aware compression is sometimes just so much better) so there are
> > real benefits to this approach, even if I would probably violate multiple
> > design decisions.
> >
> > One approach is of course to write to another storage first and once the
> > blocks are ready, write them to Cassandra. But that again seems idiotic
> (I
> > know some people are using Kafka in front of Cassandra for example, but
> > that means maintaining yet another distributed solution and defeats the
> > benefit of Cassandra's easy management & scalability).
> >
> > Has anyone done something similar? Even planned? If I need to extend
> > something in Cassandra I can accept that approach also - but as I'm not
> > that familiar with Cassandra source code I could use some hints.
> >
> >   - Micke
> >
>


Re: Merging cells in compaction / compression?

2016-08-04 Thread Michael Burman
Hi,

No, I don't want to lose precision (if that's what you meant), but it would be
fine to store them in a larger bucket (which I could decompress either on the
client side or the server side). To clarify, it could be like:

04082016T230215.1234, value
04082016T230225.4321, value
04082016T230235.2563, value
04082016T230245.1145, value
04082016T230255.0204, value

-> 

04082016T230200 -> blob (that has all the points for this minute stored - no 
data is lost to aggregated avgs or sums or anything).

That's acceptable. Of course, the prettiest solution would be to keep this
hidden from the client so it would see the original rows after decompression
(like with byte[] compressors), but this is acceptable for my use case. If this
is what you meant, then yes.
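
For illustration only, one way that per-minute layout could be modelled in CQL.
The table and column names below are invented for this sketch, and the blob
encoding is entirely up to the client:

CREATE TABLE data_by_minute (
    metric text,
    minute timestamp,   -- truncated to the minute, e.g. 04082016T230200
    points blob,        -- all raw points for that minute, packed client-side
    PRIMARY KEY ((metric), minute)
) WITH CLUSTERING ORDER BY (minute DESC);

-- one row per metric per minute instead of ~60 cells per metric per minute:
-- INSERT INTO data_by_minute (metric, minute, points) VALUES ('myMetric', '2016-08-04 23:02:00', 0x...);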

  -  Micke

- Original Message -
From: "Eric Stevens" 
To: user@cassandra.apache.org
Sent: Thursday, August 4, 2016 10:26:30 PM
Subject: Re: Merging cells in compaction / compression?

When you say merge cells, do you mean re-aggregating the data into coarser
time buckets?

On Thu, Aug 4, 2016 at 5:59 AM Michael Burman  wrote:

> Hi,
>
> Considering the following example structure:
>
> CREATE TABLE data (
> metric text,
> value double,
> time timestamp,
> PRIMARY KEY((metric), time)
> ) WITH CLUSTERING ORDER BY (time DESC)
>
> The natural insertion order is metric, value, timestamp pairs, one
> metric/value pair per second for example. That means creating more and more
> cells to the same partition, which creates a large amount of overhead and
> reduces the compression ratio of LZ4 & Deflate (LZ4 reaches ~0.26 and
> Deflate ~0.10 ratios in some of the examples I've run). Now, to improve
> compression ratio, how could I merge the cells on the actual Cassandra
> node? I looked at ICompressor and it provides only byte-level compression.
>
> Could I do this on the compaction phase, by extending the
> DateTieredCompaction for example? It has SSTableReader/Writer facilities
> and it seems to be able to see the rows? I'm fine with the fact that repair
> run might have to do some conflict resolution as the final merged rows
> would be quite "small" (50kB) in size. The naive approach is of course to
> fetch all the rows from Cassandra - merge them on the client and send back
> to the Cassandra, but this seems very wasteful and has its own problems.
> Compared to table-LZ4 I was able to reduce the required size to 1/20th
> (context-aware compression is sometimes just so much better) so there are
> real benefits to this approach, even if I would probably violate multiple
> design decisions.
>
> One approach is of course to write to another storage first and once the
> blocks are ready, write them to Cassandra. But that again seems idiotic (I
> know some people are using Kafka in front of Cassandra for example, but
> that means maintaining yet another distributed solution and defeats the
> benefit of Cassandra's easy management & scalability).
>
> Has anyone done something similar? Even planned? If I need to extend
> something in Cassandra I can accept that approach also - but as I'm not
> that familiar with Cassandra source code I could use some hints.
>
>   - Micke
>


Re: Merging cells in compaction / compression?

2016-08-04 Thread Eric Stevens
When you say merge cells, do you mean re-aggregating the data into coarser
time buckets?

On Thu, Aug 4, 2016 at 5:59 AM Michael Burman  wrote:

> Hi,
>
> Considering the following example structure:
>
> CREATE TABLE data (
> metric text,
> value double,
> time timestamp,
> PRIMARY KEY((metric), time)
> ) WITH CLUSTERING ORDER BY (time DESC)
>
> The natural insertion order is metric, value, timestamp pairs, one
> metric/value pair per second for example. That means creating more and more
> cells to the same partition, which creates a large amount of overhead and
> reduces the compression ratio of LZ4 & Deflate (LZ4 reaches ~0.26 and
> Deflate ~0.10 ratios in some of the examples I've run). Now, to improve
> compression ratio, how could I merge the cells on the actual Cassandra
> node? I looked at ICompressor and it provides only byte-level compression.
>
> Could I do this on the compaction phase, by extending the
> DateTieredCompaction for example? It has SSTableReader/Writer facilities
> and it seems to be able to see the rows? I'm fine with the fact that repair
> run might have to do some conflict resolution as the final merged rows
> would be quite "small" (50kB) in size. The naive approach is of course to
> fetch all the rows from Cassandra - merge them on the client and send back
> to the Cassandra, but this seems very wasteful and has its own problems.
> Compared to table-LZ4 I was able to reduce the required size to 1/20th
> (context-aware compression is sometimes just so much better) so there are
> real benefits to this approach, even if I would probably violate multiple
> design decisions.
>
> One approach is of course to write to another storage first and once the
> blocks are ready, write them to Cassandra. But that again seems idiotic (I
> know some people are using Kafka in front of Cassandra for example, but
> that means maintaining yet another distributed solution and defeats the
> benefit of Cassandra's easy management & scalability).
>
> Has anyone done something similar? Even planned? If I need to extend
> something in Cassandra I can accept that approach also - but as I'm not
> that familiar with Cassandra source code I could use some hints.
>
>   - Micke
>


Re: [Marketing Mail] Re: Memory leak and lockup on our 2.2.7 Cassandra cluster.

2016-08-04 Thread Jonathan Haddad
In the future you may find SASI indexes useful for indexing Cassandra data.

Shameless blog post plug:
http://rustyrazorblade.com/2016/02/cassandra-secondary-index-preview-1/
Deep technical dive: http://www.doanduyhai.com/blog/?p=2058
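
For anyone curious what that looks like, a minimal sketch (note that SASI ships
with Cassandra 3.4 and later, so it would require an upgrade from the 2.2.7
cluster discussed here; the table and column names are invented for the example):

CREATE TABLE posts (
    id uuid PRIMARY KEY,
    author text,
    body text
);

-- secondary index backed by SASI instead of the legacy 2i implementation
CREATE CUSTOM INDEX posts_author_idx ON posts (author)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = { 'mode': 'PREFIX' };

-- prefix queries then work without ALLOW FILTERING:
SELECT * FROM posts WHERE author LIKE 'kev%';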

On Thu, Aug 4, 2016 at 11:45 AM Kevin Burton  wrote:

> BTW, we think we tracked this down to using large partitions to implement
> inverted indexes. C* just doesn't do a reasonable job at all with large
> partitions, so we're going to migrate this use case to Elasticsearch.
>
> On Wed, Aug 3, 2016 at 1:54 PM, Ben Slater 
> wrote:
>
>> Yep,  that was what I was referring to.
>>
>>
>> On Thu, 4 Aug 2016 2:24 am Reynald Bourtembourg <
>> reynald.bourtembo...@esrf.fr> wrote:
>>
>>> Hi,
>>>
>>> Maybe Ben was referring to this issue which has been mentioned recently
>>> on this mailing list:
>>> https://issues.apache.org/jira/browse/CASSANDRA-11887
>>>
>>> Cheers,
>>> Reynald
>>>
>>>
>>> On 03/08/2016 18:09, Romain Hardouin wrote:
>>>
>>> > Curious why the 2.2 to 3.x upgrade path is risky at best.
>>> I guess that upgrade from 2.2 is less tested by DataStax QA because DSE4
>>> used C* 2.1, not 2.2.
>>> I would say the safest upgrade is 2.1 to 3.0.x
>>>
>>> Best,
>>>
>>> Romain
>>>
>>>
>>> --
>> 
>> Ben Slater
>> Chief Product Officer
>> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
>> +61 437 929 798
>>
>
>
>
> --
>
> We’re hiring if you know of any awesome Java Devops or Linux Operations
> Engineers!
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> 
>
>


Re: [Marketing Mail] Re: Memory leak and lockup on our 2.2.7 Cassandra cluster.

2016-08-04 Thread Kevin Burton
BTW, we think we tracked this down to using large partitions to implement
inverted indexes. C* just doesn't do a reasonable job at all with large
partitions, so we're going to migrate this use case to Elasticsearch.

On Wed, Aug 3, 2016 at 1:54 PM, Ben Slater 
wrote:

> Yep,  that was what I was referring to.
>
>
> On Thu, 4 Aug 2016 2:24 am Reynald Bourtembourg <
> reynald.bourtembo...@esrf.fr> wrote:
>
>> Hi,
>>
>> Maybe Ben was referring to this issue which has been mentioned recently
>> on this mailing list:
>> https://issues.apache.org/jira/browse/CASSANDRA-11887
>>
>> Cheers,
>> Reynald
>>
>>
>> On 03/08/2016 18:09, Romain Hardouin wrote:
>>
>> > Curious why the 2.2 to 3.x upgrade path is risky at best.
>> I guess that upgrade from 2.2 is less tested by DataStax QA because DSE4
>> used C* 2.1, not 2.2.
>> I would say the safest upgrade is 2.1 to 3.0.x
>>
>> Best,
>>
>> Romain
>>
>>
>> --
> 
> Ben Slater
> Chief Product Officer
> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
> +61 437 929 798
>



-- 

We’re hiring if you know of any awesome Java Devops or Linux Operations
Engineers!

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile



Merging cells in compaction / compression?

2016-08-04 Thread Michael Burman
Hi,

Considering the following example structure:

CREATE TABLE data (
metric text,
value double,
time timestamp,
PRIMARY KEY((metric), time)
) WITH CLUSTERING ORDER BY (time DESC)

The natural insertion order is metric, value, timestamp pairs, one metric/value
pair per second for example. That means creating more and more cells to the 
same partition, which creates a large amount of overhead and reduces the 
compression ratio of LZ4 & Deflate (LZ4 reaches ~0.26 and Deflate ~0.10 ratios 
in some of the examples I've run). Now, to improve compression ratio, how could 
I merge the cells on the actual Cassandra node? I looked at ICompressor and it
provides only byte-level compression.

Could I do this on the compaction phase, by extending the DateTieredCompaction 
for example? It has SSTableReader/Writer facilities and it seems to be able to 
see the rows? I'm fine with the fact that repair run might have to do some 
conflict resolution as the final merged rows would be quite "small" (50kB) in 
size. The naive approach is of course to fetch all the rows from Cassandra - 
merge them on the client and send back to the Cassandra, but this seems very 
wasteful and has its own problems. Compared to table-LZ4 I was able to reduce 
the required size to 1/20th (context-aware compression is sometimes just so 
much better) so there are real benefits to this approach, even if I would 
probably violate multiple design decisions. 

One approach is of course to write to another storage first and once the blocks 
are ready, write them to Cassandra. But that again seems idiotic (I know some 
people are using Kafka in front of Cassandra for example, but that means 
maintaining yet another distributed solution and defeats the benefit of 
Cassandra's easy management & scalability).

Has anyone done something similar? Even planned? If I need to extend something 
in Cassandra I can accept that approach also - but as I'm not that familiar 
with Cassandra source code I could use some hints. 

  - Micke