Re: Why does Cassandra recommend Oracle JVM instead of OpenJDK?

2017-01-03 Thread Kai Wang
Back in the day, the Oracle JVM was recommended because OpenJDK had some
stability and performance issues. But in 2015, or maybe 2014, I heard in a
presentation (I don't remember by whom) that OpenJDK is pretty much on par with
the Oracle JVM for C*.

But I guess the documentation was never properly updated.

On Tue, Jan 3, 2017 at 2:50 AM, Kant Kodali  wrote:

> The fact that Oracle would even come up with something like "Oracle's
> position was that Google should have to license code from them" is just
> messed up. And these kinds of business practices are exactly the reason to
> stay away. Of course every company is there to make money, but look at
> Google or FB and see how much open source contribution they have
> done. Oracle doesn't come anywhere close to that.
>
> On Mon, Jan 2, 2017 at 8:08 PM, Edward Capriolo 
> wrote:
>
>>
>>
>> On Mon, Jan 2, 2017 at 8:30 PM, Kant Kodali  wrote:
>>
>>> This is a subjective question and of course it would turn into
>>> opinionated answers, and I think we should welcome that (nothing wrong in
>>> debating a topic). We have many such debates as SEs, such as programming
>>> language comparisons, architectural debates, framework/library debates and
>>> so on. People who don't like this conversation can simply refrain from
>>> following this thread, right? I don't know why they choose to jump in if
>>> they don't like a topic.
>>>
>>> Sun is a great company no doubt! I don't know if Oracle is. Things like
>>> this https://www.extremetech.com/mobile/220136-google-plans-to-remove-oracles-java-apis-from-android-n
>>> are what piss me off about Oracle and give the impression that they are not
>>> up for open source. It would be awesome to see the JVM running on more and
>>> more devices (not fewer), so Google taking Oracle's Java APIs out of Android
>>> is a big failure for Oracle.
>>>
>>> The JVM is a great piece of software and by far there isn't anything yet
>>> that comes close. And there were great people who worked at Sun at that time.
>>> Open the JDK source code and read it; you will encounter some great
>>> ideas and algorithms.
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Jan 2, 2017 at 1:04 PM, Edward Capriolo 
>>> wrote:
>>>

 On Mon, Jan 2, 2017 at 3:51 PM, Benjamin Roth 
 wrote:

> Does this discussion really make sense any more? To me it seems it
> turned opinionated and religious. From my point of view anything that has
> to be said was said.
>
> Am 02.01.2017 21:27 schrieb "Edward Capriolo" :
>
>>
>>
>> On Mon, Jan 2, 2017 at 11:56 AM, Eric Evans <
>> john.eric.ev...@gmail.com> wrote:
>>
>>> On Fri, Dec 23, 2016 at 9:15 PM, Edward Capriolo <
>>> edlinuxg...@gmail.com> wrote:
>>> > "I don't really have any opinions on Oracle per say, but Cassandra
>>> is a
>>> > Free Software project and I would prefer that we not depend on
>>> > commercial software, (and that's kind of what we have here, an
>>> > implicit dependency)."
>>> >
>>> > We are a bit loose here with terms "free" and "commercial". The
>>> oracle JVM
>>> > is open source, it is free to use and the trademark is owned by a
>>> company.
>>>
>>> Are we?  There are many definitions for the word "free", only one of
>>> which means "without cost"; I assumed it was obvious that I was
>>> talking about licensing terms (and of course the implications of that
>>> licensing).
>>>
>>> Cassandra is Free Software by virtue of the fact that it is Apache
>>> Licensed.  You are Free (as in Freedom) to modify and redistribute
>>> it.
>>>
>>> The Oracle JVM ships with a commercial license.  It is free only in
>>> the sense that you are not required to pay anything to use it, (i.e.
>>> you are not Free to do much of anything other than use it to run Java
>>> software).
>>>
>>> > That is not much different than using a tool for Cassandra, like a
>>> > driver hosted on GitHub but made by a company.
>>>
>>> It is very different IME.  Cassandra requires a JVM to function, this
>>> is a hard dependency.  A driver is merely a means to make use of it.
>>>
>>> > The thing about a JVM is that, like a kernel, you want really smart
>>> > dedicated people working on it. Oracle has moved the JVM forward since
>>> > taking over Sun. You cannot just manage a JVM like, say, the FreeBSD port
>>> > of x maintained by 3 part-time dudes that all get paid to do something else.
>>>
>>> I don't know how to read any of this.  It sounds like you're saying that a
>>> JVM is something that cannot be produced as a Free Software project,
>>> or maybe that you just really like Oracle, I'm honestly not sure.  It
>>> doesn't seem relevant though, because there is in fact a Free
>>> Software
>>> JVM 

Re: Cassandra 2.x Stability

2016-12-01 Thread Kai Wang
Ben, I just read through those two tickets. It's scarier than I thought.
Thank you for all the investigations and comments.

On Thu, Dec 1, 2016 at 10:31 AM, Benjamin Roth <benjamin.r...@jaumo.com>
wrote:

> A little experience report on MVs:
>
> We use them in production (3.10-trunk) and they work really well on normal
> read/write operations but streaming operations (bootstrap, repair, rebuild,
> decommission) can kill your cluster and/or your nerves.
> We will stay with MVs as we need them and want them.
> I rolled out a patch on MV streaming on our production cluster a few hours
> ago as we had problems with bootstrapping new nodes.
>
> Before:
> - Error log was completely flooded with WTEs
> - Bootstrap either failed due to exceptions or wasn't even close to finish
> after 24h - it just did not work
>
> After
> - Bootstrap finished without a single error log after less than 5:30h
>
> I started to roll out that patch to the whole cluster to see how repairs
> are affected. Will keep you updated.
>
> There is no dedicated JIRA issue assigned as it addresses multiple tickets
> like CASSANDRA-12905 + CASSANDRA-12888
>
>
> 2016-12-01 16:21 GMT+01:00 Jonathan Haddad <j...@jonhaddad.com>:
>
>> I agree with everything you just said, Kai.  I'd start a new project with
>> 3.0.10.  I'd stay away from MVs though.
>>
>> On Thu, Dec 1, 2016 at 10:19 AM Kai Wang <dep...@gmail.com> wrote:
>>
>>> Just based on a few observations on this list. Not one week goes by
>>> without people asking which release is the most stable on the 3.x line. Folks
>>> at Instaclustr also provide their own 3.x fork for stability issues, etc.
>>>
>>> We developers already have enough to think about. I really don't feel
>>> like spending time researching which release of C* I should choose. So for
>>> me, 2.2.x is the choice in production.
>>>
>>> That being said, I have nothing against 3.x. I do like its new storage
>>> engine. If I start a brand new project today with zero previous C*
>>> experience, I probably would choose 3.0.10 as my starting point. However if
>>> I were to upgrade to 3.x, I would have to test it thoroughly in a dev
>>> environment with real production load and monitor it very closely on
>>> performance, compaction, repair, bootstrap, replacing etc. Data is simply
>>> too important to take chances with.
>>>
>>>
>>> On Thu, Dec 1, 2016 at 9:38 AM, Shalom Sagges <shal...@liveperson.com>
>>> wrote:
>>>
>>> Hey Kai,
>>>
>>> Thanks for the info. Can you please elaborate on the reasons you'd pick
>>> 2.2.6 over 3.0?
>>>
>>>
>>> Shalom Sagges
>>> DBA
>>> T: +972-74-700-4035 <+972%2074-700-4035>
>>> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
>>> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
>>>
>>> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email_source=mkto_campaign=idcsig>
>>>
>>>
>>> On Thu, Dec 1, 2016 at 2:26 PM, Kai Wang <dep...@gmail.com> wrote:
>>>
>>> I have been running 2.2.6 in production. As of today I would still pick
>>> it over 3.x for production.
>>>
>>> On Nov 30, 2016 5:42 AM, "Shalom Sagges" <shal...@liveperson.com> wrote:
>>>
>>> Hi Everyone,
>>>
>>> I'm about to upgrade our 2.0.14 version to a newer 2.x version.
>>> At first I thought of upgrading to 2.2.8, but I'm not sure how stable it
>>> is, as I understand the 2.2 version was supposed to be a sort of beta
>>> version for 3.0 feature-wise, whereas 3.0 upgrade will mainly handle the
>>> storage modifications (please correct me if I'm wrong).
>>>
>>> So my question is, if I need a 2.x version (can't upgrade to 3 due to
>>> client considerations), which one should I choose, 2.1.x or 2.2.x? (I
>>> don't require any new features available in 2.2).
>>>
>>> Thanks!
>>>
>>> Shalom Sagges
>>> DBA
>>> T: +972-74-700-4035 <+972%2074-700-4035>
>>> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
>>> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
>>>
>>> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email_source=mkto_campaign=idcsig>
>>>
>>>
>>> This message may contain confidential and/or privileged information.

Re: Cassandra 2.x Stability

2016-12-01 Thread Kai Wang
Just based on a few observations on this list. Not one week goes by without
people asking which release is the most stable on the 3.x line. Folks at
Instaclustr also provide their own 3.x fork for stability issues, etc.

We developers already have enough to think about. I really don't feel like
spending time researching which release of C* I should choose. So for me,
2.2.x is the choice in production.

That being said, I have nothing against 3.x. I do like its new storage
engine. If I start a brand new project today with zero previous C*
experience, I probably would choose 3.0.10 as my starting point. However if
I were to upgrade to 3.x, I would have to test it thoroughly in a dev
environment with real production load and monitor it very closely on
performance, compaction, repair, bootstrap, replacing etc. Data is simply
too important to take chances with.


On Thu, Dec 1, 2016 at 9:38 AM, Shalom Sagges <shal...@liveperson.com>
wrote:

> Hey Kai,
>
> Thanks for the info. Can you please elaborate on the reasons you'd pick
> 2.2.6 over 3.0?
>
>
> Shalom Sagges
> DBA
> T: +972-74-700-4035 <+972%2074-700-4035>
> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
>
> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email_source=mkto_campaign=idcsig>
>
>
> On Thu, Dec 1, 2016 at 2:26 PM, Kai Wang <dep...@gmail.com> wrote:
>
>> I have been running 2.2.6 in production. As of today I would still pick
>> it over 3.x for production.
>>
>> On Nov 30, 2016 5:42 AM, "Shalom Sagges" <shal...@liveperson.com> wrote:
>>
>>> Hi Everyone,
>>>
>>> I'm about to upgrade our 2.0.14 version to a newer 2.x version.
>>> At first I thought of upgrading to 2.2.8, but I'm not sure how stable it
>>> is, as I understand the 2.2 version was supposed to be a sort of beta
>>> version for 3.0 feature-wise, whereas 3.0 upgrade will mainly handle the
>>> storage modifications (please correct me if I'm wrong).
>>>
>>> So my question is, if I need a 2.x version (can't upgrade to 3 due to
>>> client considerations), which one should I choose, 2.1.x or 2.2.x? (I
>>> don't require any new features available in 2.2).
>>>
>>> Thanks!
>>>
>>> Shalom Sagges
>>> DBA
>>> T: +972-74-700-4035 <+972%2074-700-4035>
>>> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
>>> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
>>>
>>> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email_source=mkto_campaign=idcsig>
>>>
>>>
>>> This message may contain confidential and/or privileged information.
>>> If you are not the addressee or authorized to receive this on behalf of
>>> the addressee you must not use, copy, disclose or take action based on this
>>> message or any information herein.
>>> If you have received this message in error, please advise the sender
>>> immediately by reply email and delete this message. Thank you.
>>>
>>
>
> This message may contain confidential and/or privileged information.
> If you are not the addressee or authorized to receive this on behalf of
> the addressee you must not use, copy, disclose or take action based on this
> message or any information herein.
> If you have received this message in error, please advise the sender
> immediately by reply email and delete this message. Thank you.
>


Re: Cassandra 2.x Stability

2016-12-01 Thread Kai Wang
I have been running 2.2.6 in production. As of today I would still pick it
over 3.x for production.

On Nov 30, 2016 5:42 AM, "Shalom Sagges"  wrote:

> Hi Everyone,
>
> I'm about to upgrade our 2.0.14 version to a newer 2.x version.
> At first I thought of upgrading to 2.2.8, but I'm not sure how stable it
> is, as I understand the 2.2 version was supposed to be a sort of beta
> version for 3.0 feature-wise, whereas 3.0 upgrade will mainly handle the
> storage modifications (please correct me if I'm wrong).
>
> So my question is, if I need a 2.x version (can't upgrade to 3 due to
> client considerations), which one should I choose, 2.1.x or 2.2.x? (I
> don't require any new features available in 2.2).
>
> Thanks!
>
> Shalom Sagges
> DBA
> T: +972-74-700-4035 <+972%2074-700-4035>
>  
>  We Create Meaningful Connections
>
> 
>
>
> This message may contain confidential and/or privileged information.
> If you are not the addressee or authorized to receive this on behalf of
> the addressee you must not use, copy, disclose or take action based on this
> message or any information herein.
> If you have received this message in error, please advise the sender
> immediately by reply email and delete this message. Thank you.
>


full repair or incremental repair after scrub?

2016-11-30 Thread Kai Wang
Hi, do I have to do a full repair after scrub? Is it enough to just do
incremental repair? BTW I do nightly incremental repair.


Re: Storing videos in cassandra

2016-11-19 Thread Kai Wang
IIRC, I watched a presentation where they said Netflix stores almost
everything in C* *except* video content and payment stuff.

That was 1-2 years ago. Not sure if it's still the case.

On Nov 14, 2016 12:03 PM, "raghavendra vutti" 
wrote:

> Hi,
>
>  Just wanted to know How does hulu or netflix store videos in cassandra.
>
> Do they just use references to the video files in the form of URL's and
> store in the DB??
>
> could someone please help me on this.
>
>
> Thanks,
> Raghavendra.
>
>
>
>
>
>
>
>
>
>
>


Re: Corrupt SSTABLE over and over

2016-08-17 Thread Kai Wang
This might not be good news to you. But my experience is that C*
2.X/Windows is not ready for production yet. I've seen various file system
related errors. And in one of the JIRAs I was told major work (or rework)
is done in 3.X to improve C* stability on Windows.

On Tue, Aug 16, 2016 at 3:44 AM, Bryan Cheng  wrote:

> Hi Alaa,
>
> Sounds like you have problems that go beyond Cassandra- likely filesystem
> corruption or bad disks. I don't know enough about Windows to give you any
> specific advice but I'd try a run of chkdsk to start.
>
> --Bryan
>
> On Fri, Aug 12, 2016 at 5:19 PM, Alaa Zubaidi (PDF) 
> wrote:
>
>> Hi Bryan,
>>
>> Changing disk_failure_policy to best_effort, and running nodetool scrub,
>> did not work, it generated another error:
>> java.nio.file.AccessDeniedException
>>
>> Also tried to remove all files (data, commitlog, savedcaches) and restart
>> the node fresh, and still I am getting corruption.
>>
>> and Still nothing that indicate there is a HW issue?
>> All other nodes are fine
>>
>> Regards,
>> Alaa
>>
>>
>> On Fri, Aug 12, 2016 at 12:00 PM, Bryan Cheng 
>> wrote:
>>
>>> Should also add that if the scope of corruption is _very_ large, and you
>>> have a good, aggressive repair policy (read: you are confident in the
>>> consistency of the data elsewhere in the cluster), you may just want to
>>> decommission and rebuild that node.
>>>
>>> On Fri, Aug 12, 2016 at 11:55 AM, Bryan Cheng 
>>> wrote:
>>>
 Looks like you're doing the offline scrub- have you tried online?

 Here's my typical process for corrupt SSTables.

 With disk_failure_policy set to stop, examine the failing sstables. If
 they are very small (in the range of kbs), it is unlikely that there is any
 salvageable data there. Just delete them, start the machine, and schedule a
 repair ASAP.

 If they are large, then it may be worth salvaging. If the scope of
 corruption is reasonable (limited to a few sstables scattered among
 different keyspaces), set disk_failure_policy to best_effort, start the
 machine up, and run the nodetool scrub. This is online scrub, faster than
 offline scrub (at least as of 2.1.12, the last time I had to do this).

 Only if all else fails, attempt the very painful offline sstablescrub.

 Is the VMware client Windows? (Trying to make sure it's not just the
 host). YMMV but in the past Windows was somewhat of a neglected platform
 wrt Cassandra. I think you'd have a lot easier time getting help if running
 Linux is an option here.



 On Fri, Aug 12, 2016 at 9:16 AM, Alaa Zubaidi (PDF) <
 alaa.zuba...@pdf.com> wrote:

> Hi Jason,
>
> Thanks for your input...
> That's what I am afraid of.
> Did you find any HW error in the VMware and HW logs? Any indication
> that the HW is the reason? I need to make sure that this is the reason
> before asking the customer to spend more money.
>
> Thanks,
> Alaa
>
> On Thu, Aug 11, 2016 at 11:02 PM, Jason Wee 
> wrote:
>
>> cassandra run on virtual server (vmware)?
>>
>> > I tried sstablescrub but it crashed with hs-err-pid-...
>> maybe try with larger heap allocated to sstablescrub
>>
>> I ran into this sstable corruption as well (on Cassandra 1.2). First I
>> tried nodetool scrub and it persisted, then offline sstablescrub and it still
>> persisted; I wiped the node and it happened again. Then I changed the hardware
>> (disk and memory) and things went well.
>>
>> hth
>>
>> jason
>>
>>
>> On Fri, Aug 12, 2016 at 9:20 AM, Alaa Zubaidi (PDF)
>>  wrote:
>> > Hi,
>> >
>> > I have a 16 Node cluster, Cassandra 2.2.1 on Windows, local
>> installation
>> > (NOT on the cloud)
>> >
>> > and I am getting
>> > ERROR [CompactionExecutor:2] 2016-08-12 06:51:52,983
>> > CassandraDaemon.java:183 - Exception in thread Thread[CompactionExecutor:2,1,main]
>> > org.apache.cassandra.io.FSReadError:
>> > org.apache.cassandra.io.sstable.CorruptSSTableException:
>> > org.apache.cassandra.io.compress.CorruptBlockException:
>> > (E:\\la-4886-big-Data.db): corruption detected, chunk at 4969092 of
>> > length 10208.
>> > at
>> > org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:357)
>> > ~[apache-cassandra-2.2.1.jar:2.2.1]
>> > 
>> > 
>> > ERROR [CompactionExecutor:2] ... FileUtils.java:463 - Exiting
>> > forcefully due to file system exception on startup, disk failure
>> > policy "stop"
>> >
>> > I tried sstablescrub but it crashed with hs-err-pid-...
>> > I removed the corrupted file and started the Node again, after one
>> day the
>> > corruption came back again, 

Re: sstableloader

2016-08-17 Thread Kai Wang
Yes, you are correct.
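
For the archive, here is a minimal sketch of what that can look like, assuming
the sstables on each client machine sit in a keyspace/table directory layout
(host names and paths below are placeholders, not from the original thread):

  # on client machine A
  sstableloader -d 10.0.0.1,10.0.0.2 /bulk_a/my_keyspace/my_table
  # on client machine B, streaming a different set of sstables for the same table
  sstableloader -d 10.0.0.1,10.0.0.2 /bulk_b/my_keyspace/my_table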

On Tue, Aug 16, 2016 at 2:37 PM, Jean Tremblay <
jean.tremb...@zen-innovations.com> wrote:

> Hi,
>
> I’m using Cassandra 3.7.
>
> In the documentation for sstableloader I read the following:
>
> << Note: To get the best throughput from SSTable loading, you can use
> multiple instances of sstableloader to stream across multiple machines. No
> hard limit exists on the number of SSTables that sstableloader can run at
> the same time, so you can add additional loaders until you see no further
> improvement.>>
>
> Does this mean that I can stream my sstables to my cluster from many
> instances of sstableloader running simultaneously on many client machines?
>
> I ask because I would like to improve the transfer speed of my sstables to
> my cluster.
>
> Kind regards and thanks for your comments.
>
> Jean
>


Re: Cassandra monitoring

2016-06-15 Thread Kai Wang
I use graphite/jmxtrans/collectd to monitor not just Cassandra but also
other JVM applications as well as the OS. I found it more useful and flexible
than OpsCenter in terms of monitoring.
On Jun 14, 2016 3:10 PM, "Arun Ramakrishnan" 
wrote:

What are the options for a very small and nimble startup to keep a
Cassandra cluster running well oiled? We are on AWS. We are interested in a
monitoring tool and potentially also cluster management tools.

We are currently on Apache Cassandra 3.7. We were hoping the DataStax
OpsCenter would be it (it is free for startups our size). But it looks like
it does not support Cassandra versions greater than v2.1. It is pretty
surprising considering Cassandra v2.1 came out in 2014.

We would consider downgrading to DataStax Cassandra 2.1 just to have robust
monitoring tools. But I am not sure if having OpsCenter offsets all the
improvements that have been added to Cassandra since 2.1.

Sematext has integrations for monitoring Cassandra. Does anyone have good
experience with it?

How much work would be involved to set up Ganglia or some such option for
Cassandra?

Thanks,
Arun


Re: how long does "nodetool upgradesstables" take?

2016-06-04 Thread Kai Wang
Jeff,

Thank you very much for the answers.
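
For the archive, a quick way to confirm the on-disk format version Jeff
describes below (the data directory path is the default one and only an
assumption):

  ls /var/lib/cassandra/data/my_keyspace/my_table-*/*-Data.db
  # e.g. la-4886-big-Data.db -> the "la" prefix means the 2.2 sstable format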

-Kai

On Sat, Jun 4, 2016 at 3:39 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
wrote:

> “It takes as long as necessary to rewrite any sstable that needs to be
> upgraded”.
>
> From 2.2.4 to 2.2.6, the sstable format did not change, so there’s nothing
> to upgrade.
>
> If you want to force the matter (and you probably don't), 'nodetool
> upgradesstables -a' will rewrite them again, but you gain virtually nothing
> in the process.
>
> The version on disk will be a two letter sequence in the -Data.db file
> name: 2.0 uses -jb, 2.1 uses -ka, 2.2 uses -la, and so on. You should see
> files with -la in the name on 2.2.4 and 2.2.6.
>
> If a new format was added to 2.2, it would likely be -lb (-l for 2.2, b
> for the second format), and it would be documented at
> https://github.com/apache/cassandra/blob/cassandra-2.2/NEWS.txt
>
> - Jeff
>
> From: Kai Wang
> Reply-To: "user@cassandra.apache.org"
> Date: Saturday, June 4, 2016 at 9:36 AM
> To: "user@cassandra.apache.org"
> Subject: how long does "nodetool upgradesstables" take?
>
> I just upgraded C* from 2.2.4 to 2.2.6. I ran "nodetool upgradesstables"
> and it returned within a few seconds. Does this sound right? A few
> questions:
>
> 1. Is it possible that the sstable formats are the same between those versions,
> which is why upgradesstables took almost no time?
> 2. Is there a way to confirm the sstable version on disk?
> 3. Is there a way to know if the sstable format has changed from version to
> version?
>
> Thanks.
>


how long does "nodetool upgradesstables" take?

2016-06-04 Thread Kai Wang
I just upgraded C* from 2.2.4 to 2.2.6. I ran "nodetool upgradesstables" and
it returned within a few seconds. Does this sound right? A few questions:

1. Is it possible that the sstable formats are the same between those versions,
which is why upgradesstables took almost no time?
2. Is there a way to confirm the sstable version on disk?
3. Is there a way to know if the sstable format has changed from version to
version?

Thanks.


Re: Out of memory issues

2016-05-27 Thread Kai Wang
Paolo,

Try a few things in cassandra-env.sh:
1. HEAP_NEWSIZE="2G". "The 100mb/core commentary in cassandra-env.sh for
setting HEAP_NEWSIZE is *wrong*" (
https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html)
2. MaxTenuringThreshold=8
3. enable GC logging (under "# GC logging options -- uncomment to enable"
section) to compare GC behaviors on good and bad nodes.
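
As a rough sketch, the three changes above would look something like this in
cassandra-env.sh (the values are illustrative, not a recommendation for every
workload):

  # 1. larger new generation than the 100mb/core comment suggests
  HEAP_NEWSIZE="2G"
  # 2. let objects age longer in the young generation before promotion
  JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
  # 3. GC logging, normally commented out in cassandra-env.sh
  JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
  JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
  JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
  JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"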


On Fri, May 27, 2016 at 5:36 AM, Paolo Crosato <
paolo.cros...@targaubiest.com> wrote:

> Hi,
>
> thanks for the answer. There were no large insertions and the saved_caches
> dir had a reasonable size. I tried to delete the caches and set
> key_cache_size_in_mb to zero, but it didn't help.
> Today our virtual hardware provider raised CPUs to 4, memory to 32GB and
> doubled the disk size, and the nodes are stable again. So it was probably
> an issue of severe lack of resources.
> About HEAP_NEWSIZE, your suggestion is quite intriguing. I thought it was
> better to set it to 100MB * #cores, so in my case I set it to 200 and now I
> should set it to 400. Do larger values help without being harmful?
>
> Regards,
>
> Paolo
>
>
> On 27/05/2016 03:05, Mike Yeap wrote:
>
> Hi Paolo,
>
> a) was there any large insertion done?
> b) are there a lot of files in the saved_caches directory?
> c) would you consider increasing the HEAP_NEWSIZE to, say, 1200M?
>
>
> Regards,
> Mike Yeap
>
> On Fri, May 27, 2016 at 12:39 AM, Paolo Crosato <
> paolo.cros...@targaubiest.com> wrote:
>
>> Hi,
>>
>> we are running a cluster of 4 nodes, each one has the same sizing: 2
>> cores, 16G ram and 1TB of disk space.
>>
>> On every node we are running cassandra 2.0.17, oracle java version
>> "1.7.0_45", centos 6 with this kernel version 2.6.32-431.17.1.el6.x86_64
>>
>> Two nodes are running just fine, the other two have started to go OOM at
>> every start.
>>
>> This is the error we get:
>>
>> INFO [ScheduledTasks:1] 2016-05-26 18:15:58,460 StatusLogger.java (line
>> 70) ReadRepairStage   0 0116
>> 0 0
>>  INFO [ScheduledTasks:1] 2016-05-26 18:15:58,462 StatusLogger.java (line
>> 70) MutationStage31  1369  20526
>> 0 0
>>  INFO [ScheduledTasks:1] 2016-05-26 18:15:58,590 StatusLogger.java (line
>> 70) ReplicateOnWriteStage 0 0  0
>> 0 0
>>  INFO [ScheduledTasks:1] 2016-05-26 18:15:58,591 StatusLogger.java (line
>> 70) GossipStage   0 0335
>> 0 0
>>  INFO [ScheduledTasks:1] 2016-05-26 18:16:04,195 StatusLogger.java (line
>> 70) CacheCleanupExecutor  0 0  0
>> 0 0
>>  INFO [ScheduledTasks:1] 2016-05-26 18:16:06,526 StatusLogger.java (line
>> 70) MigrationStage0 0  0
>> 0 0
>>  INFO [ScheduledTasks:1] 2016-05-26 18:16:06,527 StatusLogger.java (line
>> 70) MemoryMeter   1 4 26
>> 0 0
>>  INFO [ScheduledTasks:1] 2016-05-26 18:16:06,527 StatusLogger.java (line
>> 70) ValidationExecutor0 0  0
>> 0 0
>> DEBUG [MessagingService-Outgoing-/10.255.235.19] 2016-05-26 18:16:06,518
>> OutboundTcpConnection.java (line 290) attempting to connect to /
>> 10.255.235.19
>>  INFO [GossipTasks:1] 2016-05-26 18:16:22,912 Gossiper.java (line 992)
>> InetAddress /10.255.235.28 is now DOWN
>>  INFO [ScheduledTasks:1] 2016-05-26 18:16:22,952 StatusLogger.java (line
>> 70) FlushWriter   1 5 47
>> 025
>>  INFO [ScheduledTasks:1] 2016-05-26 18:16:22,953 StatusLogger.java (line
>> 70) InternalResponseStage 0 0  0
>> 0 0
>> ERROR [ReadStage:27] 2016-05-26 18:16:29,250 CassandraDaemon.java (line
>> 258) Exception in thread Thread[ReadStage:27,5,main]
>> java.lang.OutOfMemoryError: Java heap space
>> at
>> org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:347)
>> at
>> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
>> at
>> org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
>> at
>> org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:124)
>> at
>> org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:85)
>> at org.apache.cassandra.db.Column$1.computeNext(Column.java:75)
>> at org.apache.cassandra.db.Column$1.computeNext(Column.java:64)
>> at
>> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>> at
>> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>> at
>> com.google.common.collect.AbstractIterator.next(AbstractIterator.java:153)
>> at
>> 

Re: Setting bloom_filter_fp_chance < 0.01

2016-05-19 Thread Kai Wang
With 50 billion rows and bloom_filter_fp_chance = 0.01, the bloom filters will
consume a lot of off-heap memory. You may want to take that into
consideration too.
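
As a back-of-envelope estimate using the standard bloom filter formula
(bits per filter ~= -n * ln(p) / (ln 2)^2), with the numbers from this thread
plugged in; treat it as a rough sketch, since the real footprint depends on how
keys are spread across sstables and nodes:

  awk 'BEGIN {
    n = 50e9;                          # ~50 billion partition keys
    p = 0.01;                          # bloom_filter_fp_chance
    bits = -n * log(p) / (log(2) ^ 2); # total bits for one copy of the data
    printf "%.1f bits per key, ~%.0f GiB of off-heap bloom filter per copy\n",
           bits / n, bits / (8 * 1024 ^ 3);
  }'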

On Wed, May 18, 2016 at 11:53 PM, Adarsh Kumar  wrote:

> Hi Sai,
>
> We have a use case where we are designing a table that is going to have
> around 50 billion rows and we require very fast reads. Partitions are not
> that complex/big; each has
> some validation data for duplicate checks (consisting of 4-5 int and
> varchar columns). So we were trying various options to optimize read performance.
> Apart from tuning Bloom Filter we are trying following thing:
>
> 1). Better data modelling (making appropriate partition and clustering
> keys)
> 2). Trying Leveled compaction (changing data model for this one)
>
> Jonathan,
>
> I understand that tuning bloom_filter_fp_chance will not have a drastic
> performance gain.
> But this is one of the many things we are trying.
> Please let me know if you have any other suggestions to improve read
> performance for this volume of data.
>
> Also please let me know any performance benchmark technique (currently we
> are planning to trigger massive reads from Spark and check cfstats).
>
> NOTE: we will be deploying DSE on EC2, so please suggest if you have
> anything specific to DSE and EC2.
>
> Adarsh
>
> On Wed, May 18, 2016 at 9:45 PM, Jonathan Haddad 
> wrote:
>
>> The impact is it'll get massively bigger with very little performance
>> benefit, if any.
>>
>> You can't get 0 because it's a probabilistic data structure.  It tells
>> you either:
>>
>> your data is definitely not here
>> your data has a pretty decent chance of being here
>>
>> but never "it's here for sure"
>>
>> https://en.wikipedia.org/wiki/Bloom_filter
>>
>> On Wed, May 18, 2016 at 11:04 AM sai krishnam raju potturi <
>> pskraj...@gmail.com> wrote:
>>
>>> hi Adarsh;
>>> were there any drawbacks to setting the bloom_filter_fp_chance  to
>>> the default value?
>>>
>>> thanks
>>> Sai
>>>
>>> On Wed, May 18, 2016 at 2:21 AM, Adarsh Kumar 
>>> wrote:
>>>
 Hi,

 What is the impact of setting bloom_filter_fp_chance < 0.01.

 During performance tuning I was trying to tune bloom_filter_fp_chance
 and have following questions:

 1). Why bloom_filter_fp_chance = 0 is not allowed. (
 https://issues.apache.org/jira/browse/CASSANDRA-5013)
 2). What is the maximum/recommended value of bloom_filter_fp_chance (if
 we do not have any limitation for bloom filter size).

 NOTE: We are using default SizeTieredCompactionStrategy on
 cassandra  2.1.8.621

 Thanks in advance..:)

 Adarsh Kumar

>>>
>>>
>


Bloom filter memory usage disparity

2016-05-03 Thread Kai Wang
Hi,

I have a table on a 3-node cluster. I noticed bloom filter memory usage is
very different on one of the nodes. For a given table, I checked
CassandraMetricsRegistry$JmxGauge.[table]_BloomFilterOffHeapMemoryUsed.Value.
2 of 3 nodes show 1.5 GB while the other shows 2.5 GB.

What could be the reason?

That table is using LCS.
bloom_filter_fp_chance=0.1
That table has about 16M keys and 140GB of data.
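
One low-tech way to compare that figure across nodes without going through JMX
directly is nodetool cfstats, which prints the bloom filter off-heap usage per
table (host and table names below are placeholders):

  for host in node1 node2 node3; do
    echo "== $host =="
    ssh "$host" nodetool cfstats my_keyspace.my_table | grep -i 'bloom filter'
  done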

Thanks.


Re: Cassandra table limitation

2016-04-05 Thread Kai Wang
With small data size and unknown access pattern, any particular reason to
choose C*? It sounds like a relational database fits better.

On Tue, Apr 5, 2016 at 11:40 PM, jason zhao yang <
zhaoyangsingap...@gmail.com> wrote:

> Hi Jack,
>
> Thanks for the reply.
>
> Each tenant will have around 50-100 tables for their applications: probably
> log collection, probably an account table; it's not fixed and depends on
> tenants' needs.
>
> There will be a team in charge of helping tenants with data modeling and
> access patterns. Tenants will not directly administer the cluster; we will
> take care of that.
>
> Yes, multi-cluster is a solution. But the cost will be quite high, because
> each tenant's data is far less than the capacity of a 3 node cluster. So I
> want to put multiple tenants into one clusters.
>
>
>
> On Wed, Apr 6, 2016 at 10:41 AM, Jack Krupansky <jack.krupan...@gmail.com> wrote:
>
>> What is the nature of these tenants? Are they each creating their own
>> data models? Is there one central authority that will approve of all data
>> models and who can adjust the cluster configuration to support those models?
>>
>> Generally speaking, multi-tenancy is an anti-pattern for Cassandra and
>> for most servers. The proper way to do multitenancy is to not do it at all,
>> and to use separate machines or at least separate virtual machines.
>>
>> In particular, there needs to be a central authority managing a Cassandra
>> cluster to assure its smooth operation. If each tenant is going in their
>> own directions, then nobody will be in charge and capable of assuring that
>> everybody is on the same page.
>>
>> Again, it depends on the nature of these tenants and how much control the
>> cluster administrator has over them.
>>
>> Think of a Cassandra cluster as managing the data for either a single
>> application or a collection of applications which share the same data. If
>> there are multiple applications that don't share the same data, then they
>> absolutely should be on separate clusters.
>>
>>
>> -- Jack Krupansky
>>
>> On Tue, Apr 5, 2016 at 5:40 PM, Kai Wang <dep...@gmail.com> wrote:
>>
>>> Once in a while the question about table count comes up on this list. The most
>>> recent is
>>> https://groups.google.com/forum/#!topic/nosql-databases/IblAhiLUXdk
>>>
>>> In short, C* is not designed to scale with the table count. For one, each
>>> table/CF has some fixed memory footprint on *ALL* nodes. The consensus is
>>> you shouldn't have more than "a few hundreds" of tables.
>>>
>>> On Mon, Apr 4, 2016 at 10:17 AM, jason zhao yang <
>>> zhaoyangsingap...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> This is Jason.
>>>>
>>>> Currently, I am using C* 2.1.10, I want to ask what's the optimal
>>>> number of tables I should create in one cluster?
>>>>
>>>> My use case is that I will prepare a keyspace for each of my tenant,
>>>> and every tenant will create the tables they need. Assume each tenant created
>>>> 50 tables with normal workload (half read, half write), so how many
>>>> tenants can I support in one cluster?
>>>>
>>>> I know there are a few issues related to large number of tables.
>>>> * frequent GC
>>>> * frequent flush due to insufficient memory
>>>> * large latency when modifying table schema
>>>> * large amount of tombstones during creating table
>>>>
>>>> Is there any other issues with large number of tables? Using a 32GB
>>>> instance, I can easily create 4000 tables with off-heap-memtable.
>>>>
>>>> BTW, Is this table limitation solved in 3.X?
>>>>
>>>> Thank you very much.
>>>>
>>>>
>>>
>>


Re: Cassandra table limitation

2016-04-05 Thread Kai Wang
Once in a while the question about table count comes up on this list. The most
recent is
https://groups.google.com/forum/#!topic/nosql-databases/IblAhiLUXdk

In short, C* is not designed to scale with the table count. For one, each
table/CF has some fixed memory footprint on *ALL* nodes. The consensus is
you shouldn't have more than "a few hundreds" of tables.

On Mon, Apr 4, 2016 at 10:17 AM, jason zhao yang <
zhaoyangsingap...@gmail.com> wrote:

> Hi,
>
> This is Jason.
>
> Currently, I am using C* 2.1.10, I want to ask what's the optimal number
> of tables I should create in one cluster?
>
> My use case is that I will prepare a keyspace for each of my tenant, and
> every tenant will create the tables they need. Assume each tenant created 50
> tables with normal workload (half read, half write), so how many
> tenants can I support in one cluster?
>
> I know there are a few issues related to large number of tables.
> * frequent GC
> * frequent flush due to insufficient memory
> * large latency when modifying table schema
> * large amount of tombstones during creating table
>
> Is there any other issues with large number of tables? Using a 32GB
> instance, I can easily create 4000 tables with off-heap-memtable.
>
> BTW, Is this table limitation solved in 3.X?
>
> Thank you very much.
>
>


Re: Inconsistent query results and node state

2016-03-30 Thread Kai Wang
Do you have NTP set up on all nodes?
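
For reference, a couple of quick checks on each node (which one applies depends
on the distro and the NTP daemon in use):

  ntpq -p       # peer list and offsets if ntpd is running
  timedatectl   # on systemd systems, look for the NTP synchronized line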

On Tue, Mar 29, 2016 at 11:48 PM, Jason Kania  wrote:

> We have encountered a query inconsistency problem wherein the following
> query returns different results sporadically with invalid values for a
> timestamp field looking like the field is uninitialized (a zero timestamp)
> in the query results.
>
> Attempts to repair and compact have not changed the results.
>
> select "subscriberId","sensorUnitId","sensorId","time" from
> "sensorReadingIndex" where "subscriberId"='JASKAN' AND "sensorUnitId"=0 AND
> "sensorId"=0 ORDER BY "time" LIMIT 10;
>
> Invalid Query Results
> subscriberId | sensorUnitId | sensorId | time
> JASKAN       | 0            | 0        | 2015-05-24 2:09
> JASKAN       | 0            | 0        | *1969-12-31 19:00*
> JASKAN       | 0            | 0        | 2016-01-21 2:10
> JASKAN       | 0            | 0        | 2016-01-21 2:10
> JASKAN       | 0            | 0        | 2016-01-21 2:10
> JASKAN       | 0            | 0        | 2016-01-21 2:11
> JASKAN       | 0            | 0        | 2016-01-21 2:22
> JASKAN       | 0            | 0        | 2016-01-21 2:22
> JASKAN       | 0            | 0        | 2016-01-21 2:22
> JASKAN       | 0            | 0        | 2016-01-21 2:22
>
> Valid Query Results
> subscriberId | sensorUnitId | sensorId | time
> JASKAN       | 0            | 0        | 2015-05-24 2:09
> JASKAN       | 0            | 0        | 2015-05-24 2:09
> JASKAN       | 0            | 0        | 2015-05-24 2:10
> JASKAN       | 0            | 0        | 2015-05-24 2:10
> JASKAN       | 0            | 0        | 2015-05-24 2:10
> JASKAN       | 0            | 0        | 2015-05-24 2:10
> JASKAN       | 0            | 0        | 2015-05-24 2:11
> JASKAN       | 0            | 0        | 2015-05-24 2:13
> JASKAN       | 0            | 0        | 2015-05-24 2:13
> JASKAN       | 0            | 0        | 2015-05-24 2:14
>
> We have confirmed that the 1969-12-31 timestamp is not within the data
> based on running a number of queries, so it looks like the invalid
> timestamp value is generated by the query. The query below returns no row.
>
> select * from "sensorReadingIndex" where "subscriberId"='JASKAN' AND
> "sensorUnitId"=0 AND "sensorId"=0 AND time='1969-12-31 19:00:00-0500';
>
> No logs are coming out but the following was observed intermittently in
> the tracing output, but not correlated to the invalid query results:
>
>  Digest mismatch: org.apache.cassandra.service.DigestMismatchException:
> Mismatch for key DecoratedKey(-7563144029910940626,
> 00064a41534b414e040400)
> (be22d379c18f75c2f51dd6942d2f9356 vs da4e95d571b41303b908e0c5c3fff7ba)
> [ReadRepairStage:3179] | 2016-03-29 23:12:35.025000 | 192.168.10.10 |
>
> An error from the debug log that might be related is:
>
> org.apache.cassandra.service.DigestMismatchException: Mismatch for key
> DecoratedKey(-4908797801227889951, 4a41534b414e)
> (6a6c8ab013d7757e702af50cbdae045c vs 2ece61a01b2a640ac10509f4c49ae6fb)
> at
> org.apache.cassandra.service.DigestResolver.resolve(DigestResolver.java:85)
> ~[apache-cassandra-3.0.3.jar:3.0.3]
> at
> org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:225)
> ~[apache-cassandra-3.0.3.jar:3.0.3]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [na:1.8.0_74]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [na:1.8.0_74]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_74]
>
> The tracing files are attached and seem to show that in the failed case,
> content is skipped because of tombstones, if we understand it correctly.
> This could be an inconsistency problem on 192.168.10.9. Unfortunately,
> attempts to compact on 192.168.10.9 only give the following error without
> any stack trace detail and are not fixed with repair.
>
> root@cutthroat:/usr/local/bin/analyzer/bin# nodetool compact
> error: null
> -- StackTrace --
> java.lang.ArrayIndexOutOfBoundsException
>
> Any suggestions on how to fix or what to search for would be much
> appreciated.
>
> Thanks,
>
> Jason
>
>
>
>


Re: Acceptable repair time

2016-03-29 Thread Kai Wang
IIRC when we switched to LCS and ran the first full repair with 250GB/RF=3,
it took at least 12 hours for the repair to finish, then another 3+ days
for all the compaction to catch up. I called it "the big bang of LCS".

Since then we've been running nightly incremental repair.
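
As a rough sketch, the nightly run is just a cron entry on each node (time,
flags and log path are placeholders; stagger the start time per node so repairs
don't overlap):

  # crontab entry
  0 2 * * * /usr/bin/nodetool repair -inc -par >> /var/log/cassandra/repair.log 2>&1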

For me, as long as it's reliable (no streaming errors, better progress
reporting etc.), I actually don't mind if it takes more than a few hours to
do a full repair. But I am not sure about 4 days... I guess it depends on
the size of the cluster and data...

On Tue, Mar 29, 2016 at 6:04 AM, Anishek Agarwal  wrote:

> I would really like to know the answer for above because on some nodes
> repair takes almost 4 days for us :(.
>
> On Tue, Mar 29, 2016 at 8:34 AM, Jack Krupansky 
> wrote:
>
>> Someone recently asked me for advice when their repair time was 2-3 days.
>> I thought that was outrageous, but not unheard of. Personally, to me, 2-3
>> hours would be about the limit of what I could tolerate, and my personal
>> goal would be that a full repair of a node should take no longer than an
>> hour, maybe 90 minutes tops. But... achieving those more abbreviated repair
>> times would strongly suggest that the amount of data on each node be kept
>> down to a tiny fraction of a typical spinning disk drive, or even a
>> fraction of a larger SSD drive.
>>
>> So, my question here is what people consider acceptable full repair times
>> for nodes and what the resulting node data size is.
>>
>> What impact vnodes has on these numbers is a bonus question.
>>
>> Thanks!
>>
>> -- Jack Krupansky
>>
>
>


Re: Query regarding CassandraJavaRDD while running spark job on cassandra

2016-03-24 Thread Kai Wang
I suggest you post this to spark-cassandra-connector list.

On Sat, Mar 12, 2016 at 12:52 AM, Siddharth Verma <
verma.siddha...@snapdeal.com> wrote:

> In cassandra I have a table with the following schema.
>
> CREATE TABLE my_keyspace.my_table1 (
> col_1 text,
> col_2 text,
> col_3 text,
> col_4 text,
> col_5 text,
> col_6 text,
> col_7 text,
> PRIMARY KEY (col_1, col_2, col_3)
> ) WITH CLUSTERING ORDER BY (col_2 ASC, col_3 ASC);
>
> For processing I create a spark job.
>
> CassandraJavaRDD data1 =
> function.cassandraTable("my_keyspace", "my_table1")
>
>
> 1. Does it guarantee mutual exclusivity of fetched rows across all RDDs
> which are on worker nodes?
> (At the cost of redundancy and verbosity, I will reiterate.
> Suppose I have an entry in the table : ('1','2','3','4','5','6','7')
> What I mean to ask is, when I perform transformations/actions on data1
> RDD), can I be sure that the above entry will be present on ONLY ONE worker
> node?)
>
> 2. All the data pertaining to one partition will be on one node?
> (Suppose I have the following entries in the table :
> ('p1','c2_1','c3_1','4','5','6','7')
> ('p1','c2_2','c3'_2,'4','5','6','7')
> ('p1','c2_3','c3_3','4','5','6','7')
> ('p1','c2_4','c3_4','4','5','6','7')
> ('p1' )
> ('p1' )
> ('p1' )
> All the data for the same partition will be present on only one node?
> )
>
> 3. If i have a DC specifically for analytics, and I place the spark worker
> on the same machines as cassandra node, for that entire DC.
> Can I make sure that the spark worker fetches the data from the token
> range present on that node? (I.e. the node doesn't fetch data present on a
> different node.)
> 3.1 (as with the above statement which doesn't have a 'where' clause).
> 3.2 (as with the above statement which has a 'where' clause).
>


Re: Rows with same key

2016-02-11 Thread Kai Wang
Are you supplying timestamps from the client side? Are clocks in sync across
your nodes?


On Thu, Feb 11, 2016 at 11:52 AM, Yulian Oifa  wrote:

> Hello to all
> I have multiple rows with the same id on one of the CFs; one row is completely
> empty, another one has values.
> Values are written into the new row, however they are retrieved from the old
> row...
> I guess one row was created due to removed values, and got stuck somehow.
> I am trying to remove it with no luck (compact, flush, repair, etc.).
> I have set gc grace on this CF, however I believe the old row has the old
> value.
> How can i get rid of this row?
> Best regards
> Yulian Oifa
>


Re: 3k sstables during a repair incremental !!

2016-02-10 Thread Kai Wang
Jean,

What does your cfstats output look like? Especially the "SSTables in each level" line.
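
Something along these lines, using the table names from your message:

  nodetool cfstats pns_nonreg_bench.cf1 pns_nonreg_bench.cf3 | grep -E 'Table|SSTables in each level'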

On Wed, Feb 10, 2016 at 8:33 AM, Jean Carlo 
wrote:

> Hello guys!
>
> I am testing the repair inc in my custer cassandra. I am doing my test
> over these tables
>
> *CREATE TABLE pns_nonreg_bench.cf3* (
> s text,
> sp int,
> d text,
> dp int,
> m map,
> t timestamp,
> PRIMARY KEY (s, sp, d, dp)
> ) WITH CLUSTERING ORDER BY (sp ASC, d ASC, dp ASC)
>
> AND compaction = {'class': 'org.apache.cassandra.db.compaction.
> *LeveledCompactionStrategy*'}
> AND compression = {'sstable_compression':
> 'org.apache.cassandra.io.compress.*SnappyCompressor*'}
>
> *CREATE TABLE pns_nonreg_bench.cf1* (
> ise text PRIMARY KEY,
> int_col int,
> text_col text,
> ts_col timestamp,
> uuid_col uuid
> ) WITH bloom_filter_fp_chance = 0.01
>  AND compaction = {'class': 'org.apache.cassandra.db.compaction.
> *LeveledCompactionStrategy*'}
> AND compression = {'sstable_compression':
> 'org.apache.cassandra.io.compress.*SnappyCompressor*'}
>
>
>
> *table cf1Space used (live): 665.7 MB*
>
> *table cf2*
> *Space used (live): 697.03 MB*
>
> It happens that when I do repair -inc -par on these tables, *cf2 got a
> peak of 3k sstables*. When the repair finishes, it takes 30 min or more to
> finish all the compactions and return to 6 sstables.
>
> I am a little concerned about whether this will happen in production. Is it
> normal?
>
> Saludos
>
> Jean Carlo
>
> "The best way to predict the future is to invent it" Alan Kay
>


Re: missing rows while importing data using sstable loader

2016-01-29 Thread Kai Wang
Arindam,

what's the table schema and what does your query to retrieve the rows look
like?

On Fri, Jan 29, 2016 at 7:33 AM, Arindam Choudhury <
arindam.choudh...@ackstorm.com> wrote:

> Hi,
>
> I am importing data to a new cassandra cluster using sstableloader. The
> sstableloader runs without any warning or error. But I am missing around
> 1000 rows.
>
> Any feedback will be highly appreciated.
>
> Kind Regards,
> Arindam Choudhury
>


Re: Detailed info on how inter dc rep works

2016-01-28 Thread Kai Wang
John,

There was a thread last month about this topic.

https://mail-archives.apache.org/mod_mbox/incubator-cassandra-user/201512.mbox/%3CCABWW=xw9obk+w-4efpymnpo_fy8dbilbgv2fk-9xre7ydy2...@mail.gmail.com%3E



On Thu, Jan 28, 2016 at 7:51 PM, John Lonergan 
wrote:

> If I have a single client publishing to a cluster with replication to a
> second cluster in another dc, then do the changes become visible in the
> second dc in the same order that they became visible in the first dc?
>
>


Re: Cassandra 2015 Summit videos

2016-01-23 Thread Kai Wang
Check out https://vimeopro.com/user35188327/cassandra-summit-2015.

Although this list is about Cassandra, not DataStax, I still want to comment
a little bit about the 2015 summit videos. I prefer this format:
https://www.youtube.com/user/PlanetCassandra/playlists but I don't know why
DataStax stopped doing it on YouTube. Maybe it's on YouTube
somewhere but I can't find it. Vimeo's platform is not as well organized
and the playback performance is not good on some mobile devices. I watched
80% of the videos in 2014 but only a handful in 2015. Such a pity.

On Sat, Jan 23, 2016 at 1:41 PM, Jan  wrote:

> HI Folks
>
> could you please point me to the  *2015 *Cassandra summit held in
> California.
> I do see the ones posted for the 2014 & 2013 conferences.
>
> Thanks
> Jan
>


Re: compaction throughput

2016-01-21 Thread Kai Wang
I am using 2.2.4 and have seen multiple compactors running on the same
table. The number of compactors seems to be controlled by
concurrent_compactors. As for types of compactions, I've seen normal
compaction and tombstone compaction. Validation and anticompaction seem to
always be single threaded.
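
For what it's worth, the throttle can also be checked and changed at runtime,
which makes it easy to see whether it is the limiting factor (a quick sketch):

  nodetool getcompactionthroughput    # current throttle in MB/s
  nodetool setcompactionthroughput 0  # 0 = unthrottled, for a temporary experiment
  nodetool compactionstats            # watch how many compactors are actually busy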

On Thu, Jan 21, 2016 at 8:28 AM, PenguinWhispererThe . <
th3penguinwhispe...@gmail.com> wrote:

> Thanks for that clarification, Sebastian! That's really good to know! I
> never took increasing this value into consideration because of my previous
> experience.
>
> In my case I had a table that was compacting over and over... and only one
> CPU was used. So that made me believe it was not multithreaded (I actually
> believe I asked this on IRC however it's been a few months ago so I might
> be wrong).
>
> Have there been behavioral changes on this lately? (I was using 2.0.9 or
> 2.0.11 I believe).
>
> 2016-01-21 14:15 GMT+01:00 Sebastian Estevez <
> sebastian.este...@datastax.com>:
>
>> >So compaction of one table will NOT spread over different cores.
>>
>> This is not exactly true. You actually can have multiple compactions
>> running at the same time on the same table, it just doesn't happen all that
>> often. You essentially would have to have two sets of sstables that are
>> both eligible for compactions at the same time.
>>
>> all the best,
>>
>> Sebastián
>> On Jan 21, 2016 7:41 AM, "PenguinWhispererThe ." <
>> th3penguinwhispe...@gmail.com> wrote:
>>
>>> After having some issues myself with compaction I think it's noteworthy
>>> to explicitly state that compaction of a table can only run on one CPU. So
>>> compaction of one table will NOT spread over different cores.
>>> To really have use of concurrent_compactors you need to have multiple
>>> table compactions initiated at the same time. If those are small they'll
>>> finish way earlier resulting in only one core using 100% as compaction is
>>> generally CPU bound (unless your disks can't keep up).
>>> I believe it's better to be CPU(core) bound on one core(or at least not
>>> all) for compaction than disk IO bound as this would result in writes and
>>> reads, ... having performance impact.
>>> Compaction is a maintenance task so it shouldn't be eating all your
>>> resources.
>>>
>>>
>>> <https://www.avast.com/sig-email?utm_medium=email_source=link_campaign=sig-email_content=webmail>
>>>  This
>>> email has been sent from a virus-free computer protected by Avast.
>>> www.avast.com
>>> <https://www.avast.com/sig-email?utm_medium=email_source=link_campaign=sig-email_content=webmail>
>>> <#-2069969251_1162782367_-1582318301_DDB4FAA8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
>>>
>>> 2016-01-16 0:18 GMT+01:00 Kai Wang <dep...@gmail.com>:
>>>
>>>> Jeff & Sebastian,
>>>>
>>>> Thanks for the reply. There are 12 cores but in my case C* only uses
>>>> one core most of the time. *nodetool compactionstats* shows there's
>>>> only one compactor running. I can see C* process only uses one core. So I
>>>> guess I should've asked the question more clearly:
>>>>
>>>> 1. Is ~25 M/s a reasonable compaction throughput for one core?
>>>> 2. Is there any configuration that affects single core compaction
>>>> throughput?
>>>> 3. Is concurrent_compactors the only option to parallelize compaction?
>>>> If so, I guess it's the compaction strategy itself that decides when to
>>>> parallelize and when to block on one core. Then there's not much we can do
>>>> here.
>>>>
>>>> Thanks.
>>>>
>>>> On Fri, Jan 15, 2016 at 5:23 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com
>>>> > wrote:
>>>>
>>>>> With SSDs, the typical recommendation is up to 0.8-1 compactor per
>>>>> core (depending on other load).  How many CPU cores do you have?
>>>>>
>>>>>
>>>>> From: Kai Wang
>>>>> Reply-To: "user@cassandra.apache.org"
>>>>> Date: Friday, January 15, 2016 at 12:53 PM
>>>>> To: "user@cassandra.apache.org"
>>>>> Subject: compaction throughput
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am trying to figure out the bottleneck of compaction on my node. The
>>>>> node is CentOS 7 and has SSDs installed. The table is configured to use
>>>>> LCS. Here is my compa

Re: endless full gc on one node

2016-01-17 Thread Kai Wang
DuyHai,

In this case I didn't use batches, just bound a single PreparedStatement and
executed it. Nor did I see any warning/error about a batch being too large in the
log.

Thanks.

On Sat, Jan 16, 2016 at 6:27 PM, DuyHai Doan <doanduy...@gmail.com> wrote:

> "As soon as inserting started, one node started non-stop full GC. The
> other two nodes were totally fine"
>
> Just a guess, how did you insert data? Did you use Batch statements?
>
> On Sat, Jan 16, 2016 at 10:12 PM, Kai Wang <dep...@gmail.com> wrote:
>
>> Hi,
>>
>> Recently I saw some strange behavior on one of the nodes of a 3-node
>> cluster. A while ago I created a table and put some data (about 150M) in it
>> for testing. A few days ago I started to import full data into that table
>> using normal cql INSERT statements. As soon as inserting started, one node
>> started non-stop full GC. The other two nodes were totally fine. I stopped
>> the inserting process, restarted C* on all the nodes. All nodes are fine.
>> But once I started inserting again, full GC kicked in on that node within a
>> minute. The insertion speed is moderate. Again, the other two nodes were
>> fine. I tried this process a couple of times. Every time the same node
>> jumped into full GC. I even rebooted all the boxes. I checked system.log
>> but found no errors or warnings before full GC started.
>>
>> Finally I deleted and recreated the table. All of sudden the problem went
>> away. The only thing I can think of is that table was created using STCS.
>> After I inserted 150M data into it, I switched it to LCS. Then I ran
>> incremental repair a couple of times. I saw validation and normal
>> compaction on that table as expected. When I recreated the table, I created
>> it with LCS.
>>
>> I don't have the problem any more but just want to share the experience.
>> Maybe someone has a theory on this? BTW I am running C* 2.2.4 with CentOS
>> 7 and Java 8. All boxes have the identical configurations.
>>
>> Thanks.
>>
>
>


Re: In UJ status for over a week trying to rejoin cluster in Cassandra 3.0.1

2016-01-17 Thread Kai Wang
Carlos,

so you essentially replaced the .33 node. Did you follow this
https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_replace_node_t.html?
The link is for 2.x; not sure about 3.x. What if you change the new node to
.34?
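
The key piece of that procedure is starting the fresh node with the replace
flag instead of doing a plain bootstrap; roughly (flag name per the 2.x docs
linked above, address taken from your nodetool output; double-check the 3.x
equivalent):

  # on the NEW node, add to cassandra-env.sh before starting Cassandra
  JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=192.168.1.33"
  # then watch the streaming with
  nodetool netstats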



On Mon, Jan 11, 2016 at 12:57 AM, Carlos A  wrote:

> Hello all,
>
> I have a small dev environment with 4 machines. I removed one of them
> (.33) from the cluster because I wanted to upgrade its HD to an SSD.
> I then reinstalled it and tried to rejoin. It has been in UJ status for a week
> now with no changes.
>
> I have tried nodetool repair etc. but nothing.
>
> nodetool status output
>
> Datacenter: DC1
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address   Load   Tokens   OwnsHost ID
>   Rack
> UN  192.168.1.30  16.13 MB   256  ?
> 0e524b1c-b254-45d0-98ee-63b8f34a8531  RAC1
> UN  192.168.1.31  20.12 MB   256  ?
> 1f8000f5-026c-42c7-8189-cf19fbede566  RAC1
> UN  192.168.1.32  17.73 MB   256  ?
> 7b06f9e9-7c41-4364-ab18-f6976fd359e4  RAC1
> UJ  192.168.1.33  877.6 KB   256  ?
> 7a1507b5-198e-4a3a-a9fd-7af9e588fde2  RAC1
>
> Note: Non-system keyspaces don't have the same replication settings,
> effective ownership information is meaningless
>
> Any tips on fixing this?
>
> Thanks,
>
> C.
>


Re: New node has high network and disk usage.

2016-01-17 Thread Kai Wang
James,

Thanks for sharing. Anyway, good to know there's one more thing to add to
the checklist.

On Sun, Jan 17, 2016 at 12:23 PM, James Griffin <
james.grif...@idioplatform.com> wrote:

> Hi all,
>
> Just to let you know, we finally figured this out on Friday. It turns out
> the new nodes had an older version of the kernel installed. Upgrading the
> kernel solved our issues. For reference, the "bad" kernel was
> 3.2.0-75-virtual, upgrading to 3.2.0-86-virtual resolved the issue. We
> still don't fully understand why this kernel bug didn't affect *all *our
> nodes (in the end we had three nodes with that kernel, only two of them
> exhibited this issue), but there we go.
>
> Thanks everyone for your help
>
> Cheers,
> Griff
>
> On 14 January 2016 at 15:14, James Griffin <james.grif...@idioplatform.com
> > wrote:
>
>> Hi Kai,
>>
>> Well observed - running `nodetool status` without specifying keyspace
>> does report ~33% on each node. We have two keyspaces on this cluster - if I
>> specify either of them the ownership reported by each node is 100%, so I
>> believe the repair completed successfully.
>>
>> Best wishes,
>>
>> Griff
>>
>> [image: idioplatform] <http://idioplatform.com/>James "Griff" Griffin
>> CTO
>> Switchboard: +44 (0)20 3540 1920 | Direct: +44 (0)7763 139 206 |
>> Twitter: @imaginaryroots <http://twitter.com/imaginaryroots> | Skype:
>> j.s.griffin
>> idio helps major brands and publishers to build closer relationships with
>> their customers and prospects by learning from their content consumption
>> and acting on that insight. We call it Content Intelligence, and it
>> integrates with your existing marketing technology to provide detailed
>> customer interest profiles in real-time across all channels, and to
>> personalize content into every channel for every customer. See
>> http://idioplatform.com
>> <https://t.yesware.com/tl/0e637e4938676b6f3897def79d0810a71e59612e/10068de2036c2daf922e0a879bb2fe92/9dae8be0f7693bf2b28a88cc4b38c554?ytl=http%3A%2F%2Fidioplatform.com%2F>
>>  for
>> more information.
>>
>> On 14 January 2016 at 15:08, Kai Wang <dep...@gmail.com> wrote:
>>
>>> James,
>>>
>>> I may miss something. You mentioned your cluster had RF=3. Then why
>>> does "nodetool status" show each node owns 1/3 of the data especially after
>>> a full repair?
>>>
>>> On Thu, Jan 14, 2016 at 9:56 AM, James Griffin <
>>> james.grif...@idioplatform.com> wrote:
>>>
>>>> Hi Kai,
>>>>
>>>> Below - nothing going on that I can see
>>>>
>>>> $ nodetool netstats
>>>> Mode: NORMAL
>>>> Not sending any streams.
>>>> Read Repair Statistics:
>>>> Attempted: 0
>>>> Mismatch (Blocking): 0
>>>> Mismatch (Background): 0
>>>> Pool NameActive   Pending  Completed
>>>> Commandsn/a 0   6326
>>>> Responses   n/a 0 219356
>>>>
>>>>
>>>>
>>>> Best wishes,
>>>>
>>>> Griff
>>>>
>>>> James "Griff" Griffin
>>>> CTO
>>>> Switchboard: +44 (0)20 3540 1920 | Direct: +44 (0)7763 139 206 |
>>>> Twitter: @imaginaryroots <http://twitter.com/imaginaryroots> | Skype:
>>>> j.s.griffin
>>>> idio helps major brands and publishers to build closer relationships
>>>> with their customers and prospects by learning from their content
>>>> consumption and acting on that insight. We call it Content Intelligence,
>>>> and it integrates with your existing marketing technology to provide
>>>> detailed customer interest profiles in real-time across all channels, and
>>>> to personalize content into every channel for every customer. See
>>>> http://idioplatform.com
>>>>  for
>>>> more information.
>>>>
>>>> On 14 January 2016 at 14:22, Kai Wang <dep...@gmail.com> wrote:
>>>>
>>>>> James,
>>>>>
>>>>> Can you post the result of "nodetool netstats" on the bad node?
>>>>>

endless full gc on one node

2016-01-16 Thread Kai Wang
Hi,

Recently I saw some strange behavior on one of the nodes of a 3-node
cluster. A while ago I created a table and put some data (about 150M) in it
for testing. A few days ago I started to import full data into that table
using normal cql INSERT statements. As soon as inserting started, one node
started non-stop full GC. The other two nodes were totally fine. I stopped
the inserting process, restarted C* on all the nodes. All nodes are fine.
But once I started inserting again, full GC kicked in on that node within a
minute. The insertion speed is moderate. Again, the other two nodes were
fine. I tried this process a couple of times. Every time the same node
jumped into full GC. I even rebooted all the boxes. I checked system.log
but found no errors or warnings before full GC started.

Finally I deleted and recreated the table. All of a sudden the problem went
away. The only thing I can think of is that table was created using STCS.
After I inserted 150M data into it, I switched it to LCS. Then I ran
incremental repair a couple of times. I saw validation and normal
compaction on that table as expected. When I recreated the table, I created
it with LCS.
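
For reference, the STCS-to-LCS switch described above amounts to something
like the following (keyspace/table names are placeholders, and 160 MB is just
the LCS default sstable size):

ALTER TABLE my_ks.my_table
  WITH compaction = {'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 160};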

I don't have the problem any more but just want to share the experience.
Maybe someone has a theory on this? BTW I am running C* 2.2.4 with CentOS
7 and Java 8. All boxes have the identical configurations.

Thanks.


compaction throughput

2016-01-15 Thread Kai Wang
Hi,

I am trying to figure out the bottleneck of compaction on my node. The node
is CentOS 7 and has SSDs installed. The table is configured to use LCS.
Here is my compaction related configs in cassandra.yaml:

compaction_throughput_mb_per_sec: 160
concurrent_compactors: 4

I insert about 10G of data and start observing compaction.

*nodetool compaction* shows most of time there is one compaction. Sometimes
there are 3-4 (I suppose this is controlled by concurrent_compactors).
During the compaction, I see one CPU core is 100%. At that point, disk IO
is about 20-25 M/s write which is much lower than the disk is capable of.
Even when there are 4 compactions running, I see CPU go to +400% but disk
IO is still at 20-25M/s write. I use *nodetool setcompactionthroughput 0*
to disable the compaction throttle but don't see any difference.
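
For anyone repeating this test, the throttle can be changed at runtime, while
concurrent_compactors is set in cassandra.yaml (a restart is the simple way to
change it here). A rough sketch of the commands involved:

nodetool setcompactionthroughput 0   # 0 = unthrottled, takes effect immediately
nodetool compactionstats             # watch active and pending compactions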

Does this mean compaction is CPU bound? If so, 20M/s seems kinda low. Is there
any way to improve the throughput?

Thanks.


Re: compaction throughput

2016-01-15 Thread Kai Wang
I forget to mention I am using C* 2.2.4
On Jan 15, 2016 3:53 PM, "Kai Wang" <dep...@gmail.com> wrote:

> Hi,
>
> I am trying to figure out the bottleneck of compaction on my node. The
> node is CentOS 7 and has SSDs installed. The table is configured to use
> LCS. Here is my compaction related configs in cassandra.yaml:
>
> compaction_throughput_mb_per_sec: 160
> concurrent_compactors: 4
>
> I insert about 10G of data and start observing compaction.
>
> *nodetool compaction* shows most of time there is one compaction.
> Sometimes there are 3-4 (I suppose this is controlled by
> concurrent_compactors). During the compaction, I see one CPU core is 100%.
> At that point, disk IO is about 20-25 M/s write which is much lower than
> the disk is capable of. Even when there are 4 compactions running, I see
> CPU go to +400% but disk IO is still at 20-25M/s write. I use *nodetool
> setcompactionthroughput 0* to disable the compaction throttle but don't
> see any difference.
>
> Does this mean compaction is CPU bound? If so 20M/s is kinda low. Is there
> anyway to improve the throughput?
>
> Thanks.
>


Re: compaction throughput

2016-01-15 Thread Kai Wang
Jeff & Sebastian,

Thanks for the reply. There are 12 cores but in my case C* only uses one
core most of the time. *nodetool compactionstats* shows there's only one
compactor running. I can see the C* process only uses one core. So I guess I
should've asked the question more clearly:

1. Is ~25 M/s a reasonable compaction throughput for one core?
2. Is there any configuration that affects single core compaction
throughput?
3. Is concurrent_compactors the only option to parallelize compaction? If
so, I guess it's the compaction strategy itself that decides when to
parallelize and when to block on one core. Then there's not much we can do
here.

Thanks.

On Fri, Jan 15, 2016 at 5:23 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
wrote:

> With SSDs, the typical recommendation is up to 0.8-1 compactor per core
> (depending on other load).  How many CPU cores do you have?
>
>
> From: Kai Wang
> Reply-To: "user@cassandra.apache.org"
> Date: Friday, January 15, 2016 at 12:53 PM
> To: "user@cassandra.apache.org"
> Subject: compaction throughput
>
> Hi,
>
> I am trying to figure out the bottleneck of compaction on my node. The
> node is CentOS 7 and has SSDs installed. The table is configured to use
> LCS. Here is my compaction related configs in cassandra.yaml:
>
> compaction_throughput_mb_per_sec: 160
> concurrent_compactors: 4
>
> I insert about 10G of data and start observing compaction.
>
> *nodetool compaction* shows most of time there is one compaction.
> Sometimes there are 3-4 (I suppose this is controlled by
> concurrent_compactors). During the compaction, I see one CPU core is 100%.
> At that point, disk IO is about 20-25 M/s write which is much lower than
> the disk is capable of. Even when there are 4 compactions running, I see
> CPU go to +400% but disk IO is still at 20-25M/s write. I use *nodetool
> setcompactionthroughput 0* to disable the compaction throttle but don't
> see any difference.
>
> Does this mean compaction is CPU bound? If so 20M/s is kinda low. Is there
> anyway to improve the throughput?
>
> Thanks.
>


Re: compaction throughput

2016-01-15 Thread Kai Wang
Sebastian,

Because I had the impression that LCS is IO intensive and is recommended
only on SSDs, I was curious to see how far it could stress those SSDs. But it
turns out the most expensive part of LCS is not IO bound but CPU bound, or
more precisely single-core speed bound. This is a little surprising.

Of course LCS is still superior in other aspects.
On Jan 15, 2016 6:34 PM, "Sebastian Estevez" <sebastian.este...@datastax.com>
wrote:

> Correct.
>
> Why are you concerned with the raw throughput, are you accumulating
> pending compactions? Are you seeing high sstables per read statistics?
>
> all the best,
>
> Sebastián
> On Jan 15, 2016 6:18 PM, "Kai Wang" <dep...@gmail.com> wrote:
>
>> Jeff & Sebastian,
>>
>> Thanks for the reply. There are 12 cores but in my case C* only uses one
>> core most of the time. *nodetool compactionstats* shows there's only one
>> compactor running. I can see C* process only uses one core. So I guess I
>> should've asked the question more clearly:
>>
>> 1. Is ~25 M/s a reasonable compaction throughput for one core?
>> 2. Is there any configuration that affects single core compaction
>> throughput?
>> 3. Is concurrent_compactors the only option to parallelize compaction? If
>> so, I guess it's the compaction strategy itself that decides when to
>> parallelize and when to block on one core. Then there's not much we can do
>> here.
>>
>> Thanks.
>>
>> On Fri, Jan 15, 2016 at 5:23 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
>> wrote:
>>
>>> With SSDs, the typical recommendation is up to 0.8-1 compactor per core
>>> (depending on other load).  How many CPU cores do you have?
>>>
>>>
>>> From: Kai Wang
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Friday, January 15, 2016 at 12:53 PM
>>> To: "user@cassandra.apache.org"
>>> Subject: compaction throughput
>>>
>>> Hi,
>>>
>>> I am trying to figure out the bottleneck of compaction on my node. The
>>> node is CentOS 7 and has SSDs installed. The table is configured to use
>>> LCS. Here is my compaction related configs in cassandra.yaml:
>>>
>>> compaction_throughput_mb_per_sec: 160
>>> concurrent_compactors: 4
>>>
>>> I insert about 10G of data and start observing compaction.
>>>
>>> *nodetool compaction* shows most of time there is one compaction.
>>> Sometimes there are 3-4 (I suppose this is controlled by
>>> concurrent_compactors). During the compaction, I see one CPU core is 100%.
>>> At that point, disk IO is about 20-25 M/s write which is much lower than
>>> the disk is capable of. Even when there are 4 compactions running, I see
>>> CPU go to +400% but disk IO is still at 20-25M/s write. I use *nodetool
>>> setcompactionthroughput 0* to disable the compaction throttle but don't
>>> see any difference.
>>>
>>> Does this mean compaction is CPU bound? If so 20M/s is kinda low. Is
>>> there anyway to improve the throughput?
>>>
>>> Thanks.
>>>
>>
>>


Re: New node has high network and disk usage.

2016-01-14 Thread Kai Wang
James,

Can you post the result of "nodetool netstats" on the bad node?

On Thu, Jan 14, 2016 at 9:09 AM, James Griffin <
james.grif...@idioplatform.com> wrote:

> A summary of what we've done this morning:
>
>- Noted that there are no GCInspector lines in system.log on bad node
>(there are GCInspector logs on other healthy nodes)
>- Turned on GC logging, noted that we had logs which stated out total
>time for which application threads were stopped was high - ~10s.
>- Not seeing failures or any kind (promotion or concurrent mark)
>- Attached Visual VM: noted that heap usage was very low (~5% usage
>and stable) and it didn't display hallmarks GC of activity. PermGen also
>very stable
>- Downloaded GC logs and examined in GC Viewer. Noted that:
>- We had lots of pauses (again around 10s), but no full GC.
>   - From a 2,300s sample, just over 2,000s were spent with threads
>   paused
>   - Spotted many small GCs in the new space - realised that Xmn value
>   was very low (200M against a heap size of 3750M). Increased Xmn to 937M 
> -
>   no change in server behaviour (high load, high reads/s on disk, high CPU
>   wait)
>
> Current output of jstat:
>
>   S0     S1     E      O      P      YGC   YGCT    FGC  FGCT   GCT
> 2 0.00   45.20  12.82  26.84  76.21  2333  63.684  2    0.039  63.724
> 3 63.58  0.00   33.68  8.04   75.19  14    1.812   2    0.103  1.915
>
> Correct me if I'm wrong, but it seems 3 is lot more healthy GC wise than 2
> (which has normal load statistics).
>
> Anywhere else you can recommend we look?
>
> Griff
>
> On 14 January 2016 at 01:25, Anuj Wadehra  wrote:
>
>> Ok. I saw dropped mutations on your cluster and full gc is a common cause
>> for that.
>> Can you just search the word GCInspector in system.log and share the
>> frequency of minor and full gc. Moreover, are you printing promotion
>> failures in gc logs? Why is full gc getting triggered? Promotion failures
>> or concurrent mode failures?
>>
>> If you are on CMS, you need to fine tune your heap options to address
>> full gc.
>>
>>
>>
>> Thanks
>> Anuj
>>
>> Sent from Yahoo Mail on Android
>> 
>>
>> On Thu, 14 Jan, 2016 at 12:57 am, James Griffin
>>  wrote:
>> I think I was incorrect in assuming GC wasn't an issue due to the lack of
>> logs. Comparing jstat output on nodes 2 & 3 show some fairly marked
>> differences, though
>> comparing the startup flags on the two machines show the GC config is
>> identical.:
>>
>> $ jstat -gcutil
>>    S0    S1    E      O      P      YGC     YGCT       FGC  FGCT    GCT
>> 2  5.08  0.00  55.72  18.24  59.90  25986   619.827    28   1.597   621.424
>> 3  0.00  0.00  22.79  17.87  59.99  422600  11225.979  668  57.383  11283.361
>>
>> Here's typical output for iostat on nodes 2 & 3 as well:
>>
>> $ iostat -dmx md0
>>
>>   Device: rrqm/s  wrqm/s      r/s   w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
>> 2 md0       0.00    0.00   339.00  0.00   9.77   0.00     59.00      0.00   0.00     0.00     0.00   0.00   0.00
>> 3 md0       0.00    0.00  2069.00  1.00  85.85   0.00     84.94      0.00   0.00     0.00     0.00   0.00   0.00
>>
>> Griff
>>
>> On 13 January 2016 at 18:36, Anuj Wadehra  wrote:
>>
>>> Node 2 has slightly higher data but that should be ok. Not sure how read
>>> ops are so high when no IO intensive activity such as repair and compaction
>>> is running on node 3. Maybe you can try investigating logs to see what's
>>> happening.
>>>
>>> Others on the mailing list could also share their views on the situation.
>>>
>>> Thanks
>>> Anuj
>>>
>>>
>>>
>>> Sent from Yahoo Mail on Android
>>> 
>>>
>>> On Wed, 13 Jan, 2016 at 11:46 pm, James Griffin
>>>  wrote:
>>> Hi Anuj,
>>>
>>> Below is the output of nodetool status. The nodes were replaced
>>> following the instructions in Datastax documentation for replacing running
>>> nodes since the nodes were running fine, it was that the servers had been
>>> incorrectly initialised and they thus had less disk space. The status below
>>> shows 2 has significantly higher load, however as I say 2 is operating
>>> normally and is running compactions, so I guess that's not an issue?
>>>
>>> Datacenter: datacenter1
>>> ===
>>> Status=Up/Down
>>> |/ State=Normal/Leaving/Joining/Moving
>>> --  Address Load   Tokens  Owns   Host ID
>>> Rack
>>> UN  1   253.59 GB  256 31.7%
>>>  6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
>>> UN  2   302.23 GB  256 35.3%
>>>  faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1
>>> UN  3   265.02 GB  256 33.1%
>>>  74b15507-db5c-45df-81db-6e5bcb7438a3  rack1
>>>
>>> Griff

Re: New node has high network and disk usage.

2016-01-14 Thread Kai Wang
James,

I may be missing something. You mentioned your cluster had RF=3. Then why does
"nodetool status" show each node owning 1/3 of the data, especially after a
full repair?

On Thu, Jan 14, 2016 at 9:56 AM, James Griffin <
james.grif...@idioplatform.com> wrote:

> Hi Kai,
>
> Below - nothing going on that I can see
>
> $ nodetool netstats
> Mode: NORMAL
> Not sending any streams.
> Read Repair Statistics:
> Attempted: 0
> Mismatch (Blocking): 0
> Mismatch (Background): 0
> Pool NameActive   Pending  Completed
> Commandsn/a 0   6326
> Responses   n/a 0 219356
>
>
>
> Best wishes,
>
> Griff
>
> James "Griff" Griffin
> CTO
> Switchboard: +44 (0)20 3540 1920 | Direct: +44 (0)7763 139 206 | Twitter:
> @imaginaryroots <http://twitter.com/imaginaryroots> | Skype: j.s.griffin
> idio helps major brands and publishers to build closer relationships with
> their customers and prospects by learning from their content consumption
> and acting on that insight. We call it Content Intelligence, and it
> integrates with your existing marketing technology to provide detailed
> customer interest profiles in real-time across all channels, and to
> personalize content into every channel for every customer. See
> http://idioplatform.com
>  for
> more information.
>
> On 14 January 2016 at 14:22, Kai Wang <dep...@gmail.com> wrote:
>
>> James,
>>
>> Can you post the result of "nodetool netstats" on the bad node?
>>
>> On Thu, Jan 14, 2016 at 9:09 AM, James Griffin <
>> james.grif...@idioplatform.com> wrote:
>>
>>> A summary of what we've done this morning:
>>>
>>>- Noted that there are no GCInspector lines in system.log on bad
>>>node (there are GCInspector logs on other healthy nodes)
>>>- Turned on GC logging, noted that we had logs which stated out
>>>total time for which application threads were stopped was high - ~10s.
>>>- Not seeing failures or any kind (promotion or concurrent mark)
>>>- Attached Visual VM: noted that heap usage was very low (~5% usage
>>>and stable) and it didn't display hallmarks GC of activity. PermGen also
>>>very stable
>>>- Downloaded GC logs and examined in GC Viewer. Noted that:
>>>- We had lots of pauses (again around 10s), but no full GC.
>>>   - From a 2,300s sample, just over 2,000s were spent with threads
>>>   paused
>>>   - Spotted many small GCs in the new space - realised that Xmn
>>>   value was very low (200M against a heap size of 3750M). Increased Xmn 
>>> to
>>>   937M - no change in server behaviour (high load, high reads/s on 
>>> disk, high
>>>   CPU wait)
>>>
>>> Current output of jstat:
>>>
>>>   S0     S1     E      O      P      YGC   YGCT    FGC  FGCT   GCT
>>> 2 0.00   45.20  12.82  26.84  76.21  2333  63.684  2    0.039  63.724
>>> 3 63.58  0.00   33.68  8.04   75.19  14    1.812   2    0.103  1.915
>>>
>>> Correct me if I'm wrong, but it seems 3 is lot more healthy GC wise than
>>> 2 (which has normal load statistics).
>>>
>>> Anywhere else you can recommend we look?
>>>
>>> Griff
>>>
>>> On 14 January 2016 at 01:25, Anuj Wadehra <anujw_2...@yahoo.co.in>
>>> wrote:
>>>
>>>> Ok. I saw dropped mutations on your cluster and full gc is a common
>>>> cause for that.
>>>> Can you just search the word GCInspector in system.log and share the
>>>> frequency of minor and full gc. Moreover, are you printing promotion
>>>> failures in gc logs? Why is full gc getting triggered? Promotion failures
>>>> or concurrent mode failures?
>>>>
>>>> If you are on CMS, you need to fine tune your heap options to address
>>>> full gc.
>>>>
>>>>
>>>>
>>>> Thanks
>>>> Anuj
>>>>
>>>> Sent from Yahoo Mail on Android
>>>> <https://overview.mail.yahoo.com/mobile/?.src=Android>
>>>>
>>>> On Thu, 14 Jan, 2016 at 12:57 am, James Griffin
>>>> <james.grif...@idioplatform.com> wrote:

[Off heap memory used (total)] in cfstats

2016-01-08 Thread Kai Wang
Hi,

When I switched a big table from STCS to LCS, I noticed high off-heap memory
usage in nodetool cfstats. "*Off heap memory used (total)*" shows +10G of
usage.

Eventually my nodes died because of OOM. How do I throttle off-heap usage?
The only thing I see in cassandra.yaml is *memtable_offheap_space_in_mb*. Is
that the only knob?

Thanks.


Re: [Off heap memory used (total)] in cfstats

2016-01-08 Thread Kai Wang
Update:

I checked the source code:
https://github.com/apache/cassandra/blob/4a0d1caa262af3b6f2b6d329e45766b4df845a88/src/java/org/apache/cassandra/tools/nodetool/Info.java#L154

Off-heap memory usage comes from four parts. I rechecked cfstats and found
that the Bloom filter takes most of the off-heap memory. I increased
bloom_filter_fp_chance from the default 0.01 to 0.1 and restarted the nodes.
The memory usage has come down now.
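
For reference, the change was along these lines (keyspace/table names are
placeholders); the per-table Bloom filter footprint can then be watched in
cfstats under "Bloom filter off heap memory used":

ALTER TABLE my_ks.my_table WITH bloom_filter_fp_chance = 0.1;
nodetool cfstats my_ks.my_table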

My table is kinda special in that it has lots of partitions but each
partition is very small (maximum <1M). I guess for such a table, Bloom
filter can be very large compared to the actual data size.


On Fri, Jan 8, 2016 at 6:39 PM, Kai Wang <dep...@gmail.com> wrote:

> Hi,
>
> When I switch a big table from STCS to LCS, I notice high off heap memory
> usage using nodetool cfstats. "*Off heap memory used (total)*" shows +10G
> usage.
>
> Eventually my nodes died because of OOM. How do I throttle off heap usage?
> The only thing I see in cassandra.yaml is *memtable_offheap_space_in_mb. *Is
> that the only knob?
>
> Thanks.
>


confusion about migrating to incremental repair

2016-01-06 Thread Kai Wang
Hi,

I am running a cluster with 2.2.4. I have some tables on LCS and plan to use
incremental repair. I read the post at
http://www.datastax.com/dev/blog/anticompaction-in-cassandra-2-1 and am a
little confused.

especially:
"This means that *once you do an incremental repair you will have to
continue doing them* (there are ways to clear out the repair-state to
revert this, more about that later). Otherwise you will not run leveled
compaction, just size tiered."

Does this mean once I start doing inc repair, I have to always run inc
repair to keep using LCS? What about full repair once a week or month? Is
that still recommended? Will it stop LCS?
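
For what it's worth, my understanding of the 2.2 defaults (please correct me
if wrong) is that a plain repair is already incremental and a full repair has
to be requested explicitly, e.g.:

nodetool repair my_ks          # incremental by default on 2.2+
nodetool repair -full my_ks    # explicit full repair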


Re: unable to create a user on version 2.2.4

2016-01-02 Thread Kai Wang
http://www.datastax.com/dev/blog/role-based-access-control-in-cassandra
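
With roles (the 2.2+ model described in that post), the equivalent of the
statement in the original question is roughly:

CREATE ROLE alice WITH PASSWORD = 'bob' AND SUPERUSER = true AND LOGIN = true;
LIST ROLES;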
On Jan 2, 2016 4:13 AM, "david"  wrote:

> Sam Tunnicliffe  beobal.com> writes:
>
> >
> >
> > If you've upgraded to 2.2.4, the full instructions necessary for
>  auth-enabled clusters were
> unfortunately missing from NEWS.txt. See CASSANDRA-10904 for details.
> > On 2 Jan 2016 10:05, "david"  gmail.com>
> wrote:we are running cassandra
> version 2.2.4 on Debian  jessie (latest stable) .
> > when i attempt to create user, it doesn't work when i type the following
> > 'create user alice with password 'bob' superuser;'
> > cqlsh returns fine without any error
> > however 'list users' does not show the newly created user
> > what could be the issue? pls. advise. thanks in advance.
>
> thank you, Sam!
> if there are no users table, how can we support authentication?
> pls. clarify. thanks in advance.
>
>


Is CQLSSTableWriter tied to C* version?

2015-12-22 Thread Kai Wang
Hi,

Can sstables created by CQLSSTableWriter in cassandra-all.jar 2.1.12 be
loaded into C* 2.2.4? Or they have to be on the same version?


Re: Is CQLSSTableWriter tied to C* version?

2015-12-22 Thread Kai Wang
Jon,

Thanks. So I will just update to use cassandra-all.jar 2.2.4 to be sure.
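
For reference, loading the generated directory afterwards looks roughly like
this regardless of which writer version produced it (host and path are
placeholders):

sstableloader -d 192.168.1.30 /path/to/output/my_ks/my_table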

On Wed, Dec 23, 2015 at 12:10 AM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> The streaming format is directly tied to the sstable format. So, in
> general, if the format changes between versions, you can't stream. I don't
> think the format changed between these 2 versions, but I'm typing this on
> my phone and can't verify.
>
> On Tue, Dec 22, 2015 at 6:36 PM Kai Wang <dep...@gmail.com> wrote:
>
>> Hi,
>>
>> Can sstables created by CQLSSTableWriter in cassandra-all.jar 2.1.12 be
>> loaded into C* 2.2.4? Or they have to be on the same version?
>>
>


Re: [Marketing Mail] Re: [Marketing Mail] can't make any permissions change in 2.2.4

2015-12-19 Thread Kai Wang
Some update. I went through this blog:

https://www.instaclustr.com/5-things-you-need-to-know-about-cassandra-2-2/

and deleted these three tables (with fingers crossed):

system_auth.credentials
system_auth.users
system_auth.permissions

Now I seem to be able to use RBAC to modify permissions.
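
For anyone hitting the same thing, the statements were simply (run in cqlsh as
a superuser, after confirming the new role tables are populated):

DROP TABLE system_auth.credentials;
DROP TABLE system_auth.users;
DROP TABLE system_auth.permissions;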

On Fri, Dec 18, 2015 at 9:23 AM, Kai Wang <dep...@gmail.com> wrote:

> Sylvain,
>
> Thank you very much.
>
> On Fri, Dec 18, 2015 at 9:20 AM, Sylvain Lebresne <sylv...@datastax.com>
> wrote:
>
>> On Fri, Dec 18, 2015 at 3:04 PM, Kai Wang <dep...@gmail.com> wrote:
>>
>>> Reynald,
>>>
>>> Thanks for link. That explains it.
>>>
>>> Sylvain,
>>>
>>> What exactly are the "legacy tables" I am supposed to drop? Before I
>>> drop them, is there any way I can confirm the old schema has been converted
>>> to the new one successfully?
>>>
>>
>> I didn't work on those changes so I'm actually not sure of the exact
>> answer. But I see you commented on the ticket so we'll make sure to include
>> that information in the NEWS file (and maybe to get the blog post edited).
>>
>>
>>>
>>> Thanks.
>>>
>>>
>>> On Fri, Dec 18, 2015 at 5:05 AM, Reynald Bourtembourg <
>>> reynald.bourtembo...@esrf.fr> wrote:
>>>
>>>> Done:
>>>> https://issues.apache.org/jira/browse/CASSANDRA-10904
>>>>
>>>>
>>>>
>>>> On 18/12/2015 10:51, Sylvain Lebresne wrote:
>>>>
>>>> On Fri, Dec 18, 2015 at 8:55 AM, Reynald Bourtembourg <
>>>> <reynald.bourtembo...@esrf.fr>reynald.bourtembo...@esrf.fr> wrote:
>>>>
>>>>> This does not seem to be explained in the Cassandra 2.2 Upgrading
>>>>> section of the NEWS.txt file:
>>>>>
>>>>>
>>>>> https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/cassandra-2.2.4
>>>>>
>>>>>
>>>> This is indeed an oversight. Would you mind opening a JIRA ticket so we
>>>> don't forget to add it now?
>>>>
>>>> --
>>>> Sylvain
>>>>
>>>>
>>>>
>>>
>>
>


Re: can't make any permissions change in 2.2.4

2015-12-19 Thread Kai Wang
Thank you, Sam.
On Dec 19, 2015 12:35 PM, "Sam Tunnicliffe" <s...@beobal.com> wrote:

> Sorry about the confusing omission of the upgrade instructions from
> NEWS.txt, the oversight there was mine in the course of CASSANDRA-7653.
> Dropping those tables is absolutely what you need to do in order to trigger
> C* to switch over to using the new role-based tables following an upgrade
> to 2.2+
>
> I've updated the DataStax blog post Reynald referred to earlier in the
> thread and I'll commit an update to NEWS.txt with these instructions on
> Monday.
>
> Thanks,
> Sam
>
> On Sat, Dec 19, 2015 at 5:00 PM, Kai Wang <dep...@gmail.com> wrote:
>
>> Some update. I went through this blog:
>>
>> https://www.instaclustr.com/5-things-you-need-to-know-about-cassandra-2-2/
>>
>> and deleted these three tables (with fingers crossed):
>>
>> system_auth.credentials
>> system_auth.users
>> system_auth.permissions
>>
>> Now I seem to be able to use RBAC to modify permissions.
>>
>> On Fri, Dec 18, 2015 at 9:23 AM, Kai Wang <dep...@gmail.com> wrote:
>>
>>> Sylvain,
>>>
>>> Thank you very much.
>>>
>>> On Fri, Dec 18, 2015 at 9:20 AM, Sylvain Lebresne <sylv...@datastax.com>
>>> wrote:
>>>
>>>> On Fri, Dec 18, 2015 at 3:04 PM, Kai Wang <dep...@gmail.com> wrote:
>>>>
>>>>> Reynald,
>>>>>
>>>>> Thanks for link. That explains it.
>>>>>
>>>>> Sylvain,
>>>>>
>>>>> What exactly are the "legacy tables" I am supposed to drop? Before I
>>>>> drop them, is there any way I can confirm the old schema has been 
>>>>> converted
>>>>> to the new one successfully?
>>>>>
>>>>
>>>> I didn't work on those changes so I'm actually not sure of the exact
>>>> answer. But I see you commented on the ticket so we'll make sure to include
>>>> that information in the NEWS file (and maybe to get the blog post edited).
>>>>
>>>>
>>>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>>> On Fri, Dec 18, 2015 at 5:05 AM, Reynald Bourtembourg <
>>>>> reynald.bourtembo...@esrf.fr> wrote:
>>>>>
>>>>>> Done:
>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-10904
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 18/12/2015 10:51, Sylvain Lebresne wrote:
>>>>>>
>>>>>> On Fri, Dec 18, 2015 at 8:55 AM, Reynald Bourtembourg <
>>>>>> <reynald.bourtembo...@esrf.fr>reynald.bourtembo...@esrf.fr> wrote:
>>>>>>
>>>>>>> This does not seem to be explained in the Cassandra 2.2 Upgrading
>>>>>>> section of the NEWS.txt file:
>>>>>>>
>>>>>>>
>>>>>>> https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/cassandra-2.2.4
>>>>>>>
>>>>>>>
>>>>>> This is indeed an oversight. Would you mind opening a JIRA ticket so
>>>>>> we don't forget to add it now?
>>>>>>
>>>>>> --
>>>>>> Sylvain
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


Re: [Marketing Mail] Re: [Marketing Mail] can't make any permissions change in 2.2.4

2015-12-18 Thread Kai Wang
Reynald,

Thanks for the link. That explains it.

Sylvain,

What exactly are the "legacy tables" I am supposed to drop? Before I drop
them, is there any way I can confirm the old schema has been converted to
the new one successfully?

Thanks.


On Fri, Dec 18, 2015 at 5:05 AM, Reynald Bourtembourg <
reynald.bourtembo...@esrf.fr> wrote:

> Done:
> https://issues.apache.org/jira/browse/CASSANDRA-10904
>
>
>
> On 18/12/2015 10:51, Sylvain Lebresne wrote:
>
> On Fri, Dec 18, 2015 at 8:55 AM, Reynald Bourtembourg <
> reynald.bourtembo...@esrf.fr> wrote:
>
>> This does not seem to be explained in the Cassandra 2.2 Upgrading section
>> of the NEWS.txt file:
>>
>>
>> https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/cassandra-2.2.4
>>
>>
> This is indeed an oversight. Would you mind opening a JIRA ticket so we
> don't forget to add it now?
>
> --
> Sylvain
>
>
>


Re: [Marketing Mail] Re: [Marketing Mail] can't make any permissions change in 2.2.4

2015-12-18 Thread Kai Wang
Sylvain,

Thank you very much.

On Fri, Dec 18, 2015 at 9:20 AM, Sylvain Lebresne <sylv...@datastax.com>
wrote:

> On Fri, Dec 18, 2015 at 3:04 PM, Kai Wang <dep...@gmail.com> wrote:
>
>> Reynald,
>>
>> Thanks for link. That explains it.
>>
>> Sylvain,
>>
>> What exactly are the "legacy tables" I am supposed to drop? Before I drop
>> them, is there any way I can confirm the old schema has been converted to
>> the new one successfully?
>>
>
> I didn't work on those changes so I'm actually not sure of the exact
> answer. But I see you commented on the ticket so we'll make sure to include
> that information in the NEWS file (and maybe to get the blog post edited).
>
>
>>
>> Thanks.
>>
>>
>> On Fri, Dec 18, 2015 at 5:05 AM, Reynald Bourtembourg <
>> reynald.bourtembo...@esrf.fr> wrote:
>>
>>> Done:
>>> https://issues.apache.org/jira/browse/CASSANDRA-10904
>>>
>>>
>>>
>>> On 18/12/2015 10:51, Sylvain Lebresne wrote:
>>>
>>> On Fri, Dec 18, 2015 at 8:55 AM, Reynald Bourtembourg <
>>> <reynald.bourtembo...@esrf.fr>reynald.bourtembo...@esrf.fr> wrote:
>>>
>>>> This does not seem to be explained in the Cassandra 2.2 Upgrading
>>>> section of the NEWS.txt file:
>>>>
>>>>
>>>> https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/cassandra-2.2.4
>>>>
>>>>
>>> This is indeed an oversight. Would you mind opening a JIRA ticket so we
>>> don't forget to add it now?
>>>
>>> --
>>> Sylvain
>>>
>>>
>>>
>>
>


Re: OpsCenter support Cassandra 3.0.x

2015-12-18 Thread Kai Wang
"*Note: Future versions of OpsCenter will support Cassandra versions 2.2
and 3.0.*"

http://docs.datastax.com/en/upgrade/doc/upgrade/opscenter/opscCompatibility.html

On Fri, Dec 18, 2015 at 8:16 AM, Cassandramail 
wrote:

> Hello,
>
> Do you know any plan to support Cassandra 3.0.x in OpsCenter by DataStax?
>
> Thanks
>


can't make any permissions change in 2.2.4

2015-12-17 Thread Kai Wang
I used to be able to add/drop users and modify permissions in 2.1.1. After
upgrading to 2.2.4, I can't modify any of those. "List all permissions"
returns all the permissions I set up before the upgrade. But I can't add
new permissions or new users in cqlsh. "create user" and "grant" didn't
report any errors.


Re: Replicating Data Between Separate Data Centres

2015-12-15 Thread Kai Wang
Philip,

I don't see the benefit of having a multi-DC C* cluster in this case. What
you need is two separate C* clusters, with Kafka used to record/replay writes
to the DR cluster. The DR cluster only receives writes from the Kafka
consumer. You won't need to deal with "Removing everything from Cassandra
that -isn't- in Kafka".

On Mon, Dec 14, 2015 at 7:13 PM, Philip Persad 
wrote:

> I did consider doubling down and replicating both Kafka and Cassandra to
> the secondary DC.  It seemed a bit complicated (term used relatively), and
> I didn't want to think about the unlikely scenario of Cassandra writes
> getting across before the Kafka ones.  Inserting everything in Kafka into
> Cassandra after a failure is easy.  Removing everything from Cassandra that
> -isn't- in Kafka is not a problem I want to take a swing at if I don't have
> to.
>
>
> On Mon, Dec 14, 2015 at 4:02 PM, Jeff Jirsa 
> wrote:
>
>> Emit a message to a new kafka topic once the first write is persisted
>> into cassandra with LOCAL_QUORUM (gives you low latency), then consume off
>> of that topic to get higher-latency-but-causally-correct writes to
>> subsequent (disconnected) DR DC.
>>
>>
>>
>> From: Philip Persad
>> Reply-To: "user@cassandra.apache.org"
>> Date: Monday, December 14, 2015 at 3:37 PM
>>
>> To: Cassandra Users
>> Subject: Re: Replicating Data Between Separate Data Centres
>>
>> Hi Jeff,
>>
>> You're dead on with that article.  That is a very good explanation of the
>> problem I'm facing.  You're also right that, fascinating though that
>> research is, letting it anywhere near my production data is not something
>> I'd think about.
>>
>> Basically, I want EACH_QUORUM, but I'm not willing to pay for it.  My
>> system needs to be reasonably close to a real-time system (let's say a soft
>> real-time system).  Waiting for each write to make its way across a
>> continent is not something I can live with (to say nothing about what
>> happens if the WAN temporarily fails).
>>
>> Basically I guess what I'm hearing is that the best way to create a clone
>> of a Cassandra cluster in another DC is to snapshot and restore.
>>
>> Thanks!
>>
>> -Phil
>>
>> On Mon, Dec 14, 2015 at 3:18 PM, Jeff Jirsa 
>> wrote:
>>
>>>
>>> There is research into causal consistency and cassandra (
>>> http://da-data.blogspot.com/2013/02/caring-about-causality-now-in-cassandra.html
>>>  ,
>>> for example), though you’ll note that it uses a fork (
>>> https://github.com/wlloyd/eiger ) which is unlikely something you’d
>>> ever want to consider in production. Let’s pretend like it doesn’t exist,
>>> and won’t in the near future.
>>>
>>> The typical approach here is to have multiple active datacenters and
>>> EACH_QUORUM writes, which gives you the ability to have a full DC failure
>>> without impact. This also solves your fail-back problem, because when the
>>> primary DC is restored, you simply run a repair. What part of EACH_QUORUM
>>> is insufficient for your needs? The failure scenarios when the WAN link
>>> breaks and it impacts local writes?
>>>
>>> Short of that, your ‘occasional snapshots and restore in case of
>>> emergency’ is going to be your next-best-thing.
>>>
>>>
>>> From: Philip Persad
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Monday, December 14, 2015 at 3:11 PM
>>> To: Cassandra Users
>>> Subject: Re: Replicating Data Between Separate Data Centres
>>>
>>> Hi Jim,
>>>
>>> Thanks for taking the time to answer.  By Causal Consistency, what I
>>> mean is that I need strict ordering of all related events which might have
>>> a causal relationship.  For example (albeit slightly contrived), if we are
>>> looking at recording an event stream, it is very important that the event
>>> creating a user be visible before the event which assigns a permissions to
>>> a user.  However, I don't care at all about the ordering of the creation of
>>> two different users.  This is what I mean by Causal Consistency.
>>>
>>> This reason why LOCAL_QUORUM replication does not work for me, is
>>> because, while I can get ordering guarantees about the order in which
>>> writes will become visible in the Primary DC, I cannot get those guarantees
>>> about the Secondary DC.  As a result (to user another slightly contrived
>>> example), if a user is created and then takes an action shortly before the
>>> failure of the Primary DC, there are four possible situations with respect
>>> to what will be visible in the Secondary DC:
>>>
>>> 1) Both events are visible in the Secondary DC
>>> 2) Neither event will be visible in the Secondary DC
>>> 3) The creation event is visible in the Secondary DC, but the action
>>> event is not
>>> 4) The action event is visible Secondary DC, but the creation event is
>>> not
>>>
>>> States 1, 2, and 3 are all acceptable.  State 4 is not.  However, if I
>>> understand Cassandra asynchronous DC replication correctly, I do not
>>> believe I get any guarantees that situation 4 will not 

Re: [RELEASE] Apache Cassandra 3.1 released

2015-12-10 Thread Kai Wang
Josh,

Thank you very much for the clarification.

On Thu, Dec 10, 2015 at 11:13 AM, Josh McKenzie <jmcken...@apache.org>
wrote:

> Kai,
>
>
>> The most stable version will be 3.1 because it includes the critical
>> fixes in 3.0.1 and some additional bug fixes
>
> 3.0.1 and 3.1 are identical. This is a unique overlap specific to 3.0.1
> and 3.1.
>
> To summarize, the most stable version should be x.Max(2n+1).z.
>
> Going forward, you can expect the following:
> 3.2: new features
> 3.3: stabilization (built on top of 3.2)
> 3.4: new features
> 3.5: stabilization (built on top of 3.4)
>
> And in parallel (for the 3.x major version / transition to tick-tock
> transition period only):
> 3.0.2: bugfixes only
> 3.0.3: bugfixes only
> 3.0.4: bugfixes only
> etc
>
> *Any bugfix that goes into 3.0.X will be in the 3.X line, however not all
> bugfixes in 3.X will be in 3.0.X* (bugfixes for new features introduced
> in 3.2, 3.4, etc will obviously not be back-ported to 3.0.X).
>
> So, for the 3.x line:
>
>- If you absolutely must have the most stable version of C* and don't
>care at all about the new features introduced in even versions of 3.x, you
>want the 3.0.N release.
>- If you want access to the new features introduced in even release
>versions of 3.x (3.2, 3.4, 3.6), you'll want to run the latest odd version
>(3.3, 3.5, 3.7, etc) after the release containing the feature you want
>access to (so, if the feature's introduced in 3.4 and we haven't dropped
>3.5 yet, obviously you'd need to run 3.4).
>
>
> This is only going to be the case during the transition phase from old
> release cycles to tick-tock. We're targeting changes to CI and quality
> focus going forward to greatly increase the stability of the odd releases
> of major branches (3.1, 3.3, etc) so, for the 4.X releases, our
> recommendation would be to run the highest # odd release for greatest
> stability.
>
> Hope that helps clarify.
>
> On Thu, Dec 10, 2015 at 10:34 AM, Kai Wang <dep...@gmail.com> wrote:
>
>> Paulo,
>>
>> Thank you for the examples.
>>
>> So if I go to download page and see 3.0.1, 3.1 and 3.2. The most stable
>> version will be 3.1 because it includes the critical fixes in 3.0.1 and
>> some additional bug fixes while doesn't have any new features introduced in
>> 3.2. In that sense 3.0.1 becomes obsolete as soon as 3.1 comes out.
>>
>> To summarize, the most stable version should be x.Max(2n+1).z.
>>
>> Am I correct?
>>
>>
>> On Thu, Dec 10, 2015 at 6:22 AM, Paulo Motta <pauloricard...@gmail.com>
>> wrote:
>>
>>> > Will 3.2 contain the bugfixes that are in 3.0.2 as well?
>>>
>>> If the bugfix affects both 3.2 and 3.0.2, yes. Otherwise it will only go
>>> in the affected version.
>>>
>>> > Is 3.x.y just 3.0.x plus new stuff? Where most of the time y is 0,
>>> unless there's a really serious issue that needs fixing?
>>>
>>> You can't really compare 3.0.y with 3.x(.y) because they're two
>>> different versioning schemes.  To make it a bit clearer:
>>>
>>> Old model:
>>> * x.y.z, where:
>>>   * x.y represents the "major" version (eg: 2.1, 2.2)
>>>   * z represents the "minor" version (eg: 2.1.1, 2.2.2)
>>>
>>> New model:
>>> * a.b(.c), where:
>>>   * a represents the "major" version (3, 4, 5)
>>>   * b represents the "minor" version (3.1, 3.2, 4.1, etc), where:
>>> * if b is even, it's a tick release, meaning it can contain both
>>> bugfixes and new features.
>>> * if b is odd, it's a tock release, meaning it can only contain
>>> bugfixes.
>>>   * c is a "subminor" optional version, which will only happen in
>>> emergency situations, for example, if a critical/blocker bug is discovered
>>> before the next release is out. So we probably won't have a 3.1.1, unless a
>>> critical bug is discovered in 3.1 and needs urgent fix before 3.2.
>>>
>>> The 3.0.x series is an interim stabilization release using the old
>>> versioning scheme, and will only receive bug fixes that affects it.
>>>
>>> 2015-12-09 18:21 GMT-08:00 Maciek Sakrejda <mac...@heroku.com>:
>>>
>>>> I'm still confused, even after reading the blog post twice (and reading
>>>> the linked Intel post). I understand what you are doing conceptually, but
>>>> I'm having a hard time mapping that to actual planned release numbers.
>>>>
>>>> > The 3.0.2 will only contain bugfixes, while 3.2 will introduce new
>>>> features.
>>>>
>>>>
>>>>
>>>
>>
>


Re: [RELEASE] Apache Cassandra 3.1 released

2015-12-10 Thread Kai Wang
Paulo,

Thank you for the examples.

So if I go to the download page and see 3.0.1, 3.1 and 3.2, the most stable
version will be 3.1, because it includes the critical fixes in 3.0.1 and
some additional bug fixes while not having any of the new features introduced
in 3.2. In that sense 3.0.1 becomes obsolete as soon as 3.1 comes out.

To summarize, the most stable version should be x.Max(2n+1).z.

Am I correct?


On Thu, Dec 10, 2015 at 6:22 AM, Paulo Motta 
wrote:

> > Will 3.2 contain the bugfixes that are in 3.0.2 as well?
>
> If the bugfix affects both 3.2 and 3.0.2, yes. Otherwise it will only go
> in the affected version.
>
> > Is 3.x.y just 3.0.x plus new stuff? Where most of the time y is 0,
> unless there's a really serious issue that needs fixing?
>
> You can't really compare 3.0.y with 3.x(.y) because they're two different
> versioning schemes.  To make it a bit clearer:
>
> Old model:
> * x.y.z, where:
>   * x.y represents the "major" version (eg: 2.1, 2.2)
>   * z represents the "minor" version (eg: 2.1.1, 2.2.2)
>
> New model:
> * a.b(.c), where:
>   * a represents the "major" version (3, 4, 5)
>   * b represents the "minor" version (3.1, 3.2, 4.1, etc), where:
> * if b is even, it's a tick release, meaning it can contain both
> bugfixes and new features.
> * if b is odd, it's a tock release, meaning it can only contain
> bugfixes.
>   * c is a "subminor" optional version, which will only happen in
> emergency situations, for example, if a critical/blocker bug is discovered
> before the next release is out. So we probably won't have a 3.1.1, unless a
> critical bug is discovered in 3.1 and needs urgent fix before 3.2.
>
> The 3.0.x series is an interim stabilization release using the old
> versioning scheme, and will only receive bug fixes that affects it.
>
> 2015-12-09 18:21 GMT-08:00 Maciek Sakrejda :
>
>> I'm still confused, even after reading the blog post twice (and reading
>> the linked Intel post). I understand what you are doing conceptually, but
>> I'm having a hard time mapping that to actual planned release numbers.
>>
>> > The 3.0.2 will only contain bugfixes, while 3.2 will introduce new
>> features.
>>
>>
>>
>


Re: [RELEASE] Apache Cassandra 3.1 released

2015-12-09 Thread Kai Wang
Janne,

You are not alone. I am also confused by that "Under normal conditions ..."
statement. I could really use some examples, such as:
3.0.0 = ?
3.0.1 = ?
3.1.0 = ?
3.1.1 = ? (this should not happen under normal conditions because the fix
should be in 3.3.0 - the next bug fix release?)

On Wed, Dec 9, 2015 at 3:05 AM, Janne Jalkanen 
wrote:

>
> I’m sorry, I don’t understand the new release scheme at all. Both of these
> are bug fixes on 3.0? What’s the actual difference?
>
> If I just want to run the most stable 3.0, should I run 3.0.1 or 3.1?
> Will 3.0 gain new features which will not go into 3.1, because that’s a bug
> fix release on 3.0? So 3.0.x will contain more features than 3.1, as
> even-numbered releases will be getting new features? Or is 3.0.1 and 3.1
> essentially the same thing? Then what’s the role of 3.1? Will there be more
> than one 3.1? 3.1.1? Or is it 3.3? What’s the content of that? 3.something
> + patches = 3.what?
>
> What does this statement in the referred blog post mean? "Under normal
> conditions, we will NOT release 3.x.y stability releases for x > 0.” Why
> are the normal conditions being violated already by releasing 3.1 (since 1
> > 0)?
>
> /Janne, who is completely confused by all this, and suspects he’s the
> target of some hideous joke.
>
> On 8 Dec 2015, at 22:26, Jake Luciani  wrote:
>
>
> The Cassandra team is pleased to announce the release of Apache Cassandra
> version 3.1. This is the first release from our new Tick-Tock release
> process[4].
> It contains only bugfixes on the 3.0 release.
>
> Apache Cassandra is a fully distributed database. It is the right choice
> when you need scalability and high availability without compromising
> performance.
>
>  http://cassandra.apache.org/
>
> Downloads of source and binary distributions are listed in our download
> section:
>
>  http://cassandra.apache.org/download/
>
> This version is a bug fix release[1] on the 3.x series. As always, please
> pay
> attention to the release notes[2] and Let us know[3] if you were to
> encounter
> any problem.
>
> Enjoy!
>
> [1]: http://goo.gl/rQJ9yd (CHANGES.txt)
> [2]: http://goo.gl/WBrlCs (NEWS.txt)
> [3]: https://issues.apache.org/jira/browse/CASSANDRA
> [4]: http://www.planetcassandra.org/blog/cassandra-2-2-3-0-and-beyond/
>
>
>


Re: Cassandra compaction stuck? Should I disable?

2015-12-07 Thread Kai Wang
Thank you for the investigation.
On Dec 2, 2015 5:21 AM, "PenguinWhispererThe ." <
th3penguinwhispe...@gmail.com> wrote:

> So it seems I found the problem.
>
> The node opening a stream is waiting for the other node to respond but
> that node never responds due to a broken pipe which makes Cassandra wait
> forever.
>
> It's basically this issue:
> https://issues.apache.org/jira/browse/CASSANDRA-8472
> And this is the workaround/fix:
> https://issues.apache.org/jira/browse/CASSANDRA-8611
>
> So:
> - update cassandra to >=2.0.11
> - add option streaming_socket_timeout_in_ms = 1
> - do rolling restart of cassandra
>
> What's weird is that the IOException: Broken pipe is never shown in my
> logs (not on any node). And my logging is set to INFO in log4j config.
> I have this config in log4j-server.properties:
> # output messages into a rolling log file as well as stdout
> log4j.rootLogger=INFO,stdout,R
>
> # stdout
> log4j.appender.stdout=org.apache.log4j.ConsoleAppender
> log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
> log4j.appender.stdout.layout.ConversionPattern=%5p %d{HH:mm:ss,SSS} %m%n
>
> # rolling log file
> log4j.appender.R=org.apache.log4j.RollingFileAppender
> log4j.appender.R.maxFileSize=20MB
> log4j.appender.R.maxBackupIndex=50
> log4j.appender.R.layout=org.apache.log4j.PatternLayout
> log4j.appender.R.layout.ConversionPattern=%5p [%t] %d{ISO8601} %F (line
> %L) %m%n
> # Edit the next line to point to your logs directory
> log4j.appender.R.File=/var/log/cassandra/system.log
>
> # Application logging options
> #log4j.logger.org.apache.cassandra=DEBUG
> #log4j.logger.org.apache.cassandra.db=DEBUG
> #log4j.logger.org.apache.cassandra.service.StorageProxy=DEBUG
>
> # Adding this to avoid thrift logging disconnect errors.
> log4j.logger.org.apache.thrift.server.TNonblockingServer=ERROR
>
> Too bad nobody else could point to those. Hope it helps someone else from
> wasting a lot of time.
>
> 2015-11-11 15:42 GMT+01:00 Sebastian Estevez <
> sebastian.este...@datastax.com>:
>
>> Use 'nodetool compactionhistory'
>>
>> all the best,
>>
>> Sebastián
>> On Nov 11, 2015 3:23 AM, "PenguinWhispererThe ." <
>> th3penguinwhispe...@gmail.com> wrote:
>>
>>> Does compactionstats show only stats for completed compactions (100%)?
>>> It might be that the compaction is running constantly, over and over again.
>>> In that case I need to know what I might be able to do to stop this
>>> constant compaction so I can start a nodetool repair.
>>>
>>> Note that there is a lot of traffic on this columnfamily so I'm not sure
>>> if temporary disabling compaction is an option. The repair will probably
>>> take long as well.
>>>
>>> Sebastian and Rob: do you might have any more ideas about the things I
>>> put in this thread? Any help is appreciated!
>>>
>>> 2015-11-10 20:03 GMT+01:00 PenguinWhispererThe . <
>>> th3penguinwhispe...@gmail.com>:
>>>
 Hi Sebastian,

 Thanks for your response.

 No swap is used. No offense, I just don't see a reason why having swap
 would be the issue here. I put swappiness at 1. I also have jna installed.
 That should prevent java being swapped out as well AFAIK.


 2015-11-10 19:50 GMT+01:00 Sebastian Estevez <
 sebastian.este...@datastax.com>:

> Turn off Swap.
>
>
> http://docs.datastax.com/en/cassandra/2.1/cassandra/install/installRecommendSettings.html?scroll=reference_ds_sxl_gf3_2k__disable-swap
>
>
> All the best,
>
>
>
> Sebastián Estévez
>
> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>
>
>
> 
>
> DataStax is the fastest, most scalable distributed database
> technology, delivering Apache Cassandra to the world’s most innovative
> enterprises. Datastax is built to be agile, always-on, and predictably
> scalable to any size. With more than 500 customers in 45 countries, 
> DataStax
> is the database technology and transactional backbone of choice for the
> worlds most innovative companies such as Netflix, Adobe, Intuit, and eBay.
>
> On Tue, Nov 10, 2015 at 1:48 PM, PenguinWhispererThe . <
> th3penguinwhispe...@gmail.com> wrote:
>
>> I also have the following memory usage:
>> [root@US-BILLINGDSX4 cassandra]# free -m
>>  total   used   free sharedbuffers
>> cached
>> Mem: 12024   9455   2569  0110
>> 2163
>> -/+ buffers/cache:   7180   4844
>> Swap:

lots of tombstone after compaction

2015-12-07 Thread Kai Wang
I bulk loaded a few tables using CQLSSTableWriter/sstableloader. The data are
a large amount of wide rows with lots of nulls. It took a day or two for
the compaction to complete. The sstable count is in the single digits. Maximum
partition size is ~50M and mean size is ~5M. However I am seeing frequent
read query timeouts caused by tombstone_failure_threshold (10). These
tables are basically read-only. There are no writes.

I just kicked off compaction on those tables using nodetool. Hopefully it
can remove those tombstones. But is it normal to have this many tombstones
after the initial compactions? Is this related to the fact that the original
data has lots of nulls?
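
For context, the two knobs I know of here are the global read thresholds in
cassandra.yaml and the per-table gc_grace_seconds (values below are only
examples; shortening gc_grace is only reasonable because these tables take no
deletes and no regular writes):

# cassandra.yaml
tombstone_warn_threshold: 1000
tombstone_failure_threshold: 100000

-- cqlsh, hypothetical table name
ALTER TABLE my_ks.my_table WITH gc_grace_seconds = 3600;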

Thanks.


Re: lots of tombstone after compaction

2015-12-07 Thread Kai Wang
Rob and Jeff,

Thank you. It makes sense. I am on 2.1.10 and will upgrade to 2.1.12.
On Dec 7, 2015 7:05 PM, "Jeff Jirsa" <jeff.ji...@crowdstrike.com> wrote:

> https://issues.apache.org/jira/browse/CASSANDRA-7953
>
> https://issues.apache.org/jira/browse/CASSANDRA-10505
>
> There are buggy versions of cassandra that will multiple tombstones during
> compaction. 2.1.12 SHOULD correct that, if you’re on 2.1.
>
>
>
> From: Kai Wang
> Reply-To: "user@cassandra.apache.org"
> Date: Monday, December 7, 2015 at 3:46 PM
> To: "user@cassandra.apache.org"
> Subject: lots of tombstone after compaction
>
> I bulkloaded a few tables using CQLSStableWrite/sstableloader. The data
> are large amount of wide rows with lots of null's. It takes one day or two
> for the compaction to complete. sstable count is at single digit. Maximum
> partition size is ~50M and mean size is ~5M. However I am seeing frequent
> read query timeouts caused by tombstone_failure_threshold (10). These
> tables are basically read-only. There're no writes.
>
> I just kicked off compaction on those tables using nodetool. Hopefully it
> can remove those tombstones. But is it normal to have these many tombstones
> after the initial compactions? Is this related to the fact the original
> data has lots of nulls?
>
> Thanks.
>


SELECT some_column vs SELECT *

2015-11-24 Thread Kai Wang
Hi all,

If I have the following table:
CREATE TABLE t (
  pk int,
  ck int,
  c1 int,
  c2 int,
  ...
  PRIMARY KEY (pk, ck)
)

There are lots of non-clustering columns (1000+). From time to time I need
to do a query like this:

SELECT c1 FROM t WHERE pk = abc AND ck > xyz;

How efficient is this query compared to SELECT * ...? Apparently SELECT c1
would save a lot of network bandwidth since only c1 needs to be transferred
on the wire. But I am more interested in the impact on disk IO. If I
understand C* storage engine correctly, one CQL row is clustered together
on disk. That means c1 from different rows are stored apart. In the case of
SELECT c1, does C* do multiple seeks to only lift c1 of each row from disk
or lift the whole row into memory and return c1 from there?

From comments on https://issues.apache.org/jira/browse/CASSANDRA-5762 it
seems C* lifts the whole row as of 1.2.7. Is this still the case on 2.1.*?
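
One way to see what actually happens on a given version is to trace the query
from cqlsh, e.g. (values are placeholders):

TRACING ON;
SELECT c1 FROM t WHERE pk = 123 AND ck > 456;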

Thanks.


Re: too many full gc in one node of the cluster

2015-11-13 Thread Kai Wang
What's the size of the young generation (-Xmn)?
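
If memory serves, in a stock cassandra-env.sh these are the two settings
involved; the values below are only an illustration, not a recommendation:

MAX_HEAP_SIZE="8G"
HEAP_NEWSIZE="800M"   # passed to the JVM as -Xmn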

On Fri, Nov 13, 2015 at 6:38 AM, Jason Wee  wrote:

> Used to manage/develop for cassandra 1.0.8 for quite some time. Although
> 1.0 was rock stable, we encountered various problems as load per node
> grew beyond 500gb. Upgrading is one of the solutions and may not be the
> right one for you, but I strongly recommend you upgrade to 1.1 or 1.2. We
> upgraded the java on the cassandra nodes and cassandra to 1.1 and a lot of
> problems went away.
>
> As for your use cases, a quick solution would probably be to just add nodes,
> or to study the client read pattern so it doesn't hit a hot row on one node
> (the hash of the key), or to look at the client configuration of your
> application and/or the keyspace replication.
>
> hth,
>
> jason
>
> On Fri, Nov 13, 2015 at 2:35 PM, Shuo Chen  wrote:
>
>> Hi,
>>
>> We have a small cassandra cluster with 4 nodes for production. All the
>> nodes have similar hardware configuration and similar data load. The C*
>> version is 1.0.7 (prretty old)
>>
>> One of the nodes has much higher cpu usage than the others and a high full gc
>> frequency, but the io of this node is not high and its data load
>> is even lower. So I have several questions:
>>
>> 1. Is it normal that one of the nodes has much more frequent full gc with
>> the same jvm configuration?
>> 2. Does this node need special gc tuning and how?
>> 3. How to find the cause of the full gc?
>>
>> Thank you guys!
>>
>>
>> The heap size is 8G and max heap size is 16G. The gc config of
>> cassandra-env.sh is default:
>>
>> JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
>> JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
>> JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
>> JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
>> JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
>> JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
>> JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
>>
>> -
>> I print instance in the gc log:
>>
>> num #instances #bytes  class name
>> --
>>1:   2982796  238731200  [B
>>2:   3889672  186704256  java.nio.HeapByteBuffer
>>3:   1749589   55986848  org.apache.cassandra.db.Column
>>4:   1803900   43293600
>>  java.util.concurrent.ConcurrentSkipListMap$Node
>>5:859496   20627904
>>  java.util.concurrent.ConcurrentSkipListMap$Index
>>6:  5568   18827912  [J
>>7:1626306505200  java.math.BigInteger
>>8:1675725716976  [I
>>9:1416984534336
>>  java.util.concurrent.ConcurrentHashMap$HashEntry
>>   10:1415054528160
>>  com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node
>>   11: 314914376976  
>>   12: 314914291992  
>>   13:1716954120680  org.apache.cassandra.db.DecoratedKey
>>   14:  31573436120  
>>   15:1417843402816  java.lang.Long
>>   16:1416243398976  org.apache.cassandra.utils.Pair
>>   17:1415053396120
>>  com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$WeightedValue
>>   18: 496042675352  
>>   19:1622542596064
>>  org.apache.cassandra.dht.BigIntegerToken
>> ..
>> Total  13337798  641834360
>>
>>
>> --
>> The gc part and thread status of system log:
>>
>> INFO [ScheduledTasks:1] 2015-11-13 14:22:08,681 GCInspector.java (line
>> 123) GC for ParNew: 1015 ms for 2 collections, 3886753520 used; max is
>> 8231321600
>>  INFO [ScheduledTasks:1] 2015-11-13 14:22:09,683 GCInspector.java (line
>> 123) GC for ParNew: 500 ms for 1 collections, 4956287408 used; max is
>> 8231321600
>>  INFO [ScheduledTasks:1] 2015-11-13 14:22:10,685 GCInspector.java (line
>> 123) GC for ParNew: 627 ms for 1 collections, 5615882296 used; max is
>> 8231321600
>>  INFO [ScheduledTasks:1] 2015-11-13 14:22:12,015 GCInspector.java (line
>> 123) GC for ParNew: 988 ms for 2 collections, 4943363480 used; max is
>> 8231321600
>>  INFO [ScheduledTasks:1] 2015-11-13 14:22:13,016 GCInspector.java (line
>> 123) GC for ParNew: 373 ms for 1 collections, 5978572832 used; max is
>> 8231321600
>>  INFO [ScheduledTasks:1] 2015-11-13 14:22:14,020 GCInspector.java (line
>> 123) GC for ParNew: 486 ms for 1 collections, 6209638280 used; max is
>> 8231321600
>>  INFO [ScheduledTasks:1] 2015-11-13 14:22:15,412 GCInspector.java (line
>> 123) GC for ParNew: 898 ms for 2 collections, 6045603728 used; max is
>> 8231321600
>>  INFO [ScheduledTasks:1] 2015-11-13 14:22:16,413 GCInspector.java (line
>> 123) GC for ParNew: 503 ms for 1 collections, 6991263984 used; max is
>> 8231321600
>>  INFO [ScheduledTasks:1] 2015-11-13 14:22:17,416 GCInspector.java (line
>> 123) GC for ParNew: 746 ms for 1 collections, 7073467384 used; max is
>> 8231321600
>>  INFO [ScheduledTasks:1] 2015-11-13 

Re: How to organize a timeseries by device?

2015-11-09 Thread Kai Wang
1. Don't make your partition unbounded. It's tempting to just use (device_id,
timestamp), but sooner or later you will have problems as time goes by. You
can keep the partition bounded by using (device_id, bucket, timestamp). Use an
hour, day, month or even year bucket like Jack mentioned, depending on the size
of the data.

2. As to your specific query, for a given partition and a time range, C*
doesn't need to load the whole partition then filter. It only retrieves the
slice within the time range from disk because the data is clustered by
timestamp.
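
A minimal sketch of both points; the device_id, timestamp, latitude and
longitude columns come from the question below, while the text bucket format
and exact types are assumptions:

CREATE TABLE events_by_device (
    device_id text,
    bucket text,            -- e.g. '2015-11' for a monthly bucket
    timestamp timestamp,
    latitude double,
    longitude double,
    PRIMARY KEY ((device_id, bucket), timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);

-- Only the requested slice of the partition is read from disk:
SELECT * FROM events_by_device
WHERE device_id = 'dev-42' AND bucket = '2015-11'
  AND timestamp >= '2015-11-01 00:00+0000' AND timestamp < '2015-11-08 00:00+0000';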

On Mon, Nov 9, 2015 at 8:13 AM, Jack Krupansky 
wrote:

> The general rule in Cassandra data modeling is to look at all of your
> queries first and then to declare a table for each query, even if that
> means storing multiple copies of the data. So, create a second table with
> bucketed time as the partition key (hour, 15 minutes, or whatever time
> interval makes sense to give 1 to 10 megabytes per partition) and time and
> device as the clustering keys.
>
> Or, consider DSE Search and then you can do whatever ad hoc queries you
> want using Solr. Or Stratio or TupleJump Stargate for an open source Lucene
> plugin.
>
> -- Jack Krupansky
>
> On Mon, Nov 9, 2015 at 8:05 AM, Guillaume Charhon <
> guilla...@databerries.com> wrote:
>
>> Hello,
>>
>> We are currently storing geolocation events (about 1 per 5 minutes) for
>> each device we track. We currently have 2 TB of data. I would like to store
>> the device_id, the timestamp of the event, latitude and longitude. I thought
>> about using the device_id as the partition key and timestamp as the
>> clustering column. It is great as events are naturally grouped by device
>> (very useful for our Spark jobs). However, if I would like to retrieve all
>> events of all devices of the last week I understood that Cassandra will
>> need to load all the data and filter, which does not seem clean in the
>> long term.
>>
>> How should I create my model?
>>
>> Best Regards
>>
>
>


Re: Can't save Opscenter Dashboard

2015-11-09 Thread Kai Wang
Finally I got this one resolved. I sent feedback via Help->Feedback on the
OpsCenter page. Someone is actually reading those - imagine that. Big +1 to
DataStax. Here is the fix:

First, visit this URL:
http://your_ip:your_port/Test_Cluster/rc/dashboard_presets/
You should get a response like this:
{"838ef1a3-9d49-41ff-84e3-4d96440487e5": {}}
Then issue a DELETE against that preset:
curl -X "DELETE" http://your_ip:your_port/Test_Cluster/rc/dashboard_presets/838ef1a3-9d49-41ff-84e3-4d96440487e5

This will clear out the broken dashboard settings and allow you to
reconfigure the dashboard again.

On Thu, Nov 5, 2015 at 10:02 AM, Kai Wang <dep...@gmail.com> wrote:

> It happens again after I reboot another node. This time I see errors in
> agent.log. It seems to be related to the previous dead node.
>
>   INFO [clojure-agent-send-off-pool-2] 2015-11-05 09:48:41,602 Attempting
> to load stored metric values.
>  ERROR [clojure-agent-send-off-pool-2] 2015-11-05 09:48:41,613 There was
> an error when attempting to load stored rollups.
>  com.datastax.driver.core.exceptions.DriverInternalError: Unexpected error
> while processing response from /x.x.x.x:9042
> at
> com.datastax.driver.core.exceptions.DriverInternalError.copy(DriverInternalError.java:42)
> at
> com.datastax.driver.core.exceptions.DriverInternalError.copy(DriverInternalError.java:24)
> ...
> Caused by: com.datastax.driver.core.exceptions.DriverInternalError:
> Unexpected error while processing response from /x.x.x.x:9042
> at
> com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:150)
> at
> com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:183)
> at
> com.datastax.driver.core.RequestHandler.access$2300(RequestHandler.java:45)
> ...
> Caused by: java.lang.IllegalStateException: Can't use this cluster
> instance because it was previously closed
> at com.datastax.driver.core.Cluster.checkNotClosed(Cluster.java:493)
> at com.datastax.driver.core.Cluster.access$400(Cluster.java:61)
> at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1231)
> ...
> INFO [clojure-agent-send-off-pool-1] 2015-11-05 09:48:41,618 Attempting to
> load stored metric values.
>  ERROR [clojure-agent-send-off-pool-1] 2015-11-05 09:48:41,622 There was
> an error when attempting to load stored rollups.
>  com.datastax.driver.core.exceptions.InvalidQueryException: Invalid null
> value for partition key part key
> at
> com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:35)
> at
> com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:291)
>
>
> On Wed, Nov 4, 2015 at 8:43 PM, qihuang.zheng <
> qihuang.zh...@fraudmetrix.cn> wrote:
>
>> We had this problem with version 5.2.0, so we decided to update to 5.2.2.
>>
>> But the problem seems to remain. We solved it by completely deleting the
>> related agent files and processes and restarting, just like a first-time
>> install.
>>
>>
>> sudo kill -9 `ps -ef|grep datastax_agent_monitor | head -1 |awk '{print
>> $2}'` && \
>>
>> sudo kill -9 `cat /var/run/datastax-agent/datastax-agent.pid` && \
>>
>> sudo rm -rf /var/lib/datastax-agent && \
>>
>> sudo rm -rf /usr/share/datastax-agent
>>
>> --
>> qihuang.zheng
>>
>>  Original Message
>> *From:* Kai Wang<dep...@gmail.com>
>> *To:* user<user@cassandra.apache.org>
>> *Sent:* November 5, 2015 (Thursday) 04:39
>> *Subject:* Can't save Opscenter Dashboard
>>
>> Hi,
>>
>> Today after one of the nodes is rebooted, OpsCenter dashboard doesn't
>> save anymore. It starts with an empty dashboard with no widget or graph. If
>> I add some graph/widget, they are being updated fine. But if I refresh the
>> browser, the dashboard became empty again.
>>
>> Also there's no "DEFAULT" tab on the dashboard as the user guide shows. I
>> am not sure if it was there before.
>>
>
>


Re: How to organize a timeseries by device?

2015-11-09 Thread Kai Wang
The bucket key is just like any other column of the table; you can use any
type as long as it's convenient for you to write the query.

But I don't think you should use 5 minutes as your bucket since you only
have 1 event every 5 minutes, so a 5-minute bucket seems too small. The bucket
key we mentioned is for breaking the (device_id, timestamp) partitions
into ones with a size between ~1 MB and ~10 MB.
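
For a rough illustration (the per-event size is an assumption, not something
stated in this thread): at 1 event per 5 minutes a device writes 12 * 24 = 288
events per day. If each event is on the order of 100 bytes, that is roughly
28 KB per device per day, so a monthly bucket stays around 1 MB and a yearly
bucket lands near 10 MB, both inside the range above, while a 5-minute bucket
would hold a single event.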

On Mon, Nov 9, 2015 at 11:50 AM, Guillaume Charhon <
guilla...@databerries.com> wrote:

> Is it usually recommended to use the bucket key (usually a 5-minute
> period in my case) for the events_by_time table using a timestamp or
> a string?
>
> On Mon, Nov 9, 2015 at 5:05 PM, Kai Wang <dep...@gmail.com> wrote:
>
>> it depends on the size of each event. You want to bound each partition
>> under ~10MB. In system.log look for entry like:
>>
>> WARN  [CompactionExecutor:39] 2015-11-07 17:32:00,019
>> SSTableWriter.java:240 - Compacting large partition
>> :9f80ce31-b7e7-40c7-b642-f5d03fc320aa (13443863224 bytes)
>>
>> This is the warning sign that you have large partitions. The threshold is
>> defined by compaction_large_partition_warning_threshold_mb in
>> cassandra.yaml. The default is 100MB.
>>
>> You can also use nodetool cfstats to check partition size.
>>
>> On Mon, Nov 9, 2015 at 10:53 AM, Guillaume Charhon <
>> guilla...@databerries.com> wrote:
>>
>>> For the first table: (device_id, timestamp), should I add a bucket even
>>> if I know I might have millions of events per device but never billions?
>>>
>>> On Mon, Nov 9, 2015 at 4:37 PM, Jack Krupansky <jack.krupan...@gmail.com
>>> > wrote:
>>>
>>>> Cassandra is good at two kinds of queries: 1) access a specific row by
>>>> a specific key, and 2) Access a slice or consecutive sequence of rows
>>>> within a given partition.
>>>>
>>>> It is recommended to avoid ALLOW FILTERING. If it happens to work well
>>>> for you, great, go for it, but if it doesn't then simply don't do it. Best
>>>> to redesign your data model to play to Cassandra's strengths.
>>>>
>>>> If you bucket the time-based table, do a separate query for each time
>>>> bucket.
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> On Mon, Nov 9, 2015 at 10:16 AM, Guillaume Charhon <
>>>> guilla...@databerries.com> wrote:
>>>>
>>>>> Kai, Jack,
>>>>>
>>>>> On 1., should the bucket be a STRING with a date format or do I have a
>>>>> better option ? For (device_id, bucket, timestamp), did you mean
>>>>> ((device_id, bucket), timestamp) ?
>>>>>
>>>>> On 2., what are the risks of timeout ? I currently have this warning:
>>>>> "Cannot execute this query as it might involve data filtering and thus may
>>>>> have unpredictable performance. If you want to execute this query despite
>>>>> the performance unpredictability, use ALLOW FILTERING".
>>>>>
>>>>> On Mon, Nov 9, 2015 at 3:02 PM, Kai Wang <dep...@gmail.com> wrote:
>>>>>
>>>>>> 1. Don't make your partition unbound. It's tempting to just use
>>>>>> (device_id, timestamp). But soon or later you will have problem when time
>>>>>> goes by. You can keep the partition bound by using (device_id, bucket,
>>>>>> timestamp). Use hour, day, month or even year like Jack mentioned 
>>>>>> depending
>>>>>> on the size of data.
>>>>>>
>>>>>> 2. As to your specific query, for a given partition and a time range,
>>>>>> C* doesn't need to load the whole partition then filter. It only 
>>>>>> retrieves
>>>>>> the slice within the time range from disk because the data is clustered 
>>>>>> by
>>>>>> timestamp.
>>>>>>
>>>>>> On Mon, Nov 9, 2015 at 8:13 AM, Jack Krupansky <
>>>>>> jack.krupan...@gmail.com> wrote:
>>>>>>
>>>>>>> The general rule in Cassandra data modeling is to look at all of
>>>>>>> your queries first and then to declare a table for each query, even if 
>>>>>>> that
>>>>>>> means storing multiple copies of the data. So, create a second table 
>>>>>>> with
>>>>>>> bucketed time as the partition key (hour, 15 minutes, or whatever time
>>>>&

Re: How to organize a timeseries by device?

2015-11-09 Thread Kai Wang
It depends on the size of each event. You want to bound each partition
under ~10MB. In system.log, look for an entry like:

WARN  [CompactionExecutor:39] 2015-11-07 17:32:00,019
SSTableWriter.java:240 - Compacting large partition
:9f80ce31-b7e7-40c7-b642-f5d03fc320aa (13443863224 bytes)

This is the warning sign that you have large partitions. The threshold is
defined by compaction_large_partition_warning_threshold_mb in
cassandra.yaml. The default is 100MB.

You can also use nodetool cfstats to check partition size.
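
For reference, on recent 2.1 releases "nodetool cfstats <keyspace>.<table>"
prints the compacted partition minimum/mean/maximum bytes for the table, and
"nodetool cfhistograms <keyspace> <table>" shows the partition size
distribution; both are quick ways to spot oversized partitions.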

On Mon, Nov 9, 2015 at 10:53 AM, Guillaume Charhon <
guilla...@databerries.com> wrote:

> For the first table: (device_id, timestamp), should I add a bucket even
> if I know I might have millions of events per device but never billions?
>
> On Mon, Nov 9, 2015 at 4:37 PM, Jack Krupansky <jack.krupan...@gmail.com>
> wrote:
>
>> Cassandra is good at two kinds of queries: 1) access a specific row by a
>> specific key, and 2) Access a slice or consecutive sequence of rows within
>> a given partition.
>>
>> It is recommended to avoid ALLOW FILTERING. If it happens to work well
>> for you, great, go for it, but if it doesn't then simply don't do it. Best
>> to redesign your data model to play to Cassandra's strengths.
>>
>> If you bucket the time-based table, do a separate query for each time
>> bucket.
>>
>> -- Jack Krupansky
>>
>> On Mon, Nov 9, 2015 at 10:16 AM, Guillaume Charhon <
>> guilla...@databerries.com> wrote:
>>
>>> Kai, Jack,
>>>
>>> On 1., should the bucket be a STRING with a date format or do I have a
>>> better option ? For (device_id, bucket, timestamp), did you mean
>>> ((device_id, bucket), timestamp) ?
>>>
>>> On 2., what are the risks of timeout ? I currently have this warning:
>>> "Cannot execute this query as it might involve data filtering and thus may
>>> have unpredictable performance. If you want to execute this query despite
>>> the performance unpredictability, use ALLOW FILTERING".
>>>
>>> On Mon, Nov 9, 2015 at 3:02 PM, Kai Wang <dep...@gmail.com> wrote:
>>>
>>>> 1. Don't make your partition unbounded. It's tempting to just use
>>>> (device_id, timestamp), but sooner or later you will have problems as time
>>>> goes by. You can keep the partition bounded by using (device_id, bucket,
>>>> timestamp). Use hour, day, month or even year like Jack mentioned depending
>>>> on the size of data.
>>>>
>>>> 2. As to your specific query, for a given partition and a time range,
>>>> C* doesn't need to load the whole partition then filter. It only retrieves
>>>> the slice within the time range from disk because the data is clustered by
>>>> timestamp.
>>>>
>>>> On Mon, Nov 9, 2015 at 8:13 AM, Jack Krupansky <
>>>> jack.krupan...@gmail.com> wrote:
>>>>
>>>>> The general rule in Cassandra data modeling is to look at all of your
>>>>> queries first and then to declare a table for each query, even if that
>>>>> means storing multiple copies of the data. So, create a second table with
>>>>> bucketed time as the partition key (hour, 15 minutes, or whatever time
>>>>> interval makes sense to give 1 to 10 megabytes per partition) and time and
>>>>> device as the clustering keys.
>>>>>
>>>>> Or, consider DSE Search and then you can do whatever ad hoc queries
>>>>> you want using Solr. Or Stratio or TupleJump Stargate for an open source
>>>>> Lucene plugin.
>>>>>
>>>>> -- Jack Krupansky
>>>>>
>>>>> On Mon, Nov 9, 2015 at 8:05 AM, Guillaume Charhon <
>>>>> guilla...@databerries.com> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> We are currently storing geolocation events (about 1 per 5 minutes)
>>>>>> for each device we track. We currently have 2 TB of data. I would like to
>>>>>> store the device_id, the timestamp of the event, latitude and longitude. 
>>>>>> I
>>>>>> though about using the device_id as the partition key and timestamp as 
>>>>>> the
>>>>>> clustering column. It is great as events are naturally grouped by device
>>>>>> (very useful for our Spark jobs). However, if I would like to retrieve 
>>>>>> all
>>>>>> events of all devices of the last week I understood that Cassandra will
>>>>>> need to load all data and filter which does not seems to be clean on the
>>>>>> long term.
>>>>>>
>>>>>> How should I create my model?
>>>>>>
>>>>>> Best Regards
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>


Re: Can't save Opscenter Dashboard

2015-11-05 Thread Kai Wang
It happens again after I reboot another node. This time I see errors in
agent.log. It seems to be related to the previous dead node.

  INFO [clojure-agent-send-off-pool-2] 2015-11-05 09:48:41,602 Attempting
to load stored metric values.
 ERROR [clojure-agent-send-off-pool-2] 2015-11-05 09:48:41,613 There was an
error when attempting to load stored rollups.
 com.datastax.driver.core.exceptions.DriverInternalError: Unexpected error
while processing response from /x.x.x.x:9042
at
com.datastax.driver.core.exceptions.DriverInternalError.copy(DriverInternalError.java:42)
at
com.datastax.driver.core.exceptions.DriverInternalError.copy(DriverInternalError.java:24)
...
Caused by: com.datastax.driver.core.exceptions.DriverInternalError:
Unexpected error while processing response from /x.x.x.x:9042
at
com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:150)
at
com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:183)
at
com.datastax.driver.core.RequestHandler.access$2300(RequestHandler.java:45)
...
Caused by: java.lang.IllegalStateException: Can't use this cluster instance
because it was previously closed
at com.datastax.driver.core.Cluster.checkNotClosed(Cluster.java:493)
at com.datastax.driver.core.Cluster.access$400(Cluster.java:61)
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1231)
...
INFO [clojure-agent-send-off-pool-1] 2015-11-05 09:48:41,618 Attempting to
load stored metric values.
 ERROR [clojure-agent-send-off-pool-1] 2015-11-05 09:48:41,622 There was an
error when attempting to load stored rollups.
 com.datastax.driver.core.exceptions.InvalidQueryException: Invalid null
value for partition key part key
at
com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:35)
at
com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:291)


On Wed, Nov 4, 2015 at 8:43 PM, qihuang.zheng <qihuang.zh...@fraudmetrix.cn>
wrote:

> We had this problem with version 5.2.0, so we decided to update to 5.2.2.
>
> But the problem seems to remain. We solved it by completely deleting the
> related agent files and processes and restarting, just like a first-time
> install.
>
>
> sudo kill -9 `ps -ef|grep datastax_agent_monitor | head -1 |awk '{print
> $2}'` && \
>
> sudo kill -9 `cat /var/run/datastax-agent/datastax-agent.pid` && \
>
> sudo rm -rf /var/lib/datastax-agent && \
>
> sudo rm -rf /usr/share/datastax-agent
>
> --
> qihuang.zheng
>
>  Original Message
> *From:* Kai Wang<dep...@gmail.com>
> *To:* user<user@cassandra.apache.org>
> *Sent:* November 5, 2015 (Thursday) 04:39
> *Subject:* Can't save Opscenter Dashboard
>
> Hi,
>
> Today after one of the nodes is rebooted, OpsCenter dashboard doesn't save
> anymore. It starts with an empty dashboard with no widget or graph. If I
> add some graph/widget, they are being updated fine. But if I refresh the
> browser, the dashboard became empty again.
>
> Also there's no "DEFAULT" tab on the dashboard as the user guide shows. I
> am not sure if it was there before.
>


Re: Can't save Opscenter Dashboard

2015-11-04 Thread Kai Wang
No they don't.

On Wed, Nov 4, 2015 at 3:42 PM, Sebastian Estevez <
sebastian.este...@datastax.com> wrote:

> Do they come back if you restart opscenterd?
>
> All the best,
>
>
> Sebastián Estévez
>
> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>
> On Wed, Nov 4, 2015 at 3:41 PM, Kai Wang <dep...@gmail.com> wrote:
>
>> Forgot to mention. I am running OpsCenter 5.2.2.
>>
>> On Wed, Nov 4, 2015 at 3:39 PM, Kai Wang <dep...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Today after one of the nodes is rebooted, OpsCenter dashboard doesn't
>>> save anymore. It starts with an empty dashboard with no widget or graph. If
>>> I add some graph/widget, they are being updated fine. But if I refresh the
>>> browser, the dashboard became empty again.
>>>
>>> Also there's no "DEFAULT" tab on the dashboard as the user guide shows.
>>> I am not sure if it was there before.
>>>
>>
>>
>


Re: Can't save Opscenter Dashboard

2015-11-04 Thread Kai Wang
Forgot to mention. I am running OpsCenter 5.2.2.

On Wed, Nov 4, 2015 at 3:39 PM, Kai Wang <dep...@gmail.com> wrote:

> Hi,
>
> Today after one of the nodes is rebooted, OpsCenter dashboard doesn't save
> anymore. It starts with an empty dashboard with no widget or graph. If I
> add some graph/widget, they are being updated fine. But if I refresh the
> browser, the dashboard became empty again.
>
> Also there's no "DEFAULT" tab on the dashboard as the user guide shows. I
> am not sure if it was there before.
>


Re: Can't save Opscenter Dashboard

2015-11-04 Thread Kai Wang
I am using Firefox. In web console I saw five messages (excluding Net and
CSS):

The character encoding of the HTML document was not declared. The document
will render with garbled text in some browser configurations if the
document contains characters from outside the US-ASCII range. The character
encoding of the page must be declared in the document or in the transfer
protocol. index.html

Synchronous XMLHttpRequest on the main thread is deprecated because of its
detrimental effects to the end user's experience. For more help
http://xhr.spec.whatwg.org/ dojo.js:110:0

baseurl#makeUrl called without calling setRoot first. ripcord.js:54:61

Use of getAttributeNode() is deprecated. Use getAttribute() instead.
dojo.js:175:0

This site makes use of a SHA-1 Certificate; it's recommended you use
certificates with signature algorithms that use hash functions stronger
than SHA-1.[Learn More] opscenter.datastax.com


On Wed, Nov 4, 2015 at 3:47 PM, Sebastian Estevez <
sebastian.este...@datastax.com> wrote:

> Are there any errors in your javascript console?
>
> All the best,
>
>
> Sebastián Estévez
>
> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>
> On Wed, Nov 4, 2015 at 3:46 PM, Kai Wang <dep...@gmail.com> wrote:
>
>> No they don't.
>>
>> On Wed, Nov 4, 2015 at 3:42 PM, Sebastian Estevez <
>> sebastian.este...@datastax.com> wrote:
>>
>>> Do they come back if you restart opscenterd?
>>>
>>> All the best,
>>>
>>>
>>> Sebastián Estévez
>>>
>>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>>
>>> On Wed, Nov 4, 2015 at 3:41 PM, Kai Wang <dep...@gmail.com> wrote:
>>>
>>>> Forgot to mention. I am running OpsCenter 5.2.2.
>>>>
>>>> On Wed, Nov 4, 2015 at 3:39 PM, Kai Wang <dep...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Today after one of the nodes is rebooted, OpsCenter dashboard doesn't
>>>>> save anymore. It starts with an empty dashboard with no widget or graph. 
>>>>> If
>>>>> I add some graph/widget, they are being updated fine. But if I refresh the
>>>>> browser, the dashboard became empty again.
>>>>>
>>>>> Also there's no "DEFAULT" tab on the dashboard as the user guide
>>>>> shows. I am not sure if it was there before.
>>>>>
>>>>
>>>>
>>>
>>
>


Can't save Opscenter Dashboard

2015-11-04 Thread Kai Wang
Hi,

Today after one of the nodes is rebooted, OpsCenter dashboard doesn't save
anymore. It starts with an empty dashboard with no widget or graph. If I
add some graph/widget, they are being updated fine. But if I refresh the
browser, the dashboard became empty again.

Also there's no "DEFAULT" tab on the dashboard as the user guide shows. I
am not sure if it was there before.


Re: Can't save Opscenter Dashboard

2015-11-04 Thread Kai Wang
Reinstalling OpsCenter didn't fix it. Previously I thought the graphs were
updated but actually they were just stalled. I need to refresh and re-add
the graph to see the new metrics. Other information such as activities and
Nodes are updating fine.

On Wed, Nov 4, 2015 at 3:58 PM, Kai Wang <dep...@gmail.com> wrote:

> I am using Firefox. In web console I saw five messages (excluding Net and
> CSS):
>
> The character encoding of the HTML document was not declared. The document
> will render with garbled text in some browser configurations if the
> document contains characters from outside the US-ASCII range. The character
> encoding of the page must be declared in the document or in the transfer
> protocol. index.html
>
> Synchronous XMLHttpRequest on the main thread is deprecated because of its
> detrimental effects to the end user's experience. For more help
> http://xhr.spec.whatwg.org/ dojo.js:110:0
>
> baseurl#makeUrl called without calling setRoot first. ripcord.js:54:61
>
> Use of getAttributeNode() is deprecated. Use getAttribute() instead.
> dojo.js:175:0
>
> This site makes use of a SHA-1 Certificate; it's recommended you use
> certificates with signature algorithms that use hash functions stronger
> than SHA-1.[Learn More] opscenter.datastax.com
>
>
> On Wed, Nov 4, 2015 at 3:47 PM, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>> Are there any errors in your javascript console?
>>
>> All the best,
>>
>>
>> Sebastián Estévez
>>
>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>
>> On Wed, Nov 4, 2015 at 3:46 PM, Kai Wang <dep...@gmail.com> wrote:
>>
>>> No they don't.
>>>
>>> On Wed, Nov 4, 2015 at 3:42 PM, Sebastian Estevez <
>>> sebastian.este...@datastax.com> wrote:
>>>
>>>> Do they come back if you restart opscenterd?
>>>>
>>>> All the best,
>>>>
>>>>
>>>> Sebastián Estévez
>>>>
>>>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>>>
>>>> On Wed, Nov 4, 2015 at 3:41 PM, Kai Wang <dep...@gmail.com> wrote:
>>>>
>>>>> Forgot to mention. I am running OpsCenter 5.2.2.
>>>>>
>>>>> On Wed, Nov 4, 2015 at 3:39 PM, Kai Wang <dep...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Today after one of the nodes is rebooted, OpsCenter dashboard doesn't
>>>>>> save anymore. It starts with an empty dashboard with no widget or graph. 
>>>>>> If
>>>>>> I add some graph/widget, they are being updated fine. But if I refresh 
>>>>>> the
>>>>>> browser, the dashboard became empty again.
>>>>>>
>>>>>> Also there's no "DEFAULT" tab on the dashboard as the user guide
>>>>>> shows. I am not sure if it was there before.
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>


Re: Can't save Opscenter Dashboard

2015-11-04 Thread Kai Wang
Finally I got it working. I used to have auth information inside
address.yaml. I moved everything except for stomp_interface inside
address.yaml to [agent_config] section of cluster_name.conf and restarted
opscenterd and datastax-agent on all nodes. That seemed to fix the problem.

On Wed, Nov 4, 2015 at 5:36 PM, Kai Wang <dep...@gmail.com> wrote:

> Reinstalling OpsCenter didn't fix it. Previously I thought the graphs were
> updated but actually they were just stalled. I need to refresh and re-add
> the graph to see the new metrics. Other information such as activities and
> Nodes are updating fine.
>
> On Wed, Nov 4, 2015 at 3:58 PM, Kai Wang <dep...@gmail.com> wrote:
>
>> I am using Firefox. In web console I saw five messages (excluding Net and
>> CSS):
>>
>> The character encoding of the HTML document was not declared. The
>> document will render with garbled text in some browser configurations if
>> the document contains characters from outside the US-ASCII range. The
>> character encoding of the page must be declared in the document or in the
>> transfer protocol. index.html
>>
>> Synchronous XMLHttpRequest on the main thread is deprecated because of
>> its detrimental effects to the end user's experience. For more help
>> http://xhr.spec.whatwg.org/ dojo.js:110:0
>>
>> baseurl#makeUrl called without calling setRoot first. ripcord.js:54:61
>>
>> Use of getAttributeNode() is deprecated. Use getAttribute() instead.
>> dojo.js:175:0
>>
>> This site makes use of a SHA-1 Certificate; it's recommended you use
>> certificates with signature algorithms that use hash functions stronger
>> than SHA-1.[Learn More] opscenter.datastax.com
>>
>>
>> On Wed, Nov 4, 2015 at 3:47 PM, Sebastian Estevez <
>> sebastian.este...@datastax.com> wrote:
>>
>>> Are there any errors in your javascript console?
>>>
>>> All the best,
>>>
>>>
>>> Sebastián Estévez
>>>
>>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>>
>>> On Wed, Nov 4, 2015 at 3:46 PM, Kai Wang <dep...@gmail.com> wrote:
>>>
>>>> No they don't.
>>>>
>>>> On Wed, Nov 4, 2015 at 3:42 PM, Sebastian Estevez <
>>>> sebastian.este...@datastax.com> wrote:
>>>>
>>>>> Do they come back if you restart opscenterd?
>>>>>
>>>>> All the best,
>>>>>
>>>>>
>>>>> Sebastián Estévez
>>>>>
>>>>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>>>>
>>>>> On Wed, Nov 4, 2015 at 3:41 PM, Kai Wang <dep...@gmail.com> wrote:
>>>>>
>>>>>> Forgot to mention. I am running OpsCenter 5.2.2.
>>>>>>
>>>>>> On Wed, Nov 4, 2015 at 3:39 PM, Kai Wang <dep...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Today after one of the nodes is rebooted, OpsCenter dashboard
>>>>>>> doesn't save anymore. It starts with an empty dashboard with no widget 
>>>>>>> or
>>>>>>> graph. If I add some graph/widget, they are being updated fine. But if I
>>>>>>> refresh the browser, the dashboard became empty again.
>>>>>>>
>>>>>>> Also there's no "DEFAULT" tab on the dashboard as the user guide
>>>>>>> shows. I am not sure if it was there before.
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


Re: Cassandra Data Model with Narrow partition

2015-10-30 Thread Kai Wang
Agree with Carlos: you should bucket your key, for example into (pk, day,
hour). Otherwise your partitions are going to be large enough to cause
problems.
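
A minimal sketch of that bucketing, with hypothetical column names layered on
the thread (the 60-day retention is expressed as a table-level default TTL):

CREATE TABLE event_log (
    seq_no varint,
    day text,                -- e.g. '2015-10-30'
    hour int,                -- 0-23
    event_date timestamp,
    payload text,            -- stands in for the ~120 data columns
    PRIMARY KEY ((seq_no, day, hour), event_date)
) WITH default_time_to_live = 5184000;   -- 60 days

Note that with a composite partition key the reads must supply the day and
hour buckets as well, so this fits the sequence-number-plus-date access
pattern better than pure sequence-number lookups.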

On Fri, Oct 30, 2015 at 8:04 AM, Carlos Alonso  wrote:

> Hi Chandra,
>
> Narrow partition is probably your best choice, but you need to bucket data
> somehow, otherwise your partitions will soon become unmanageable and you'll
> have problems reading them, both because the partitions will become very
> big and also because of the tombstones that your expired records will
> generate.
>
> In general having a partition that can grow indefinitely is a bad idea, so
> I'd advise you to use time-based artificial bucketing to limit the maximum
> size of your partitions to be as close as possible to the recommendations.
>
> Also, 120+ columns sounds like quite a lot; is there a way you can separate
> them into different CFs or maybe use collections? I'd advise doing some
> benchmarking here: http://mrcalonso.com/benchmarking-cassandra-models/.
> This post is a bit outdated as nowadays you can use cassandra-stress with
> your own models, but the idea is the same.
>
> About compactions I'd use DTCS or LCS, but given that you will have a big
> amount of tombstones due to TTLs I'd never go with STCS.
>
> Hope it helps!
>
> Carlos Alonso | Software Engineer | @calonso 
>
> On 30 October 2015 at 10:55,  wrote:
>
>> Hi,
>>
>>
>>
>> Could you please suggest if Narrow partition is a  good choice for the
>> below use case.
>>
>>
>>
>> 1)  Write heavy event log table with 50m inserts per day with a peak
>> load of 20K transaction per sec. There aren’t any updates/deletes to
>> records inserted. Records are inserted with a TTL of 60 days (retention
>> period)
>>
>> 2)  The table has a single primary key which is a sequence number
>> (27 digits) generated by source application
>>
>> 3)  There are only two access patterns used – one by using the
>> sequence number & the other using sequence number + event date (range scans
>> also possible)
>>
>> 4)  My target data model in Cassandra is partitioned with sequence
>> number as the primary key + event date as clustering columns to enable
>> range scans on date.
>>
>> 5)  The Table has close to 120+ columns and the average row size
>> comes close to 32K bytes
>>
>> 6)  Reads are very low and account for <5%, while inserts can be
>> close to 95%.
>>
>> 7)  From a functional standpoint, I do not see any other columns
>> that can be part of primary key to keep the partition reasonable (<100MB)
>>
>>
>>
>> Questions:
>>
>> 1)  Is Narrow partition an ideal choice for the above use case.
>>
>> 2)  Is artificial bucketing an alternate choice to make the
>> partition reasonable
>>
>> 3)  We are using varint as the data type for the sequence number, which
>> is 27 digits long. Is DECIMAL a more suitable data type?
>>
>> 4)  Any suggestions on performance impacts during compaction ?
>>
>>
>>
>> Regards, Chandra Sekar KR
>>
>>
>>
>
>


Re: Error Code

2015-10-29 Thread Kai Wang
https://github.com/datastax/python-driver/blob/75ddc514617304797626cc69957eb6008695be1e/cassandra/connection.py#L573

Is your error message complete?

On Thu, Oct 29, 2015 at 9:45 AM, Eduardo Alfaia 
wrote:

> Hi Guys,
>
> Does anyone know what error code in cassandra is?
>
> Error decoding response from Cassandra. opcode: 0008;
>
> Thanks
>


Re: Oracle TIMESTAMP(9) equivalent in Cassandra

2015-10-29 Thread Kai Wang
If you want the timestamp to be generated on the C* side, you need to sync
clocks among the nodes to nanosecond precision first. That alone is probably
hard or impossible. I think the safe bet is to generate the timestamp on the
client side. But depending on your data volume, if data comes from multiple
clients you still need to sync clocks among them.
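
For reference, a minimal sketch of the TimeUUID approach discussed below
(table and column names are made up; now() is evaluated on the coordinator,
so it orders by the coordinator's clock at 100 ns resolution rather than the
client's):

CREATE TABLE inserts_by_source (
    source text,
    inserted_at timeuuid,
    payload text,
    PRIMARY KEY (source, inserted_at)
) WITH CLUSTERING ORDER BY (inserted_at DESC);

INSERT INTO inserts_by_source (source, inserted_at, payload)
VALUES ('feed-1', now(), '...');

-- Range scans over the underlying time use minTimeuuid/maxTimeuuid:
SELECT dateOf(inserted_at), payload FROM inserts_by_source
WHERE source = 'feed-1'
  AND inserted_at > maxTimeuuid('2015-10-29 00:00+0000')
  AND inserted_at < minTimeuuid('2015-10-30 00:00+0000');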


On Thu, Oct 29, 2015 at 7:57 AM,  wrote:

> Hi Doan,
>
>
>
> Is the timeBased() method available in the Java driver similar to the now()
> function in cqlsh? Do both provide identical results?
>
>
>
> Also, the preference is to generate values during record insertion from
> database side, rather than client side. Something similar to SYSTIMESTAMP
> in Oracle.
>
>
>
> Regards, Chandra Sekar KR
>
> *From:* DuyHai Doan [mailto:doanduy...@gmail.com]
> *Sent:* 29/10/2015 5:13 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Oracle TIMESTAMP(9) equivalent in Cassandra
>
>
>
> You can use TimeUUID data type and provide the value yourself from client
> side.
>
>
>
> The Java driver offers an utility class
> com.datastax.driver.core.utils.UUIDs and the method timeBased() to generate
> the TimeUUID.
>
>
>
>  The precision is only guaranteed up to 100 nanoseconds. So you can have
> possibly 10k distinct values per millisecond. For your requirement of 20k
> per sec, it should be enough.
>
>
>
> On Thu, Oct 29, 2015 at 12:10 PM,  wrote:
>
> Hi,
>
>
>
> Oracle's Timestamp data type supports fractional seconds (up to 9 digits, 6
> is the default). What is the Cassandra equivalent data type for Oracle's
> TimeStamp nanosecond precision?
>
>
>
> This is required for determining the order of insertion of records where
> the number of records inserted per sec is close to 20K. Is TIMEUUID an
> alternative that can determine the order of record insertion
> in Cassandra?
>
>
>
> Regards, Chandra Sekar KR
>
>


Re: Need company to support Cassandra on Windows

2015-10-28 Thread Kai Wang
I would start with DataStax. In this year's summit keynote Jonathan Ellis
said C* would start receiving production level support on Windows.

On Tue, Oct 27, 2015 at 9:58 AM, Troy Collinsworth <
troycollinswo...@gmail.com> wrote:

> Searching for a well established company that can provide consulting and
> operations support for private multi-dc production Cassandra cluster on
> Windows OS. New project. OS is hosting mandate.
>
> Troy Collinsworth
> 585-576-8761
>


how to grant permissions to OpsCenter keyspace?

2015-10-26 Thread Kai Wang
Hi,

My understanding is that if I want to enable internal authentication and
authorization on C* while still keeping OpsCenter working, I should grant
all on the OpsCenter keyspace and describe/select on everything else. But when
I try to grant permissions on, or even switch into, OpsCenter, cqlsh reports
this:

cassandra@cqlsh> use OpsCenter;
InvalidRequest: code=2200 [Invalid query] message="Keyspace 'opscenter'
does not exist"

KS OpsCenter of course exists. I notice cqlsh returns the keyspace name in
lower case, but system.schema_keyspaces shows OpsCenter. C* supports
case-sensitive keyspace names in DevCenter. Does cqlsh convert everything
to lower case?

How can I grant permission to OpsCenter? Or a further question, how can I
do case sensitive operations in cqlsh?

Thanks.


Re: how to grant permissions to OpsCenter keyspace?

2015-10-26 Thread Kai Wang
Thanks Adam.

On Mon, Oct 26, 2015 at 5:30 PM, Adam Holmberg <adam.holmb...@datastax.com>
wrote:

> You need to quote the "OpsCenter" identifier to distinguish capital
> letters:
> https://cassandra.apache.org/doc/cql3/CQL.html#identifiers
>
> Adam
>
> On Mon, Oct 26, 2015 at 4:25 PM, Kai Wang <dep...@gmail.com> wrote:
>
>> Hi,
>>
>> My understanding is that if I want to enable internal authentication and
>> authorization on C* while still keeping OpsCenter working, I should grant
>> all to OpsCenter space and describe/select on everything else. But when
>> I try to grant permissions to or even switch into OpsCenter, cqlsh
>> reports this:
>>
>> cassandra@cqlsh> use OpsCenter;
>> InvalidRequest: code=2200 [Invalid query] message="Keyspace 'opscenter'
>> does not exist"
>>
>> KS OpsCenter of course exists. I notice cqlsh returns keyspace name in
>> lower case but in system.schema_keyspaces it shows OpsCenter. C* supports
>> case sensitive keyspace names in DevCenter. But cqlsh converts everything
>> to lower case?
>>
>> How can I grant permission to OpsCenter? Or a further question, how can I
>> do case sensitive operations in cqlsh?
>>
>> Thanks.
>>
>
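
For reference, a minimal sketch of the case-sensitive statements Adam
describes (the user name is hypothetical):

-- Double quotes preserve the capital letters in the keyspace name:
USE "OpsCenter";
GRANT ALL PERMISSIONS ON KEYSPACE "OpsCenter" TO opscenter_user;
GRANT SELECT ON ALL KEYSPACES TO opscenter_user;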
>


timestamp as clustering key doesn't work as expected

2015-10-23 Thread Kai Wang
Hi,

I use a timestamp column as the last clustering key so that I can run queries
like "timestamp > ... AND timestamp < ...". But it doesn't work as
expected. Here is a simplified example.

My table:
CREATE TABLE test (
tag text,
group int,
timestamp timestamp,
value double,
PRIMARY KEY (tag, group, timestamp)
) WITH CLUSTERING ORDER BY (group ASC, timestamp DESC)

After inserting some data, here is my query:

cqlsh> select * from test where tag = 'MSFT' and group = 1 and timestamp
='2004-12-15 16:00:00-0500';

 tag  | group | timestamp| value
--+---+--+---
 MSFT | 1 | 2004-12-15 21:00:00+ | 27.11
 MSFT | 1 | 2004-12-16 21:00:00+ | 27.16
 MSFT | 1 | 2004-12-17 21:00:00+ | 26.96
 MSFT | 1 | 2004-12-20 21:00:00+ | 26.95
 MSFT | 1 | 2004-12-21 21:00:00+ | 27.07
 MSFT | 1 | 2004-12-22 21:00:00+ | 26.98
 MSFT | 1 | 2004-12-23 21:00:00+ | 27.01
 MSFT | 1 | 2004-12-27 21:00:00+ | 26.85
 MSFT | 1 | 2004-12-28 21:00:00+ | 26.95
 MSFT | 1 | 2004-12-29 21:00:00+ |  26.9
 MSFT | 1 | 2004-12-30 21:00:00+ | 26.76
(11 rows)

This doesn't make sense. I expect this query to return only the first row.
Why does it give me back rows with different timestamps? Did I
misunderstand how timestamp and clustering key work?

Thanks.

-Kai


Re: timestamp as clustering key doesn't work as expected

2015-10-23 Thread Kai Wang
Jon,

It's 2.1.10. I will see if I can reproduce it with a simple script.

Thanks.

On Fri, Oct 23, 2015 at 1:05 PM, Jon Haddad <j...@jonhaddad.com> wrote:

> What version of Cassandra?  I can’t think of a reason why you’d see this
> output.  If you can reliably reproduce, this should be filed as a JIRA.
> https://issues.apache.org/jira
>
>
>
> > On Oct 23, 2015, at 8:55 AM, Kai Wang <dep...@gmail.com> wrote:
> >
> > Hi,
> >
> > I use a timestamp column as the last clustering key so that I can run
> query like "timestamp > ... AND timestamp < ...". But it doesn't work as
> expected. Here is a simplified example.
> >
> > My table:
> > CREATE TABLE test (
> > tag text,
> > group int,
> > timestamp timestamp,
> > value double,
> > PRIMARY KEY (tag, group, timestamp)
> > ) WITH CLUSTERING ORDER BY (group ASC, timestamp DESC)
> >
> > After inserting some data, here is my query:
> >
> > cqlsh> select * from test where tag = 'MSFT' and group = 1 and timestamp
> ='2004-12-15 16:00:00-0500';
> >
> >  tag  | group | timestamp| value
> > --+---+--+---
> >  MSFT | 1 | 2004-12-15 21:00:00+ | 27.11
> >  MSFT | 1 | 2004-12-16 21:00:00+ | 27.16
> >  MSFT | 1 | 2004-12-17 21:00:00+ | 26.96
> >  MSFT | 1 | 2004-12-20 21:00:00+ | 26.95
> >  MSFT | 1 | 2004-12-21 21:00:00+ | 27.07
> >  MSFT | 1 | 2004-12-22 21:00:00+ | 26.98
> >  MSFT | 1 | 2004-12-23 21:00:00+ | 27.01
> >  MSFT | 1 | 2004-12-27 21:00:00+ | 26.85
> >  MSFT | 1 | 2004-12-28 21:00:00+ | 26.95
> >  MSFT | 1 | 2004-12-29 21:00:00+ |  26.9
> >  MSFT | 1 | 2004-12-30 21:00:00+ | 26.76
> > (11 rows)
> >
> > This doesn't make sense. I expect this query to return only the first
> row. Why does it give me back rows with different timestamps? Did I
> misunderstand how timestamp and clustering key work?
> >
> > Thanks.
> >
> > -Kai
>
>


Re: timestamp as clustering key doesn't work as expected

2015-10-23 Thread Kai Wang
https://issues.apache.org/jira/browse/CASSANDRA-10583

On Fri, Oct 23, 2015 at 1:26 PM, Kai Wang <dep...@gmail.com> wrote:

> Jon,
>
> It's 2.1.10. I will see if I can reproduce it with a simple script.
>
> Thanks.
>
> On Fri, Oct 23, 2015 at 1:05 PM, Jon Haddad <j...@jonhaddad.com> wrote:
>
>> What version of Cassandra?  I can’t think of a reason why you’d see this
>> output.  If you can reliably reproduce, this should be filed as a JIRA.
>> https://issues.apache.org/jira
>>
>>
>>
>> > On Oct 23, 2015, at 8:55 AM, Kai Wang <dep...@gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > I use a timestamp column as the last clustering key so that I can run
>> query like "timestamp > ... AND timestamp < ...". But it doesn't work as
>> expected. Here is a simplified example.
>> >
>> > My table:
>> > CREATE TABLE test (
>> > tag text,
>> > group int,
>> > timestamp timestamp,
>> > value double,
>> > PRIMARY KEY (tag, group, timestamp)
>> > ) WITH CLUSTERING ORDER BY (group ASC, timestamp DESC)
>> >
>> > After inserting some data, here is my query:
>> >
>> > cqlsh> select * from test where tag = 'MSFT' and group = 1 and
>> timestamp ='2004-12-15 16:00:00-0500';
>> >
>> >  tag  | group | timestamp| value
>> > --+---+--+---
>> >  MSFT | 1 | 2004-12-15 21:00:00+ | 27.11
>> >  MSFT | 1 | 2004-12-16 21:00:00+ | 27.16
>> >  MSFT | 1 | 2004-12-17 21:00:00+ | 26.96
>> >  MSFT | 1 | 2004-12-20 21:00:00+ | 26.95
>> >  MSFT | 1 | 2004-12-21 21:00:00+ | 27.07
>> >  MSFT | 1 | 2004-12-22 21:00:00+ | 26.98
>> >  MSFT | 1 | 2004-12-23 21:00:00+ | 27.01
>> >  MSFT | 1 | 2004-12-27 21:00:00+ | 26.85
>> >  MSFT | 1 | 2004-12-28 21:00:00+ | 26.95
>> >  MSFT | 1 | 2004-12-29 21:00:00+ |  26.9
>> >  MSFT | 1 | 2004-12-30 21:00:00+ | 26.76
>> > (11 rows)
>> >
>> > This doesn't make sense. I expect this query to return only the first
>> row. Why does it give me back rows with different timestamps? Did I
>> misunderstand how timestamp and clustering key work?
>> >
>> > Thanks.
>> >
>> > -Kai
>>
>>
>


C* 2.1.10 failed to start

2015-10-19 Thread Kai Wang
It seems the same as https://issues.apache.org/jira/browse/CASSANDRA-8544.
It started to happen after bulkloading ~100G data and restarting.

Windows 2008 R2, JVM 1.8.0_60. It feels like C* didn't shut down cleanly. Is
there any way to work around this?

Thanks.


Re: C* 2.1.10 failed to start

2015-10-19 Thread Kai Wang
I fixed this by deleting everything in system\compactions_in_progress-
I wonder if there are any side effects of doing this.

On Mon, Oct 19, 2015 at 8:56 AM, Kai Wang <dep...@gmail.com> wrote:

> It seems the same as https://issues.apache.org/jira/browse/CASSANDRA-8544.
> It started to happen after bulkloading ~100G data and restarting.
>
> Windows 2008 R2, JVM 1.8.0_60. It feels like C* didn't shutdown cleanly.
> Is there any way to workaround this?
>
> Thanks.
>


OpsCenter issue with DCE 2.1.9

2015-10-09 Thread Kai Wang
Hi,

OpsCenter/Agent works sporadically for me. I am testing with DCE 2.1.9 on
Win7 x64. I seem to narrow it down to the following log messages.

When it works:
 INFO [Initialization] 2015-10-01 08:49:02,016 New JMX connection (
127.0.0.1:7199)
 ERROR [Initialization] 2015-10-01 08:49:02,344 Error connecting via JMX:
java.rmi.ConnectIOException: Exception creating connection to:
169.254.253.126; nested exception is:
java.net.SocketException: Network is unreachable: connect
  INFO [main] 2015-10-01 08:49:02,359 Reconnecting to a backup OpsCenter
instance

When it doesn't work:
  INFO [Initialization] 2015-10-09 16:57:43,008 New JMX connection (
127.0.0.1:7199)
 ERROR [Initialization] 2015-10-09 16:57:43,010 Error connecting via JMX:
java.rmi.ConnectIOException: Exception creating connection to:
169.254.253.126; nested exception is:
java.net.SocketException: Network is unreachable: connect
  INFO [Initialization] 2015-10-09 16:57:43,010 Sleeping for 20s before
trying to determine IP over JMX again

Where is this IP address 169.254.253.126 coming from? And what is a "backup
OpsCenter instance"?

Thanks.


Re: Timeout error in fetching million rows as results using clustering keys

2015-03-19 Thread Kai Wang
With your reading path and data model, it doesn't matter how many nodes you
have. All data with the same image_caseid is physically located on one node
(well, on RF nodes, but only one of those will try to serve your query).
You are not taking advantage of Cassandra, and you are creating hot spots for
both reads and writes. The first step I would take is to use
image_caseid-Area as the partition key. This breaks the query into small
parallel ones against partitions on different nodes.
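
A minimal sketch of that layout; the column names and types are guesses,
since the thread never shows the actual schema:

CREATE TABLE features_by_tile (
    image_caseid text,
    area int,                -- tile/area bucket within the image
    x double,
    y double,
    data blob,
    PRIMARY KEY ((image_caseid, area), x, y)
);

-- Each tile is a small partition that clients (or Spark tasks) can fetch in parallel:
SELECT * FROM features_by_tile WHERE image_caseid = 'case-7' AND area = 12;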

On Wed, Mar 18, 2015 at 6:12 AM, Mehak Mehta meme...@cs.stonybrook.edu
wrote:

 ya I have cluster total 10 nodes but I am just testing with one node
 currently.
 Total data for all nodes will exceed 5 billion rows. But I may have memory
 on other nodes.

 On Wed, Mar 18, 2015 at 6:06 AM, Ali Akhtar ali.rac...@gmail.com wrote:

 4g also seems small for the kind of load you are trying to handle
 (billions of rows) etc.

 I would also try adding more nodes to the cluster.

 On Wed, Mar 18, 2015 at 2:53 PM, Ali Akhtar ali.rac...@gmail.com wrote:

 Yeah, it may be that the process is being limited by swap. This page:


 https://gist.github.com/aliakhtar/3649e412787034156cbb#file-cassandra-install-sh-L42

 Lines 42 - 48 list a few settings that you could try out for increasing
 / reducing the memory limits (assuming you're on linux).

 Also, are you using an SSD? If so make sure the IO scheduler is noop or
 deadline .

 On Wed, Mar 18, 2015 at 2:48 PM, Mehak Mehta meme...@cs.stonybrook.edu
 wrote:

 Currently Cassandra java process is taking 1% of cpu (total 8% is being
 used) and 14.3% memory (out of total 4G memory).
 As you can see there is not much load from other processes.

 Should I try changing default parameters of memory in Cassandra
 settings.

 On Wed, Mar 18, 2015 at 5:33 AM, Ali Akhtar ali.rac...@gmail.com
 wrote:

 What's your memory / CPU usage at? And how much ram + cpu do you have
 on this server?



 On Wed, Mar 18, 2015 at 2:31 PM, Mehak Mehta 
 meme...@cs.stonybrook.edu wrote:

 Currently there is only single node which I am calling directly with
 around 15 rows. Full data will be in around billions per node.
 The code is working only for size 100/200. Also the consecutive
 fetching is taking around 5-10 secs.

 I have a parallel script which is inserting the data while I am
 reading it. When I stopped the script it worked for 500/1000 but not more
 than that.



 On Wed, Mar 18, 2015 at 5:08 AM, Ali Akhtar ali.rac...@gmail.com
 wrote:

  If even 500-1000 isn't working, then your cassandra node might not
 be up.

 1) Try running nodetool status from shell on your cassandra server,
 make sure the nodes are up.

 2) Are you calling this on the same server where cassandra is
 running? Its trying to connect to localhost . If you're running it on a
 different server, try passing in the direct ip of your cassandra server.

 On Wed, Mar 18, 2015 at 2:05 PM, Mehak Mehta 
 meme...@cs.stonybrook.edu wrote:

 Data won't change much but queries will be different.
 I am not working on the rendering tool myself so I don't know much
 details about it.

 Also as suggested by you I tried to fetch data in size of 500 or
 1000 with java driver auto pagination.
 It fails when the number of records are high (around 10) with
 following error:

 Exception in thread main
 com.datastax.driver.core.exceptions.NoHostAvailableException: All 
 host(s)
 tried for query failed (tried: localhost/127.0.0.1:9042
 (com.datastax.driver.core.exceptions.DriverException: Timed out 
 waiting for
 server response))


 On Wed, Mar 18, 2015 at 4:47 AM, Ali Akhtar ali.rac...@gmail.com
 wrote:

 How often does the data change?

 I would still recommend a caching of some kind, but without
 knowing more details (how often the data is changing, what you're 
 doing
 with the 1m rows after getting them, etc) I can't recommend a 
 solution.

 I did see your other thread. I would also vote for elasticsearch /
 solr , they are more suited for the kind of analytics you seem to be 
 doing.
 Cassandra is more for storing data, it isn't all that great for 
 complex
 queries / analytics.

 If you want to stick to cassandra, you might have better luck if
 you made your range columns part of the primary key, so something like
 PRIMARY KEY(caseId, x, y)

 On Wed, Mar 18, 2015 at 1:41 PM, Mehak Mehta 
 meme...@cs.stonybrook.edu wrote:

 The rendering tool renders a portion a very large image. It may
 fetch different data each time from billions of rows.
 So I don't think I can cache such large results. Since same
 results will rarely fetched again.

 Also do you know how I can do 2d range queries using Cassandra.
 Some other users suggested me using Solr.
 But is there any way I can achieve that without using any other
 technology.

 On Wed, Mar 18, 2015 at 4:33 AM, Ali Akhtar ali.rac...@gmail.com
  wrote:

 Sorry, meant to say that way when you have to render, you can
 just display the latest cache.

 On Wed, Mar 18, 2015 at 1:30 PM, Ali Akhtar 
 ali.rac...@gmail.com wrote:

 

Re: Downgrade Cassandra from 2.1.x to 2.0.x

2015-03-06 Thread Kai Wang
AFAIK downgrading is not officially supported.

How much data do you have? If at all possible I would dump all my data out and
bulk load it into the 2.0.x cluster. That is the only way I would feel safe.
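
A minimal cqlsh sketch of that dump-and-reload route, workable when the data
volume is small enough for CSV export; keyspace, table and file paths are only
examples:

    -- on the 2.1.3 cluster, export each table:
    COPY mykeyspace.mytable TO '/tmp/mytable.csv';

    -- on the freshly built 2.0.12 cluster, after recreating the schema
    -- (DESCRIBE TABLE on the old cluster helps), load it back:
    COPY mykeyspace.mytable FROM '/tmp/mytable.csv';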
On Mar 6, 2015 5:55 AM, Roni Balthazar ronibaltha...@gmail.com wrote:

 Hi there,

 What is the best way to downgrade a C* 2.1.3 cluster to the stable 2.0.12?
 I know it's not supported, but we are getting too many issues with the
 2.1.x...
 It is leading us to think that the best solution is to use the stable
 version.
 Is there a safe way to do that?

 Cheers,

 Roni



Re: Composite Keys in cassandra 1.2

2015-03-03 Thread Kai Wang
This is a tough one. One thing I can think of is to use Spark/Spark SQL to
run ad-hoc queries on C* cluster. You can post on Spark Cassandra
Connector user group.

On Tue, Mar 3, 2015 at 10:18 AM, Yulian Oifa oifa.yul...@gmail.com wrote:

 Hello
 Initially the problem is that the customer wants to have an option for ANY
 query, which does not fit well with NoSQL. However, the size of the data is
 too big for a relational DB.
 There are no typical queries on the data; there are 10 fields, and queries
 should be possible on any combination of them.
 Till now I allowed only single-field queries (in specific cases with 2
 fields), so I had index CFs for each field and that solved the problem.
 Since now I need compound queries, sometimes on 5-6 fields, I either need to
 iterate over an index CF based on some column (or maybe read several indexes
 and find common ids) or create some index that will allow me to read data
 based on any part. Creating an index for each group of fields is of course
 not an option, since the number of indexes would be huge and the disk usage
 would be too big.

 Best regards
 Yulian Oifa

 On Mon, Mar 2, 2015 at 5:33 PM, Kai Wang dep...@gmail.com wrote:

 AFAIK it's not possible. The fact that you need to query the data by a
 partial row key indicates your data model isn't proper. What are your typical
 queries on the data?

 On Sun, Mar 1, 2015 at 7:24 AM, Yulian Oifa oifa.yul...@gmail.com
 wrote:

 Hello to all.
 Let's assume a scenario where the key is a compound type with 3 types in it
 (Long, UTF8, UTF8).
 Each row stores timeuuids as column names and empty values.
 Is it possible to retrieve data by a single key part (for example by the
 Long only) using Java Thrift?

 Best regards
 Yulian Oifa







Re: Composite Keys in cassandra 1.2

2015-03-02 Thread Kai Wang
AFAIK it's not possible. The fact that you need to query the data by a partial
row key indicates your data model isn't proper. What are your typical queries
on the data?
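
For illustration, a CQL sketch (assumed names) of a remodel where the Long
component becomes the partition key, so querying by it alone is a single
partition read; the timeuuid column names become a further clustering column:

    CREATE TABLE events_by_long (
        long_part bigint,
        part_a    text,
        part_b    text,
        event_id  timeuuid,
        PRIMARY KEY (long_part, part_a, part_b, event_id)
    );

    -- query by the Long part only:
    SELECT part_a, part_b, event_id
    FROM   events_by_long
    WHERE  long_part = 12345;

The trade-off is that every (Long, UTF8, UTF8) combination for a given Long
value now lives in one partition, so partition size needs watching.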

On Sun, Mar 1, 2015 at 7:24 AM, Yulian Oifa oifa.yul...@gmail.com wrote:

 Hello to all.
 Let's assume a scenario where the key is a compound type with 3 types in it
 (Long, UTF8, UTF8).
 Each row stores timeuuids as column names and empty values.
 Is it possible to retrieve data by a single key part (for example by the
 Long only) using Java Thrift?

 Best regards
 Yulian Oifa





Re: Data tiered compaction and data model question

2015-02-19 Thread Kai Wang
What's the typical size of the data field? Unless it's very large, I don't
think table 2 is a very wide row (10x20x60x24=288000 events/partition at
worst). Plus you only need to store 30 days of data. The overall data size is
288000x30=8,640,000 events. I am not even sure if you need C*, depending on
the event size.

On Thu, Feb 19, 2015 at 12:00 AM, cass savy casss...@gmail.com wrote:

 10-20 per minute is the average. Worst case can be 10x the avg.

 On Wed, Feb 18, 2015 at 4:49 PM, Mohammed Guller moham...@glassbeam.com
 wrote:

  What is the maximum number of events that you expect in a day? What is
 the worst-case scenario?



 Mohammed



 *From:* cass savy [mailto:casss...@gmail.com]
 *Sent:* Wednesday, February 18, 2015 4:21 PM
 *To:* user@cassandra.apache.org
 *Subject:* Data tiered compaction and data model question



 We want to track events in a log CF/table and should be able to query for
 events that occurred in a range of minutes or hours for a given day. Multiple
 events can occur in a given minute. Listed 2 table designs below and am
 leaning towards table 1 to avoid a large wide row. Please advise on the
 designs.

 Table 1: not a very wide row; still able to query for a range of minutes
 for a given day, and/or a given day and a range of hours

 Create table log_Event
 (
   event_day text,
   event_hr int,
   event_time timeuuid,
   data text,
   PRIMARY KEY ((event_day, event_hr), event_time)
 )

 Table 2: this will be a very wide row

 Create table log_Event
 (
   event_day text,
   event_time timeuuid,
   data text,
   PRIMARY KEY (event_day, event_time)
 )



 Date tiered compaction: recommended for time series data as per the doc
 below. Our data will be kept for only 30 days, hence the thought of using
 this compaction strategy.

 http://www.datastax.com/dev/blog/datetieredcompactionstrategy

 I created table 1 listed above with this compaction strategy, added some
 rows and did a manual flush. I do not see any sstables created yet. Is that
 expected?

  compaction={'max_sstable_age_days': '1', 'class':
 'DateTieredCompactionStrategy'}
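
 For reference, a sketch of table 1 with that compaction setting attached,
 plus one way to query a range of hours within a day (the date literals are
 examples; the IN is needed because event_hr is part of the partition key):

    CREATE TABLE log_event (
        event_day  text,
        event_hr   int,
        event_time timeuuid,
        data       text,
        PRIMARY KEY ((event_day, event_hr), event_time)
    ) WITH compaction = {'class': 'DateTieredCompactionStrategy',
                         'max_sstable_age_days': '1'};

    -- events between 09:00 and 12:00 on a given day:
    SELECT event_time, data
    FROM   log_event
    WHERE  event_day = '2015-02-19'
      AND  event_hr IN (9, 10, 11)
      AND  event_time >= minTimeuuid('2015-02-19 09:00+0000')
      AND  event_time <  minTimeuuid('2015-02-19 12:00+0000');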







Re: How to connect to Opscenter from outside the cloud?

2015-02-17 Thread Kai Wang
You can start from here:
http://www.datastax.com/docs/1.1/references/firewall_ref

By default ops site is hosted at port .

On Tue, Feb 17, 2015 at 12:38 PM, Syed, Basit B. (NSN - FI/Espoo) 
basit.b.s...@nsn.com wrote:

  Hi,
 I have a two-node cluster running on an OpenStack cloud. One of the nodes is
 also running OpsCenter, while both are running datastax-agents.

 How can I use the browser on my Windows machine to connect to this instance
 of OpsCenter? Specifically, which ports should I open in the default security
 group to make it happen?

 Regards,
 Basit

  Datacenter: Cassandra
  =====================
  Status=Up/Down
  |/ State=Normal/Leaving/Joining/Moving
  --  Address      Load      Tokens  Owns   Host ID                               Rack
  UN  192.168.2.6  26.45 MB  1       14.8%  95bbe8a0-942b-4408-b152-88a1c4f4e2de  rack1
  UN  192.168.2.4  9.92 GB   1       85.2%  2527f7e0-e2f6-41d5-b6c1-48d1d922ef8e  rack1






Re: Upgrading from 1.2 to 2.1 questions

2015-02-02 Thread Kai Wang
I would not use 2.1.2 for production yet. It doesn't seem stable enough
based on the feedback I see here. The newest 2.0.12 may be a better option.
On Feb 2, 2015 8:43 AM, Sibbald, Charles charles.sibb...@bskyb.com
wrote:

 Hi Oleg,

 What is the minor version of 1.2? I am looking to do the same for 1.2.14
 in a very large cluster.

 Regards

 Charles


 On 02/02/2015 13:33, Oleg Dulin oleg.du...@gmail.com wrote:

 Dear Distinguished Colleagues:
 
 We'd like to upgrade our cluster from 1.2 to 2.0 and then to 2.1 .
 
 We are using Pelops Thrift client, which has long been abandoned by its
 authors. I've read that 2.x has changes to the Thrift protocol making
 it incompatible with 1.2 (and of course now the link to that site
 eludes me). If that is true, we need to first upgrade our Thrift client
 and then upgrade cassandra.
 
 Let's start by confirming if that indeed is the case -- if that is
 true, I have my work cut out for me.
 
 Does anyone know for sure?
 
 Regards,
 Oleg
 
 

 Information in this email including any attachments may be privileged,
 confidential and is intended exclusively for the addressee. The views
 expressed may not be official policy, but the personal views of the
 originator. If you have received it in error, please notify the sender by
 return e-mail and delete it from your system. You should not reproduce,
 distribute, store, retransmit, use or disclose its contents to anyone.
 Please note we reserve the right to monitor all e-mail communication
 through our internal and external networks. SKY and the SKY marks are
 trademarks of British Sky Broadcasting Group plc and Sky International AG
 and are used under licence. British Sky Broadcasting Limited (Registration
 No. 2906991), Sky-In-Home Service Limited (Registration No. 2067075) and
 Sky Subscribers Services Limited (Registration No. 2340150) are direct or
 indirect subsidiaries of British Sky Broadcasting Group plc (Registration
 No. 2247735). All of the companies mentioned in this paragraph are
 incorporated in England and Wales and share the same registered office at
 Grant Way, Isleworth, Middlesex TW7 5QD.



Re: Is there a way to add a new node to a cluster but not sync old data?

2015-01-22 Thread Kai Wang
At last year's summit there was a presentation from Instaclustr -
https://www.instaclustr.com/meetups/presentation-by-ben-bromhead-at-cassandra-summit-2014-san-francisco/.
It could be the solution you are looking for. However, I don't see the code
being checked in or a JIRA being created. So for now you'd better plan the
capacity carefully.

On Wed, Jan 21, 2015 at 11:21 PM, Yatong Zhang bluefl...@gmail.com wrote:

 Yes, my cluster is almost full and there are lots of pending tasks. You
 helped me a lot and thank you Eric~

 On Thu, Jan 22, 2015 at 11:59 AM, Eric Stevens migh...@gmail.com wrote:

 Yes, bootstrapping a new node will cause read loads on your existing
 nodes - it is becoming the owner and replica of a whole new set of existing
 data.  To do that it needs to know what data it's now responsible for, and
 that's what bootstrapping is for.

 If you're at the point where bootstrapping a new node is placing a
 too-heavy burden on your existing nodes, you may be dangerously close to or
 even past the tipping point where you ought to have already grown your
 cluster.  You need to grow your cluster as soon as possible, and chances
 are you're close to no longer being able to keep up with compaction (see
 nodetool compactionstats, make sure pending tasks is less than 5, preferably 0 or
 1).  Once you're falling behind on compaction, it becomes difficult to
 successfully bootstrap new nodes, and you're in a very tough spot.


 On Wed, Jan 21, 2015 at 7:43 PM, Yatong Zhang bluefl...@gmail.com
 wrote:

 Thanks for the reply. The bootstrap of the new node put a heavy burden on
 the whole cluster and I don't know why. So that's the issue I want to fix,
 actually.

 On Mon, Jan 12, 2015 at 6:08 AM, Eric Stevens migh...@gmail.com wrote:

 Yes, but it won't do what I suspect you're hoping for.  If you disable
 auto_bootstrap in cassandra.yaml the node will join the cluster and will
 not stream any old data from existing nodes.

 The cluster will now be in an inconsistent state.  If you bring enough
 nodes online this way to violate your read consistency level (eg RF=3,
 CL=Quorum, if you bring on 2 nodes this way), some of your queries will be
 missing data that they ought to have returned.

 There is no way to bring a new node online and have it be responsible
 just for new data, and have no responsibility for old data.  It *will* be
 responsible for old data, it just won't *know* about the old data it
 should be responsible for.  Executing a repair will fix this, but only
 because the existing nodes will stream all the missing data to the new
 node.  This will create more pressure on your cluster than just normal
 bootstrapping would have.

 I can't think of any reason you'd want to do that unless you needed to
 grow your cluster really quickly, and were ok with corrupting your old 
 data.
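
 A minimal sketch of what the above looks like in practice, shown only to
 make the mechanics concrete and not as a recommendation:

    # cassandra.yaml on the new node: join without streaming old data
    auto_bootstrap: false

    # afterwards, on the new node, stream in the data it is now responsible
    # for (until this completes, reads can be missing data they should return):
    #   nodetool repair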

 On Sat, Jan 10, 2015 at 12:39 AM, Yatong Zhang bluefl...@gmail.com
 wrote:

 Hi there,

 I am using C* 2.0.10 and I was trying to add a new node to a
 cluster (actually to replace a dead node). But after adding the new node, some
 other nodes in the cluster had a very high workload, which affected the whole
 performance of the cluster.
 So I am wondering: is there a way to add a new node and have this node only
 hold new data?








Re: Versioning in cassandra while indexing ?

2015-01-21 Thread Kai Wang
Depending on your data model, a static column might be useful.
https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-6561
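
A rough sketch of one way a static column could carry the "latest" marker
here; all names are assumptions, and the non-equality IF condition needs
Cassandra 2.0.7 or later:

    CREATE TABLE payload_versions (
        id         text,
        updated_at timestamp,
        body       blob,
        latest_at  timestamp STATIC,   -- stored once per id
        PRIMARY KEY (id, updated_at)
    ) WITH CLUSTERING ORDER BY (updated_at DESC);

    -- advance the per-partition marker only if this payload is newer (the
    -- very first write per id needs separate handling, since latest_at
    -- starts out unset and the condition would not be met):
    UPDATE payload_versions
    SET    latest_at = '2015-01-21 10:00+0000'
    WHERE  id = 'doc-42'
    IF     latest_at < '2015-01-21 10:00+0000';

    -- the newest state of an id is then a single-row read:
    SELECT updated_at, body FROM payload_versions WHERE id = 'doc-42' LIMIT 1;

The conditional marker update can also be wrapped in a batch together with
the insert of the version row itself, since both touch the same partition.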
On Jan 21, 2015 2:56 AM, Pandian R pandian4m...@gmail.com wrote:

 Hi,

 I just wanted to know if there is any kind of versioning system in
 Cassandra while indexing new data (like the one we have for ElasticSearch,
 for example).

 For example, I have a series of payloads, each coming with an id and an
 'updatedAt' timestamp. I just want to maintain the latest state of any
 payload for all the ids, i.e., index the data only if the current payload has
 a greater 'updatedAt' than the previously stored timestamp. I can do this
 with one additional self-lookup, but is there a way to achieve this without
 the overhead of the additional lookup?

 Thanks !

 --
 Regards,
 Pandian



Re: CQL3 vs Thrift

2014-12-24 Thread Kai Wang
Ryan,

Can you elaborate a little on "modeling clustering columns in different
nesting between rows is trivial in Thrift and not really doable in CQL"?
On Dec 24, 2014 8:30 AM, Ryan Svihla rsvi...@datastax.com wrote:

 I'm not entirely certain how you can't model that to solve your use case
 (wouldn't you be filtering the events as well, and therefore be able to get
 all that in one query).

  What you describe there has a number of avenues (collections, just
 heavier use of statics in a different order than you specified, an object dump
 of events in a single column, switching up the clustering columns) for
 getting your question answered in one query. At the end of the day CQL resolves
 to a given SSTable format; you can still open up cassandra-cli and view what a
 given model looks like. Once you've grokked this adequately you can basically
 bend CQL to fit your logical Thrift modeling; at some point, like
 learning any new language, you'll learn to speak in both (something I have
 to do nearly daily).

 FWIW, other than the primary valid complaint remaining for Thrift over CQL,
 which is that modeling clustering columns with different nesting between rows
 is trivial in Thrift and not really doable in CQL (clustering columns enforce
 a nesting order by logical construct), I've yet to not be able to swap a
 client from Thrift to CQL, and it's always ended up faster (so far).

 The main reason for this is that performance on modern Cassandra with the
 native protocol is substantially better than pure Thrift for many query types
 (see http://www.datastax.com/dev/blog/cassandra-2-1-now-over-50-faster), so
 your mileage may vary, but I'd test it out first before proclaiming that
 Thrift is faster for your use case (and make liberal use of CQL features
 with cassandra-cli to make sure you know what's going on internally;
 remember it's all just sstables underneath).




 On Tue, Dec 23, 2014 at 12:00 PM, David Broyles sj.clim...@gmail.com
 wrote:

 Thanks, Ryan.  I wasn't aware of static column support, and indeed they
 get me most of what I need.  I think the only potential inefficiency is
 still at query time.  Using Thrift, I could design the column family to get
 all the static and dynamic content in a single query.
 If event_source and total_events are instead implemented as CQL3 statics,
 I probably need to do two queries to get data for a given event_type:

 To get event metadata (is the LIMIT 1 needed to reduce to 1 record?):
 SELECT event_source, total_events FROM timeseries WHERE event_type =
 'some-type'

 To get the events:
 SELECT insertion_time, event FROM timeseries

 As a combined query, my concern is related to the overhead of repeating
 event_type/source/total_events (although with potentially many other pieces
 of static information).

 More generally, do you find that tuned applications tend to use Thrift, a
 combination of Thrift and CQL3, or is CQL3 really expected to replace
 Thrift?

 Thanks again!
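
 For what it's worth, a sketch of the timeseries table with those two fields
 as statics; they are stored once per partition, so the single query below
 repeats them per row only on the wire, not on disk:

    CREATE TABLE timeseries (
        event_type     text,
        insertion_time timestamp,
        event          blob,
        event_source   text STATIC,
        total_events   int  STATIC,
        PRIMARY KEY (event_type, insertion_time)
    ) WITH CLUSTERING ORDER BY (insertion_time DESC);

    -- statics and events come back together in one query:
    SELECT event_source, total_events, insertion_time, event
    FROM   timeseries
    WHERE  event_type = 'some-type';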

 On Mon, Dec 22, 2014 at 9:50 PM, Ryan Svihla rsvi...@datastax.com
 wrote:

 Don't static columns get you what you want?


 http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/refStaticCol.html
  On Dec 22, 2014 10:50 PM, David Broyles sj.clim...@gmail.com wrote:

 Although I used Cassandra 1.0.X extensively, I'm new to CQL3.  Pages
 such as http://wiki.apache.org/cassandra/ClientOptionsThrift suggest
 new projects should use CQL3.

 I'm wondering, however, if there are certain use cases not well covered
 by CQL3.  Consider the standard timeseries example:

 CREATE TABLE timeseries (
event_type text,
insertion_time timestamp,
event blob,
PRIMARY KEY (event_type, insertion_time)
 ) WITH CLUSTERING ORDER BY (insertion_time DESC);

 What happens if I want to store additional information that is shared
 by all events in the given series (but that I don't want to include in the
 row ID): e.g. the event source, a cached count of the number of events
 logged to date, etc.?  I might try updating the definition as follows:

 CREATE TABLE timeseries (
event_type text,
   event_source text,
total_events int,
insertion_time timestamp,
event blob,
PRIMARY KEY (event_type, event_source, total_events, insertion_time)
 ) WITH CLUSTERING ORDER BY (insertion_time DESC);

 Is this not inefficient?  When inserting or querying via CQL3, say in
 batches of up to 1000 events, won't the type/source/count be repeated 1000
 times?  Please let me know if I'm misunderstanding something, or if I
 should be sticking to Thrift for situations like this involving mixed
 static/dynamic data.

 Thanks!





 --

 [image: datastax_logo.png] http://www.datastax.com/

 Ryan Svihla

 Solution Architect

 [image: twitter.png] https://twitter.com/foundev [image: linkedin.png]
 http://www.linkedin.com/pub/ryan-svihla/12/621/727/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.

Connect to C* instance inside virtualbox

2014-12-22 Thread Kai Wang
I installed C* in virtualbox via vagrant. Both 9160 and 9042 ports are
forwarded from guest to host. I can telnet to those two ports from host to
guest. But from my host, I can't connect to C* using cassandra-cli or
cqlsh. My host is Windows 7 64bit and guest is CentOS 6.5.

Is there anything special about connecting to a C* instance inside
virtualbox?


Re: Connect to C* instance inside virtualbox

2014-12-22 Thread Kai Wang
Ryan,

it works! I saw this new config mentioned in Cassandra summit 2014 but
didn't realize it applied in my case.

Thanks.

On Mon, Dec 22, 2014 at 4:43 PM, Ryan Svihla rsvi...@datastax.com wrote:

 what is rpc_address set to in cassandra.yaml? my gut is localhost, set it
 to the interface that communicates between host and guest.

 On Mon, Dec 22, 2014 at 3:38 PM, Kai Wang dep...@gmail.com wrote:

 I installed C* in virtualbox via vagrant. Both 9160 and 9042 ports are
 forwarded from guest to host. I can telnet to those two ports from host to
 guest. But from my host, I can't connect to C* using cassandra-cli or
 cqlsh. My host is Windows 7 64bit and guest is CentOS 6.5.

 Is there anything special about connecting to a C* instance inside
 virtualbox?




 --

 [image: datastax_logo.png] http://www.datastax.com/

 Ryan Svihla

 Solution Architect

 [image: twitter.png] https://twitter.com/foundev [image: linkedin.png]
 http://www.linkedin.com/pub/ryan-svihla/12/621/727/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology and transactional backbone of choice for the worlds
 most innovative companies such as Netflix, Adobe, Intuit, and eBay.




Re: Connect to C* instance inside virtualbox

2014-12-22 Thread Kai Wang
Ryan,

Actually after I made the change, I was able to connect to C* from host but
not from guest anymore. Is this expected?

On Mon, Dec 22, 2014 at 8:53 PM, Kai Wang dep...@gmail.com wrote:

 Ryan,

 it works! I saw this new config mentioned in Cassandra summit 2014 but
 didn't realize it applied in my case.

 Thanks.

 On Mon, Dec 22, 2014 at 4:43 PM, Ryan Svihla rsvi...@datastax.com wrote:

 what is rpc_address set to in cassandra.yaml? my gut is localhost, set it
 to the interface that communicates between host and guest.

 On Mon, Dec 22, 2014 at 3:38 PM, Kai Wang dep...@gmail.com wrote:

 I installed C* in virtualbox via vagrant. Both 9160 and 9042 ports are
 forwarded from guest to host. I can telnet to those two ports from host to
 guest. But from my host, I can't connect to C* using cassandra-cli or
 cqlsh. My host is Windows 7 64bit and guest is CentOS 6.5.

 Is there anything special about connecting to a C* instance inside
 virtualbox?




 --

 [image: datastax_logo.png] http://www.datastax.com/

 Ryan Svihla

 Solution Architect

 [image: twitter.png] https://twitter.com/foundev [image: linkedin.png]
 http://www.linkedin.com/pub/ryan-svihla/12/621/727/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology and transactional backbone of choice for the worlds
 most innovative companies such as Netflix, Adobe, Intuit, and eBay.





Re: Connect to C* instance inside virtualbox

2014-12-22 Thread Kai Wang
on the guest where C* is installed, I run cqlsh without any argument. When
I enabled rpc_interface, cqlsh returned can't connect 127.0.0.1:9042.

On Mon, Dec 22, 2014 at 9:01 PM, Ryan Svihla rsvi...@datastax.com wrote:

 Totally depends on how the implementation is handled in VirtualBox. I'm
 assuming you're connecting to an IP that makes sense on the guest (i.e.
 nodetool -h 192.168.1.100 and cqlsh 192.168.1.100, replacing that IP with
 whatever you expect)?

 On Mon, Dec 22, 2014 at 7:58 PM, Kai Wang dep...@gmail.com wrote:

 Ryan,

 Actually after I made the change, I was able to connect to C* from host
 but not from guest anymore. Is this expected?

 On Mon, Dec 22, 2014 at 8:53 PM, Kai Wang dep...@gmail.com wrote:

 Ryan,

 it works! I saw this new config mentioned in Cassandra summit 2014 but
 didn't realize it applied in my case.

 Thanks.

 On Mon, Dec 22, 2014 at 4:43 PM, Ryan Svihla rsvi...@datastax.com
 wrote:

 what is rpc_address set to in cassandra.yaml? my gut is localhost, set
 it to the interface that communicates between host and guest.

 On Mon, Dec 22, 2014 at 3:38 PM, Kai Wang dep...@gmail.com wrote:

 I installed C* in virtualbox via vagrant. Both 9160 and 9042 ports are
 forwarded from guest to host. I can telnet to those two ports from host to
 guest. But from my host, I can't connect to C* using cassandra-cli or
 cqlsh. My host is Windows 7 64bit and guest is CentOS 6.5.

 Is there anything special about connecting to a C* instance inside
 virtualbox?




 --

 [image: datastax_logo.png] http://www.datastax.com/

 Ryan Svihla

 Solution Architect

 [image: twitter.png] https://twitter.com/foundev [image:
 linkedin.png] http://www.linkedin.com/pub/ryan-svihla/12/621/727/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology and transactional backbone of choice for the worlds
 most innovative companies such as Netflix, Adobe, Intuit, and eBay.






 --

 [image: datastax_logo.png] http://www.datastax.com/

 Ryan Svihla

 Solution Architect

 [image: twitter.png] https://twitter.com/foundev [image: linkedin.png]
 http://www.linkedin.com/pub/ryan-svihla/12/621/727/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology and transactional backbone of choice for the worlds
 most innovative companies such as Netflix, Adobe, Intuit, and eBay.




Re: Connect to C* instance inside virtualbox

2014-12-22 Thread Kai Wang
I might have misread the comment, but I thought I could only set rpc_interface
or rpc_address but not both. So I didn't set rpc_address. Will double check
tomorrow. Thanks.
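
For reference, a minimal cassandra.yaml sketch of the two alternatives (the
address is only an example of a guest-side interface IP):

    # either bind an explicit address...
    rpc_address: 192.168.56.10
    # ...or, on 2.1+, bind by interface instead; set one of the two, not both:
    # rpc_interface: eth1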
On Dec 22, 2014 9:17 PM, Ryan Svihla rsvi...@datastax.com wrote:

 if this helps..what did you change rpc_address to?

 On Mon, Dec 22, 2014 at 8:15 PM, Ryan Svihla rsvi...@datastax.com wrote:

 Right, that's localhost; you have to change it to match the IP that you
 changed rpc_address to.

 On Mon, Dec 22, 2014 at 8:07 PM, Kai Wang dep...@gmail.com wrote:

 on the guest where C* is installed, I run cqlsh without any argument.
 When I enabled rpc_interface, cqlsh returned can't connect
 127.0.0.1:9042.

 On Mon, Dec 22, 2014 at 9:01 PM, Ryan Svihla rsvi...@datastax.com
 wrote:

 totally depends on how the implementation is handled in virtualbox, I'm
 assuming you're connecting to an IP that makes sense on the guest (ie
 nodetool -h 192.168.1.100 and cqlsh 192.168.1.100, replace that ip with
 whatever what you expect)?

 On Mon, Dec 22, 2014 at 7:58 PM, Kai Wang dep...@gmail.com wrote:

 Ryan,

 Actually after I made the change, I was able to connect to C* from
 host but not from guest anymore. Is this expected?

 On Mon, Dec 22, 2014 at 8:53 PM, Kai Wang dep...@gmail.com wrote:

 Ryan,

 it works! I saw this new config mentioned in Cassandra summit 2014
 but didn't realize it applied in my case.

 Thanks.

 On Mon, Dec 22, 2014 at 4:43 PM, Ryan Svihla rsvi...@datastax.com
 wrote:

 what is rpc_address set to in cassandra.yaml? my gut is localhost,
 set it to the interface that communicates between host and guest.

 On Mon, Dec 22, 2014 at 3:38 PM, Kai Wang dep...@gmail.com wrote:

 I installed C* in virtualbox via vagrant. Both 9160 and 9042 ports
 are forwarded from guest to host. I can telnet to those two ports from 
 host
 to guest. But from my host, I can't connect to C* using cassandra-cli 
 or
 cqlsh. My host is Windows 7 64bit and guest is CentOS 6.5.

 Is there anything special about connecting to a C* instance inside
 virtualbox?




 --

 [image: datastax_logo.png] http://www.datastax.com/

 Ryan Svihla

 Solution Architect

 [image: twitter.png] https://twitter.com/foundev [image:
 linkedin.png] http://www.linkedin.com/pub/ryan-svihla/12/621/727/

 DataStax is the fastest, most scalable distributed database
 technology, delivering Apache Cassandra to the world’s most innovative
 enterprises. Datastax is built to be agile, always-on, and predictably
 scalable to any size. With more than 500 customers in 45 countries, 
 DataStax
 is the database technology and transactional backbone of choice for the
 worlds most innovative companies such as Netflix, Adobe, Intuit, and 
 eBay.






 --

 [image: datastax_logo.png] http://www.datastax.com/

 Ryan Svihla

 Solution Architect

 [image: twitter.png] https://twitter.com/foundev [image:
 linkedin.png] http://www.linkedin.com/pub/ryan-svihla/12/621/727/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology and transactional backbone of choice for the worlds
 most innovative companies such as Netflix, Adobe, Intuit, and eBay.





 --

 [image: datastax_logo.png] http://www.datastax.com/

 Ryan Svihla

 Solution Architect

 [image: twitter.png] https://twitter.com/foundev [image: linkedin.png]
 http://www.linkedin.com/pub/ryan-svihla/12/621/727/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology and transactional backbone of choice for the worlds
 most innovative companies such as Netflix, Adobe, Intuit, and eBay.




 --

 [image: datastax_logo.png] http://www.datastax.com/

 Ryan Svihla

 Solution Architect

 [image: twitter.png] https://twitter.com/foundev [image: linkedin.png]
 http://www.linkedin.com/pub/ryan-svihla/12/621/727/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology and transactional backbone of choice for the worlds
 most innovative companies such as Netflix, Adobe, Intuit, and eBay.




Re: Replacing nodes disks

2014-12-18 Thread Kai Wang
Do you have to replace those disks? Can you simply add new disks to those
nodes and configure C* to use JBOD?
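
A minimal cassandra.yaml sketch of such a JBOD layout; the paths are examples,
with one entry per physical disk:

    data_file_directories:
        - /var/lib/cassandra/data1
        - /var/lib/cassandra/data2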
On Dec 18, 2014 10:18 AM, Or Sher or.sh...@gmail.com wrote:

 Hi all,

 We have a situation where some of our nodes have smaller disks and we
 would like to align all nodes by replacing the smaller disks with bigger ones
 without replacing the nodes.
 We don't have enough space to put the data on the / disk and copy it back to
 the bigger disks, so we would like to rebuild the nodes' data from other
 replicas.

 What do you think should be the procedure here?

 I'm guessing it should be something like this but I'm pretty sure it's not
 enough.
 1. shutdown C* node and server.
 2. replace disks + create the same vg lv etc.
 3. start C* (Normally?)
 4. nodetool repair/rebuild?
 *I think I might get some consistency issues for use cases relying on
 Quorum reads and writes for strong consistency.
 What do you say?

 Another question (and I know it depends on many factors, but I'd like to
 hear an experienced estimate): how much time would it take to rebuild a 250G
 data node?

 Thanks in advance,
 Or.

 --
 Or Sher



Re: How to model data to achieve specific data locality

2014-12-09 Thread Kai Wang
Some of the sequences grow so fast that sub-partitioning is inevitable. I may
need to try different bucket sizes to get the optimal throughput. Thank you
all for the advice.
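
A sketch of the bucketed model being discussed, with assumed names and an
assumed integer seq_type; the bucket width of 100 is only an example of
something to tune:

    CREATE TABLE sequences (
        seq_id   text,
        bucket   int,    -- e.g. seq_type / 100, computed by the writer
        seq_type int,
        body     blob,
        PRIMARY KEY ((seq_id, bucket), seq_type)
    );

    -- all seq_types in one bucket of a sequence:
    SELECT seq_type, body FROM sequences WHERE seq_id = 'abc' AND bucket = 3;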

On Mon, Dec 8, 2014 at 9:55 AM, Eric Stevens migh...@gmail.com wrote:

 The upper bound for the data size of a single column is 2GB, and the upper
 bound for the number of columns in a row (partition) is 2 billion.  So if
 you wanted to create the largest possible row, you probably can't afford
 enough disks to hold it.
 http://wiki.apache.org/cassandra/CassandraLimitations

 Practically speaking you start running into troubles *way* before you
 reach those thresholds though.  Large columns and large numbers of columns
 create GC pressure in your cluster, and since all data for a given row
 reside on the same primary and replicas, this tends to lead to hot
 spotting.  Repair happens for entire rows, so large rows increase the cost
 of repairs, including GC pressure during the repair.  And rows of this size
 are often arrived at by appending to the same row repeatedly, which will
 cause the data for that row to be scattered across a large number of
 SSTables which will hurt read performance. Also depending on your
 interface, you'll find you start hitting limits that you have to increase,
 each with their own implications (eg, maximum thrift message sizes and so
 forth).  The right maximum practical size for a row definitely depends on
 your read and write patterns, as well as your hardware and network.  More
 memory, SSD's, larger SSTables, and faster networks will all raise the
 ceiling for where large rows start to become painful.

 @Kai, if you're familiar with the Thrift paradigm, the partition key
 equates to a Thrift row key, and the clustering key equates to the first
 part of a composite column name.  CQL PRIMARY KEY ((a,b), c, d) equates to
 Thrift where row key is ['a:b'] and all columns begin with ['c:d:'].
 Recommended reading: http://www.datastax.com/dev/blog/thrift-to-cql3

 Whatever your partition key, if you need to sub-partition to maintain
 reasonable row sizes, then the only way to preserve data locality for
 related records is probably to switch to byte ordered partitioner, and
 compute blob or long column as part of your partition key that is meant to
 cause the PK to to map to the same token.  Just be aware that byte ordered
 partitioner comes with a number of caveats, and you'll become responsible
 for maintaining good data load distributions in your cluster. But the
 benefits from being able to tune locality may be worth it.


 On Sun Dec 07 2014 at 3:12:11 PM Jonathan Haddad j...@jonhaddad.com
 wrote:

  I think he mentioned 100MB as the max size - planning for 1 MB might make
  your data model difficult to work with.

 On Sun Dec 07 2014 at 12:07:47 PM Kai Wang dep...@gmail.com wrote:

 Thanks for the help. I wasn't clear on how clustering columns work. Coming
 from a Thrift background, it took me a while to understand how clustering
 columns impact partition storage on disk. Now I believe using seq_type as
 the first clustering column solves my problem. As for partition size, I will
 start with some bucket assumption. If the partition size exceeds the
 threshold I may need to re-bucket using a smaller bucket size.

 On another thread Eric mentions the optimal partition size should be around
 100 KB ~ 1 MB. I will use that as the starting point to design my bucket
 strategy.


 On Sun, Dec 7, 2014 at 10:32 AM, Jack Krupansky j...@basetechnology.com
  wrote:

   It would be helpful to look at some specific examples of sequences,
 showing how they grow. I suspect that the term “sequence” is being
 overloaded in some subtly misleading way here.

 Besides, we’ve already answered the headline question – data locality
 is achieved by having a common partition key. So, we need some clarity as
 to what question we are really focusing on

 And, of course, we should be asking the “Cassandra Data Modeling 101”
 question of what do your queries want to look like, how exactly do you want
 to access your data. Only after we have a handle on how you need to read
 your data can we decide how it should be stored.

 My immediate question to get things back on track: When you say “The
 typical read is to load a subset of sequences with the same seq_id”,
 what type of “subset” are you talking about? Again, a few explicit and
 concise example queries (in some concise, easy to read pseudo language or
 even plain English, but not belabored with full CQL syntax.) would be very
 helpful. I mean, Cassandra has no “subset” concept, nor a “load subset”
 command, so what are we really talking about?

 Also, I presume we are talking CQL, but some of the references seem
 more Thrift/slice oriented.

 -- Jack Krupansky

  *From:* Eric Stevens migh...@gmail.com
 *Sent:* Sunday, December 7, 2014 10:12 AM
 *To:* user@cassandra.apache.org
 *Subject:* Re: How to model data to achieve specific data locality

  Also new seq_types can be added and old seq_types can

Re: How to model data to achieve specific data locality

2014-12-07 Thread Kai Wang
 as your clustering column, you can
 have many of them using the same table structure ...





 On Sat, Dec 6, 2014 at 10:09 PM, Kai Wang dep...@gmail.com wrote:

   On Sat, Dec 6, 2014 at 11:18 AM, Eric Stevens migh...@gmail.com
 wrote:

 It depends on the size of your data, but if your data is reasonably
 small, there should be no trouble including thousands of records on the
 same partition key.  So a data model using PRIMARY KEY ((seq_id), seq_type)
 ought to work fine.

 If the data size per partition exceeds some threshold that represents
 the right tradeoff of increasing repair cost, gc pressure, threatening
 unbalanced loads, and other issues that come with wide partitions, then you
 can subpartition via some means in a manner consistent with your work load,
 with something like PRIMARY KEY ((seq_id, subpartition), seq_type).

 For example, if seq_type can be processed for a given seq_id in any
 order, and you need to be able to locate specific records for a known
 seq_id/seq_type pair, you can compute the subpartition
 deterministically.  Or if you only ever need to read *all* values for
 a given seq_id, and the processing order is not important, just randomly
 generate a value for subpartition at write time, as long as you can know
 all possible values for subpartition.

 If the values for the seq_types for a given seq_id must always be
 processed in order based on seq_type, then your subpartition calculation
 would need to reflect that and place adjacent seq_types in the same
 partition.  As a contrived example, say seq_type was an incrementing
 integer, your subpartition could be seq_type / 100.

 On Fri Dec 05 2014 at 7:34:38 PM Kai Wang dep...@gmail.com wrote:

  I have a data model question. I am trying to figure out how to model
 the data to achieve the best data locality for analytic purpose. Our
 application processes sequences. Each sequence has a unique key in the
 format of [seq_id]_[seq_type]. For any given seq_id, there are unlimited
 number of seq_types. The typical read is to load a subset of sequences 
 with
 the same seq_id. Naturally I would like to have all the sequences with the
 same seq_id to co-locate on the same node(s).



 However I can't simply create one partition per seq_id and use seq_id
 as my partition key. That's because:



 1. there could be thousands or even more seq_types for each seq_id.
 It's not feasible to include all the seq_types into one table.

 2. each seq_id might have different sets of seq_types.

 3. each application only needs to access a subset of seq_types for a
 seq_id. Based on CASSANDRA-5762, select partial row loads the whole row. I
 prefer only touching the data that's needed.



 As per above, I think I should use one partition per
 [seq_id]_[seq_type]. But how can I achieve the data locality on seq_id? One
 One
 possible approach is to override IPartitioner so that I just use part of
 the field (say 64 bytes) to get the token (for location) while still using
 the whole field as partition key (for look up). But before heading that
 direction, I would like to see if there are better options out there. 
 Maybe
 any new or upcoming features in C* 3.0?



 Thanks.


 Thanks, Eric.

 Those sequences are not fixed. All sequences with the same seq_id tend
 to grow at the same rate. If it's one partition per seq_id, the size will
 most likely exceed the threshold quickly. Also new seq_types can be added
 and old seq_types can be deleted. This means I often need to ALTER TABLE to
 add and drop columns. I am not sure if this is a good practice from an
 operations point of view.

 I thought about your subpartition idea. If there are only a few
 applications and each one of them uses a subset of seq_types, I can easily
 create one table per application since I can compute the subpartition
 deterministically as you said. But in my case data scientists need to
 easily write new applications using any combination of seq_types of a
 seq_id. So I want the data model to be flexible enough to support
 applications using any different set of seq_types without creating new
 tables, duplicate all the data etc.

 -Kai







Re: How to model data to achieve specific data locality

2014-12-06 Thread Kai Wang
On Sat, Dec 6, 2014 at 11:18 AM, Eric Stevens migh...@gmail.com wrote:

 It depends on the size of your data, but if your data is reasonably small,
 there should be no trouble including thousands of records on the same
 partition key.  So a data model using PRIMARY KEY ((seq_id), seq_type)
 ought to work fine.

 If the data size per partition exceeds some threshold that represents the
 right tradeoff of increasing repair cost, gc pressure, threatening
 unbalanced loads, and other issues that come with wide partitions, then you
 can subpartition via some means in a manner consistent with your work load,
 with something like PRIMARY KEY ((seq_id, subpartition), seq_type).

 For example, if seq_type can be processed for a given seq_id in any order,
 and you need to be able to locate specific records for a known
 seq_id/seq_type pair, you can compute the subpartition
 deterministically.  Or if you only ever need to read *all* values for a
 given seq_id, and the processing order is not important, just randomly
 generate a value for subpartition at write time, as long as you can know
 all possible values for subpartition.

 If the values for the seq_types for a given seq_id must always be
 processed in order based on seq_type, then your subpartition calculation
 would need to reflect that and place adjacent seq_types in the same
 partition.  As a contrived example, say seq_type was an incrementing
 integer, your subpartition could be seq_type / 100.

 On Fri Dec 05 2014 at 7:34:38 PM Kai Wang dep...@gmail.com wrote:

 I have a data model question. I am trying to figure out how to model the
 data to achieve the best data locality for analytic purpose. Our
 application processes sequences. Each sequence has a unique key in the
 format of [seq_id]_[seq_type]. For any given seq_id, there are unlimited
 number of seq_types. The typical read is to load a subset of sequences with
 the same seq_id. Naturally I would like to have all the sequences with the
 same seq_id to co-locate on the same node(s).


 However I can't simply create one partition per seq_id and use seq_id as
 my partition key. That's because:


 1. there could be thousands or even more seq_types for each seq_id. It's
 not feasible to include all the seq_types into one table.

 2. each seq_id might have different sets of seq_types.

 3. each application only needs to access a subset of seq_types for a
 seq_id. Based on CASSANDRA-5762, select partial row loads the whole row. I
 prefer only touching the data that's needed.


 As per above, I think I should use one partition per [seq_id]_[seq_type].
 But how can I achieve the data locality on seq_id? One possible approach is
 to override IPartitioner so that I just use part of the field (say 64
 bytes) to get the token (for location) while still using the whole field as
 partition key (for look up). But before heading that direction, I would
 like to see if there are better options out there. Maybe any new or
 upcoming features in C* 3.0?


 Thanks.


Thanks, Eric.

Those sequences are not fixed. All sequences with the same seq_id tend to
grow at the same rate. If it's one partition per seq_id, the size will most
likely exceed the threshold quickly. Also new seq_types can be added and
old seq_types can be deleted. This means I often need to ALTER TABLE to add
and drop columns. I am not sure if this is a good practice from an operations
point of view.

I thought about your subpartition idea. If there are only a few
applications and each one of them uses a subset of seq_types, I can easily
create one table per application since I can compute the subpartition
deterministically as you said. But in my case data scientists need to
easily write new applications using any combination of seq_types of a
seq_id. So I want the data model to be flexible enough to support
applications using any different set of seq_types without creating new
tables, duplicate all the data etc.

-Kai


How to model data to achieve specific data locality

2014-12-05 Thread Kai Wang
I have a data model question. I am trying to figure out how to model the
data to achieve the best data locality for analytic purpose. Our
application processes sequences. Each sequence has a unique key in the
format of [seq_id]_[seq_type]. For any given seq_id, there are unlimited
number of seq_types. The typical read is to load a subset of sequences with
the same seq_id. Naturally I would like to have all the sequences with the
same seq_id to co-locate on the same node(s).


However I can't simply create one partition per seq_id and use seq_id as my
partition key. That's because:


1. there could be thousands or even more seq_types for each seq_id. It's
not feasible to include all the seq_types into one table.

2. each seq_id might have different sets of seq_types.

3. each application only needs to access a subset of seq_types for a
seq_id. Based on CASSANDRA-5762, select partial row loads the whole row. I
prefer only touching the data that's needed.


As per above, I think I should use one partition per [seq_id]_[seq_type].
But how can I achieve the data locality on seq_id? One possible approach is
to override IPartitioner so that I just use part of the field (say 64
bytes) to get the token (for location) while still using the whole field as
partition key (for look up). But before heading that direction, I would
like to see if there are better options out there. Maybe any new or
upcoming features in C* 3.0?


Thanks.


Re: Keyspace and table/cf limits

2014-12-05 Thread Kai Wang
On Fri, Dec 5, 2014 at 4:32 PM, Robert Coli rc...@eventbrite.com wrote:

 On Wed, Dec 3, 2014 at 1:54 PM, Raj N raj.cassan...@gmail.com wrote:

 The question is more from a multi-tenancy point of view. We wanted to see
 if we can have a keyspace per client. Each keyspace may have 50 column
 families, but if we have 200 clients, that would be 10,000 column families.
 Do you think that's reasonable to support? I know that key cache capacity
 is reserved in heap still. Any plans to move it off-heap?


 That's an order of magnitude more CFs than I would want to try to operate.

 But then, I wouldn't want to operate Cassandra multi-tenant AT ALL, so
 grain of salt.

 =Rob
 http://twitter.com/rcolidba


I don't know if it's still true, but Jonathan Ellis wrote in an old post that
there's a fixed overhead per CF. Here is the link:
http://dba.stackexchange.com/a/12413. Even if it has improved since C* 1.0, I
still don't feel comfortable scaling my system by creating CFs.

