RE: rebuild constantly fails, 3.11

2017-08-10 Thread Bob Dourandish
I don't know if this is going to help or not, but we had a disk that was
going bad and kept developing bad blocks, requiring RAID repair. As near as
we could guess, that caused "some sort of timing issue" for nodetool -
we were never able to reproduce the exact failure on demand. The clue
came when the same exact process worked on a different setup that did not
use the same disk subsystem. Once we isolated and replaced the degrading
disk, the problem went away.

Beyond what Jeff suggested, you may want to look at any logs your raid
produces for potential clues. Also, if it is always failing when sending the
same file, you might want to review what is going on there.
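One way to check for this is to look for disk trouble at the OS level before digging into Cassandra itself. The commands below are only a sketch - the device names (/dev/sda, /dev/md0) are placeholders for whatever your RAID actually uses, and smartctl/mdadm may need installing:

```shell
# Look for I/O errors from a failing disk in the kernel ring buffer.
dmesg | grep -iE 'i/o error|ata[0-9]+|medium error|sector' | tail -n 20

# SMART health summary for each physical disk (smartmontools package).
smartctl -H /dev/sda

# Linux software RAID state, if applicable: [UU] means healthy, [U_] degraded.
cat /proc/mdstat
mdadm --detail /dev/md0
```

A disk that is failing intermittently can pass a quick SMART health check, so the dmesg history is often the more telling signal.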

I know this is a fairly "general" response, but I hope it gives you some ideas.

Good luck!

Bob

-Original Message-
From: Jeff Jirsa [mailto:jji...@apache.org] 
Sent: Thursday, August 10, 2017 7:26 PM
To: dev@cassandra.apache.org
Subject: Re: rebuild constantly fails, 3.11



On 2017-08-08 01:00 (-0700), Micha  wrote: 
> Hi,
> 
> it seems I'm not able to add a 3 node dc to a 3 node dc. After 
> starting the rebuild on a new node, nodetool netstats shows it will 
> receive 1200 files from node-1 and 5000 from node-2. The stream from
> node-1 completes but the stream from node-2 always fails, after 
> sending ca. 4000 files.
> 
> After restarting the rebuild it again starts to send the 5000 files.
> The whole cluster is connected via one switch only, no firewall 
> between, and the network shows no errors.
> The machines have 8 cores, 32GB RAM and two 1TB disks as raid0.
> The logs show no errors. The size of the data is ca. 1TB.

Is there anything in `dmesg`? System logs? Nothing? Is node2 running? Is
node3 running?
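In case it helps, a sketch of where to look for those answers; the log path assumes a package install under /var/log/cassandra, so adjust for your layout:

```shell
# Streaming progress per peer; filtering out completed files surfaces stuck ones.
nodetool netstats | grep -v "100%"

# Stream/session failures in the Cassandra log (check sender AND receiver).
grep -iE 'stream.*(fail|error)|SocketTimeout' /var/log/cassandra/system.log | tail -n 20

# Kernel-level trouble: disk errors, OOM killer, NIC resets.
dmesg | tail -n 50
```

If the failure always happens at roughly the same point, comparing timestamps between node-2's log and the rebuilding node's log usually shows which side dropped the session first.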

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org






RE: Cassandra on RocksDB experiment result

2017-04-19 Thread Bob Dourandish
There is probably something I missed in the message below.

In my experience with pluggable storage engines (in the MySQL world), the 
engine manages all storage that it "owns." The higher tiers of the architecture 
don't need to get involved unless multiple storage engines have to coordinate 
compaction (or similar) operations across the entire database - that is, when every 
storage engine has read/write access to every piece of data, even data owned 
by another storage engine.

I don't know enough about Cassandra internals to have an opinion on whether 
the above scenario makes sense in the Cassandra context. But "sharing" 
(processes or data) between storage engines gets pretty hairy and easily 
deadlock-prone (!), even in something as relatively straightforward as MySQL.

So this could be a way cool project and I'd love to get involved if it gets 
off the ground.

Bob


-Original Message-
From: DuyHai Doan [mailto:doanduy...@gmail.com] 
Sent: Wednesday, April 19, 2017 3:33 PM
To: dev@cassandra.apache.org
Subject: Re: Cassandra on RocksDB experiment result

"I have no clue what it would take to accomplish a pluggable storage engine, 
but I love this idea."

This is a long-standing debate we have had several times in the past. One of the 
difficulties of a pluggable storage engine is that we need to manage the 
differences between the LSMT of native C* and the RocksDB engine for compaction, 
repair, streaming etc...

Right now all the compaction strategies share the assumption that the data 
structure and layout on disk are fixed. With a pluggable storage engine, we would 
need to special-case each compaction strategy (or at least the abstract base class 
of compaction strategy) for each engine.

The current approach is one storage engine with many compaction strategies for 
different use cases (TWCS for time series, LCS for update-heavy workloads...).

With pluggable storage engines, we would have a matrix of storage engines x 
compaction strategies.

And that is not even mentioning the other operations to handle, like streaming and 
repair.

Another question that arises is: will the storage engine run in the same JVM 
as the C* server, or as a separate process? For the latter, we're opening the 
door to yet-another-distributed-system complexity. For instance, how will the C* JVM 
communicate with the storage engine process?
How do we handle failure, crash, resume etc ...

That being said, if we eventually manage to get the code base to that stage, 
it would be super cool!

On Wed, Apr 19, 2017 at 12:03 PM, Salih Gedik  wrote:

> Hi Dikang,
>
> I guess there is something wrong with the link that you shared.
>
>
> 19.04.2017 19:21 tarihinde Dikang Gu yazdı:
>
> Hi Cassandra developers,
>>
>> This is Dikang from Instagram. I'd like to share with you some experiment 
>> results we produced recently using RocksDB as Cassandra's storage 
>> engine. In the experiment, I built a prototype integrating Cassandra 
>> 3.0.12 and RocksDB for a single-column (key-value) use case, shadowed 
>> one of our production use cases, and saw about a 4-6X P99 read latency 
>> drop during peak time, compared to 3.0.12. The P99 latency also 
>> became more predictable.
>>
>> Here is detailed note with more metrics:
>>
>> https://docs.google.com/document/d/1Ztqcu8Jzh4USKoWBgDJQw82DBurQmsV-PmfiJYvu_Dc/edit?usp=sharing
>>
>> Please take a look and let me know your thoughts. I think the biggest 
>> latency win comes from getting rid of most of the Java garbage created by 
>> the current read/write path and compactions, which reduces the JVM 
>> overhead and makes the latency more predictable.
>>
>> We are very excited about the potential performance gain. As the next 
>> step, I propose making the Cassandra storage engine pluggable 
>> (like MySQL and MongoDB), and we are very interested in providing 
>> RocksDB as one storage option with more predictable performance, 
>> together with the community.
>>
>> Thanks.
>>
>>
>



RE: DataStax Client List

2017-03-23 Thread Bob Dourandish
Daemeon:

Please have this conversation DIRECTLY with the spammer, instead of replying to 
the entire list. I don't need to waste any more time on this, nor do I care to 
know the details of your transaction with these people.

Bob

-Original Message-
From: daemeon reiydelle [mailto:daeme...@gmail.com] 
Sent: Thursday, March 23, 2017 1:46 PM
To: dev@cassandra.apache.org
Subject: Re: DataStax Client List

Hi Theresa,

While some may be fussing at this, I am not concerned.

I AM interested in something of the sort, which would be a list of contacts who 
are CTO, CIO, etc. using big data. Just Hadoop (Datastax) is fine, or those 
using other big data providers would be of interest.

What are you looking to charge?

FYI, my goal is to get connected with the resources that provide CIO/CTO level 
headhunting to these CIO/CTO's. Thoughts?





*Daemeon C.M. Reiydelle*
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Thu, Mar 23, 2017 at 9:52 AM, Edward Capriolo 
wrote:

> Well that is quite unsettling.
>
> On Thu, Mar 23, 2017 at 10:33 AM, Theresa Taylor < 
> theresa.tay...@onlinedatatech.biz> wrote:
>
> > Hi,
> >
> > Would you be interested in acquiring a list of DataStax users'
> information
> > in an Excel sheet for unlimited marketing usage?
> >
> > List includes – First and Last name, Phone number, Email Address, 
> > Company Name, Job Title, Address, City, State, Zip, SIC 
> > code/Industry, Revenue
> and
> > Company Size. The leads can also be further customized as per
> requirements.
> >
> > We can provide contact lists from any country/industry/title.
> >
> > If your target criteria are different kindly get back to us with 
> > your requirement with geography and job titles to provide you with 
> > counts and more information.
> >
> > Let me know your thoughts!
> >
> > Thanks,
> >
> >
> > Theresa
> > Senior Information Analyst
> >
> >
> > If you wish not to receive marketing emails, please reply 
> > back
> “Opt
> > Out” In headlines
> >
>