Re: Cassandra stress tool - data generation

2017-11-01 Thread Lucas Benevides
Hi Varun,

I appreciate your answer, but this is not what is causing my problem.
Even with SEQ, as the excellent article by Ben Slater says, it will
always repeat the same sequence at each new operation (in my case one
operation equals one partition).

But in that issue, I saw another one:
https://issues.apache.org/jira/browse/CASSANDRA-11138
that may be causing the problem. I will apply this patch, test it and
report back later.

Thank you
Lucas Benevides


Re: Cassandra stress tool - data generation

2017-11-01 Thread Varun Barala
https://www.instaclustr.com/deep-diving-cassandra-stress-part-3-using-yaml-profiles/
This particular blog post covers your case.

Changed uniform() distribution to seq() distribution
https://issues.apache.org/jira/browse/CASSANDRA-12490

Thanks!!



Re: Cassandra stress tool - data generation

2017-11-01 Thread Varun Barala
Hi,

https://www.instaclustr.com/deep-diving-into-cassandra-stress-part-1/

In the blog, they cover many things in detail.

Thanks!!

On Thu, Nov 2, 2017 at 12:38 AM, Lucas Benevides <
lu...@maurobenevides.com.br> wrote:

> Dear community,
>
> I am using the Cassandra Stress Tool to simulate IoT-generated data,
> so I created a column family with device_id as the partition key.
>
> But in every operation (the number of operations is given by the -n
> option) the generated values are the same. For instance, I have a column
> called observation_time, which is supposed to be the time measured by the
> sensor, but in every partition its values are identical.
>
> Is there a way to have those values generated randomly with different
> seeds? I need this so that if the same device_id occurs again, it
> results in an INSERT of new rows instead of an UPSERT of existing ones.
>
> To clarify: What is happening now (fictional data):
>
> operation 1
> device 1
> ts1: 01/01/1970
> ts2: 02/01/1980
> ts3: 03/01/1990
>
> operation 2
> device 2
> ts1: 01/01/1970
> ts2: 02/01/1980
> ts3: 03/01/1990
>
> What I want:
> operation 1
> device 1
> ts1: 01/01/1970
> ts2: 02/01/1980
> ts3: 03/01/1990
>
> operation 2
> device 2
> ts1: 02/01/1971  # Different values here.
> ts2: 05/01/1982
> ts3: 08/01/1993
>
> Thanks in advance,
> Lucas Benevides
>
>
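
The difference between "what is happening now" and "what I want" in the
question above can be illustrated with a small, tool-independent sketch.
It is not how cassandra-stress generates data internally; the function
names (timestamps, observed, wanted), the "visit" counter and the seed
strings are all hypothetical and exist only for this illustration. The
idea shown is simply that reseeding a generator with the same value for
every partition repeats the same timestamps, while deriving the seed from
the partition key (and from how many times that key has been visited)
gives each partition its own values.

```python
import random
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timestamps(seed, count=3):
    """Return `count` pseudo-random timestamps from a generator seeded with `seed`."""
    rng = random.Random(seed)
    # Each timestamp is a random number of whole days after the epoch.
    return [EPOCH + timedelta(days=rng.randrange(20000)) for _ in range(count)]


def observed(device_ids):
    """Symptom: every partition reuses the same seed, so values repeat."""
    return {d: timestamps("constant") for d in device_ids}


def wanted(device_ids, visit=0):
    """Goal: the seed depends on the partition key and on the visit number
    (hypothetical counter for how often this device_id has been written)."""
    return {d: timestamps(f"{d}:{visit}") for d in device_ids}


if __name__ == "__main__":
    for label, rows in (("observed", observed([1, 2])), ("wanted", wanted([1, 2]))):
        print(label)
        for device_id, values in rows.items():
            print("  device", device_id, [v.date().isoformat() for v in values])
```

With the second scheme, writing the same device_id again with a higher
visit number yields different observation_time values, so the write adds
new rows rather than overwriting the earlier ones.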