Re: UDF for sorting

2017-07-03 Thread Justin Cameron
While you can't do this with Cassandra itself, you can get the functionality
you want with the cassandra-lucene-index plugin (
https://github.com/Stratio/cassandra-lucene-index/blob/branch-3.0.10/doc/documentation.rst#searching
).

Keep in mind that as with any secondary index there are performance-related
limitations:
https://github.com/Stratio/cassandra-lucene-index/blob/branch-3.0.10/doc/documentation.rst#performance-tips
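
For illustration, here is a minimal sketch of what this could look like with
the plugin, assuming the JSON search syntax documented for the 3.0.x branch;
the index name, mapper types and exact JSON layout below are illustrative, so
check the linked documentation for your plugin and Cassandra version:

CREATE CUSTOM INDEX emp_lucene_idx ON ks.cf ()
USING 'com.stratio.cassandra.lucene.Index'
WITH OPTIONS = {
    'refresh_seconds': '1',
    'schema': '{fields: {pk1: {type: "bigint"}, status: {type: "integer"}, disp_name: {type: "string"}}}'
};

-- Filter within the partition and let Lucene return rows ordered by disp_name:
SELECT * FROM ks.cf WHERE expr(emp_lucene_idx, '{
    filter: {type: "boolean", must: [
        {type: "match", field: "pk1", value: 123},
        {type: "match", field: "status", value: 1}
    ]},
    sort: {fields: [{field: "disp_name", reverse: false}]}
}');

The sort is applied by the Lucene index at query time, which is exactly what
plain CQL clustering order cannot give you for disp_name here.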


On Tue, 4 Jul 2017 at 07:17 DuyHai Doan  wrote:

> The plain answer is no, you can't.
>
> The reason is that UDFs only transform column values on each row; they do
> not have the ability to modify row ordering.
>
> On Mon, Jul 3, 2017 at 10:14 PM, techpyaasa . wrote:
>
>> Hi all,
>>
>> I have a table like
>>
>> CREATE TABLE ks.cf ( pk1 bigint, cc1 bigint, disp_name text , stat_obj
>> text, status int, PRIMARY KEY (pk1, cc1)) WITH CLUSTERING ORDER BY (cc1 ASC)
>>
>> CREATE INDEX idx1 on ks.cf(status);
>>
>> I want to have queries like
>> select * from ks.cf where pk1=123 and cc1=345;
>>
>> and
>> select * from ks.cf where pk1=123 and status=1;
>> In this case, I want the rows to be sorted based on 'disp_name' (asc/desc).
>>
>> Can I achieve this using a UDF or anything else? (Sorry if my understanding
>> of UDFs is wrong.)
>>
>> Thanks in advance
>> TechPyaasa
>>
>
> --


Justin Cameron
Senior Software Engineer, Instaclustr


Re: timeoutexceptions with UDF causing cassandra forceful exits

2017-07-03 Thread DuyHai Doan
Besides the user_function_timeout_policy configuration, I would say that a UDF
which times out badly is generally an indication that you should review your
UDF code.

On Mon, Jul 3, 2017 at 7:58 PM, Jeff Jirsa  wrote:

>
>
> On 2017-06-29 17:00 (-0700), Akhil Mehra  wrote:
> > By default user_function_timeout_policy is set to die, i.e. warn and kill
> > the JVM. Please find below a source code snippet that outlines the
> > possible settings.
>
> (Which also means you can set user_function_timeout_policy to ignore in
> your yaml and just log an error instead of exiting)
>


Re: UDF for sorting

2017-07-03 Thread DuyHai Doan
The plain answer is no, you can't.

The reason is that UDFs only transform column values on each row; they do not
have the ability to modify row ordering.
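
To make the limitation concrete, here is a minimal sketch (hypothetical
function name, and UDFs must be enabled via enable_user_defined_functions in
cassandra.yaml) of what a UDF can do, namely transform a value within each
row, as opposed to reordering the result set:

CREATE OR REPLACE FUNCTION ks.upper_name (input text)
    RETURNS NULL ON NULL INPUT
    RETURNS text
    LANGUAGE java
    AS 'return input.toUpperCase();';

-- Applied per row; it has no effect on the ordering of the result set:
SELECT pk1, cc1, ks.upper_name(disp_name) FROM ks.cf WHERE pk1 = 123;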

On Mon, Jul 3, 2017 at 10:14 PM, techpyaasa .  wrote:

> Hi all,
>
> I have a table like
>
> CREATE TABLE ks.cf ( pk1 bigint, cc1 bigint, disp_name text , stat_obj
> text, status int, PRIMARY KEY (pk1, cc1)) WITH CLUSTERING ORDER BY (cc1 ASC)
>
> CREATE INDEX idx1 on ks.cf(status);
>
> I want to have queries like
> select * from ks.cf where pk1=123 and cc1=345;
>
> and
> select * from ks.cf where pk1=123 and status=1;
> In this case, I want the rows to be sorted based on 'disp_name' (asc/desc).
>
> Can I achieve this using a UDF or anything else? (Sorry if my understanding
> of UDFs is wrong.)
>
> Thanks in advance
> TechPyaasa
>


UDF for sorting

2017-07-03 Thread techpyaasa .
Hi all,

I have a table like

CREATE TABLE ks.cf ( pk1 bigint, cc1 bigint, disp_name text , stat_obj
text, status int, PRIMARY KEY (pk1, cc1)) WITH CLUSTERING ORDER BY (cc1 ASC)

CREATE INDEX idx1 on ks.cf(status);

I want to have queries like
select * from ks.cf where pk1=123 and cc1=345;

and
select * from ks.cf where pk1=123 and status=1;
In this case, I want the rows to be sorted based on 'disp_name' (asc/desc).

Can I achieve this using a UDF or anything else? (Sorry if my understanding of
UDFs is wrong.)

Thanks in advance
TechPyaasa


Re: False positive increasing

2017-07-03 Thread Ariel Weisberg
Jeff is probably correct. I skimmed over the fact that it's just
increasing by one every few minutes so I went on about a different
scenario.

On Mon, Jul 3, 2017, at 01:46 PM, Jeff Jirsa wrote:
> 
> 
> On 2017-07-03 06:55 (-0700), Jean Carlo wrote:
> > Hello
> > 
> > Lately I am observing that the false positives of one of my nodes are
> > increasing in a continuous way (1 per 5min)
> > 
> 
> There's probably one partition that has a false positive entry, and you
> read it once every 5 minutes. Bloom filters are probabilistic, false
> positives are OK, it just causes a little bit of extra disk IO. 
> 
> > Bloom filter false positives: 532
> > Bloom filter false ratio: 0.01449
> > Bloom filter space used: 1.34 MB
> > Bloom filter off heap memory used: 1.33 MB
> > 
> > At the same time I can see that the duration of GC has increased also
> > 
> > Is there a link between the increase in GC and the bloom filter?
> > 
> 
> Probably not in any meaningful way (like mentioned above, false positive
> causes some extra disk IO to check one extra sstable, but it's not going
> to really impact GC in any meaningful way if it's truly a false
> positive).
> 
> 



RE: Node failure Due To Very high GC pause time

2017-07-03 Thread ZAIDI, ASAD A
>> My doubt here is: will all the deleted 3.3 million rows be loaded into my
>> on-heap memory? If not, what objects are occupying that memory?

It depends on your queries and what data they're fetching from your database.
Assuming you're using the CMS garbage collector and you've enabled GC logs
with PrintGCDetails, PrintClassHistogramBeforeFullGC and
PrintClassHistogramAfterFullGC, your logs should tell you which Java classes
occupy most of your heap memory.

The system.log file can also give you some clues, e.g. if you see references
to your tables with tombstone warnings. A quick [grep -i tombstone
/path/to/system.log] would tell you which tables are suffering from tombstones!
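
Along the same lines, tracing a single read in cqlsh makes the tombstone
scanning visible without a heap dump. A minimal sketch, assuming the table
from the schema below; the exact trace wording varies by version, but it
reports how many live rows and tombstone cells were read:

TRACING ON;
SELECT * FROM EmployeeDetails
WHERE branch_id = 'xxx' AND department_id = 'yyy' AND emp_id > 5000
LIMIT 500;
TRACING OFF;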


From: Karthick V [mailto:karthick...@zohocorp.com]
Sent: Monday, July 03, 2017 11:47 AM
To: user 
Subject: Re: Node failure Due To Very high GC pause time

Hi Bryan,

Thanks for your quick response. We have already tuned our memory and GC based
on our hardware specification and it was working fine until yesterday, i.e.
before facing the delete requests specified below. As you suggested, we will
look into our GC & memory configuration once again.

FYI: We are using memtable_allocation_type as offheap_objects.

Consider the following table:

CREATE TABLE EmployeeDetails (
    branch_id text,
    department_id text,
    emp_id bigint,
    emp_details text,
    PRIMARY KEY (branch_id, department_id, emp_id)
) WITH CLUSTERING ORDER BY (department_id ASC, emp_id ASC);


In this table I have 10 million records for a particular branch_id and
department_id. The following is the list of operations which I perform in C*,
in chronological order:

  1.  Delete 5 million records, from the start, in batches of 500 records per
request for a particular branch_id (say 'xxx') and department_id (say 'yyy').
  2.  Read the next 500 records as soon as the above delete operation has
completed (Select * from EmployeeDetails where branch_id='xxx' and
department_id = 'yyy' and emp_id > 5000 limit 500).

It's only after executing the above read request that there was a spike in
memory, and within a few minutes the node was marked down.

So my question here is: will the above read request load all the deleted
5 million records into memory before it starts fetching, or will it jump
directly to the 5001st record (since we have specified the greater-than
condition)? If it's the former, then the read request will keep the data in
main memory and perform a merge operation before it delivers the data, as per
this wiki (https://wiki.apache.org/cassandra/ReadPathForUsers). If not, let me
know how the above read request will provide me the data.


Note: Also, while analyzing my heap dump it's clear that the majority of the
memory is being held by tombstone objects.


Thanks in advance
-- karthick



On Mon, 03 Jul 2017 20:40:10 +0530 Bryan Cheng wrote:

This is a very antagonistic use case for Cassandra :P I assume you're familiar 
with Cassandra and deletes? (eg. 
http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html,
 
http://docs.datastax.com/en/cassandra/2.1/cassandra/dml/dml_about_deletes_c.html)

That being said, are you giving enough time for your tables to flush to disk? 
Deletes generate markers which can and will consume memory until they have a 
chance to be flushed, after which they will impact query time and performance 
(but should relieve memory pressure). If you're saturating the capability of 
your nodes your tables will have difficulty flushing. See 
http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_memtable_thruput_c.html.

This could also be a heap/memory configuration issue, or a GC tuning issue
(although unlikely if you've left those at default)

Re: jbod disk usage unequal

2017-07-03 Thread Jeff Jirsa


On 2017-06-29 06:55 (-0700), Micha  wrote: 
> Hi,
> 
> I use a jbod setup (2 * 1TB) and the distribution is a little bit
> unequal on my three nodes:
> 270MB and 540MB
> 150 and 580
> 290 and 500
> 
> SStable size varies between 2GB and 130GB.
> 

You're switching between MB and GB, are these all in GB? 

> Is it possible to move sstables from one disk to another to balance the
> disk usage?

Is there a reason you feel it's required, other than being bothered by the fact 
that they're not equal? 

> Otherwise is a raid-0 setup the only option for a balanced disk usage?

Probably for most versions of cassandra (newer versions will attempt to choose 
JBOD disk based on token ranges, which is designed to protect against some 
other edge case bugs where you lose one disk and have to rebuild the whole node 
anyway). The real question is "why do you care, really?" - it'll spill over to 
the other drive when it needs space, and if it's not near capacity, it doesn't 
REALLY matter how full it is, right?
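
For reference, the JBOD layout being discussed is just multiple entries under
data_file_directories in cassandra.yaml. A sketch with hypothetical mount
points; Cassandra spreads sstables across all listed directories and spills
over when one fills up:

data_file_directories:
    - /mnt/disk1/cassandra/data
    - /mnt/disk2/cassandra/data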






Re: timeoutexceptions with UDF causing cassandra forceful exits

2017-07-03 Thread Jeff Jirsa


On 2017-06-29 17:00 (-0700), Akhil Mehra  wrote: 
> By default user_function_timeout_policy is set to die, i.e. warn and kill
> the JVM. Please find below a source code snippet that outlines the possible
> settings.

(Which also means you can set user_function_timeout_policy to ignore in your 
yaml and just log an error instead of exiting)
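
A sketch of the relevant cassandra.yaml knobs; the names are as in the
2.2+/3.x yaml and the timeout values shown are the usual defaults, so verify
against your version's bundled cassandra.yaml:

user_defined_function_warn_timeout_in_ms: 500
user_defined_function_fail_timeout_in_ms: 1500
user_function_timeout_policy: ignore    # die (default) | die_immediate | ignore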






Re: False positive increasing

2017-07-03 Thread Jeff Jirsa


On 2017-07-03 06:55 (-0700), Jean Carlo  wrote: 
> Hello
> 
> Lately I am observing that the false positives of one of my nodes are
> increasing in a continuous way (1 per 5min)
> 

There's probably one partition that has a false positive entry, and you read it 
once every 5 minutes. Bloom filters are probabilistic, false positives are OK, 
it just causes a little bit of extra disk IO. 

> Bloom filter false positives: 532
> Bloom filter false ratio: 0.01449
> Bloom filter space used: 1.34 MB
> Bloom filter off heap memory used: 1.33 MB
> 
> At the same time I can see that the duration of GC has increased also
> 
> Is there a link between the increase in GC and the bloom filter?
> 

Probably not in any meaningful way (like mentioned above, false positive causes 
some extra disk IO to check one extra sstable, but it's not going to really 
impact GC in any meaningful way if it's truly a false positive).
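
If the false positive ratio itself ever becomes a concern, it is tunable per
table. A sketch with placeholder keyspace/table names; lowering the chance
trades more bloom filter memory for fewer wasted sstable reads, and only takes
effect as sstables are rewritten:

ALTER TABLE my_keyspace.my_table WITH bloom_filter_fp_chance = 0.01;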





Re: False positive increasing

2017-07-03 Thread Ariel Weisberg
Hi,

The number of false positives may be increasing because more filters are
being consulted on each query. The number of filters consulted per query is a
function of the number of sstables read.
You may be seeing an increase in the number of sstables consulted if
compaction is falling behind. I'm not an expert on the operational playbook
for compaction falling behind, but you can change the compaction throttle,
disable gossip so compaction can catch up (this can go wrong), or add capacity
by adding nodes.
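
A sketch of the kind of checks described above; command names are as in
2.1-era nodetool (newer versions alias cfhistograms as tablehistograms), and
the keyspace/table are placeholders:

nodetool compactionstats                  # pending tasks: is compaction falling behind?
nodetool cfhistograms <keyspace> <table>  # "SSTables" column: sstables touched per read
nodetool setcompactionthroughput 0        # temporarily unthrottle compaction (0 = unlimited)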
If it's just one node you may also want to look into why that node
is a hot spot. Is there a single large partition that could be
causing issues?
Ariel

On Mon, Jul 3, 2017, at 09:55 AM, Jean Carlo wrote:
> Hello
> Lately I am observing that the false positives of one of my nodes are
> increasing in a continuous way (1 per 5min)
>
> Bloom filter false positives: 532
> Bloom filter false ratio: 0.01449
> Bloom filter space used: 1.34 MB
> Bloom filter off heap memory used: 1.33 MB
>
> At the same time I can see that the duration of GC has increased also
>
> Is there a link between the increase in GC and the bloom filter?
>
> Jean Carlo
>



Re: Node failure Due To Very high GC pause time

2017-07-03 Thread Karthick V
Hi Bryan,

Thanks for your quick response. We have already tuned our memory and GC based
on our hardware specification and it was working fine until yesterday, i.e.
before facing the delete requests specified below. As you suggested, we will
look into our GC & memory configuration once again.

FYI: We are using memtable_allocation_type as offheap_objects.

Consider the following table:



CREATE TABLE EmployeeDetails (
    branch_id text,
    department_id text,
    emp_id bigint,
    emp_details text,
    PRIMARY KEY (branch_id, department_id, emp_id)
) WITH CLUSTERING ORDER BY (department_id ASC, emp_id ASC);

In this table I have 10 million records for a particular branch_id and
department_id. The following is the list of operations which I perform in C*,
in chronological order:

  1.  Delete 5 million records, from the start, in batches of 500 records per
request for a particular branch_id (say 'xxx') and department_id (say 'yyy').
  2.  Read the next 500 records as soon as the above delete operation has
completed (Select * from EmployeeDetails where branch_id='xxx' and
department_id = 'yyy' and emp_id > 5000 limit 500).

It's only after executing the above read request that there was a spike in
memory, and within a few minutes the node was marked down.

So my question here is: will the above read request load all the deleted
5 million records into memory before it starts fetching, or will it jump
directly to the 5001st record (since we have specified the greater-than
condition)? If it's the former, then the read request will keep the data in
main memory and perform a merge operation before it delivers the data, as per
this wiki (https://wiki.apache.org/cassandra/ReadPathForUsers). If not, let me
know how the above read request will provide me the data.

Note: Also, while analyzing my heap dump it's clear that the majority of the
memory is being held by tombstone objects.

Thanks in advance,
-- karthick

On Mon, 03 Jul 2017 20:40:10 +0530 Bryan Cheng br...@blockcypher.com wrote:
This is a very antagonistic use case for Cassandra :P I assume you're familiar
with Cassandra and deletes? (e.g.
http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html,
http://docs.datastax.com/en/cassandra/2.1/cassandra/dml/dml_about_deletes_c.html)

That being said, are you giving enough time for your tables to flush to disk?
Deletes generate markers which can and will consume memory until they have a
chance to be flushed, after which they will impact query time and performance
(but should relieve memory pressure). If you're saturating the capability of
your nodes your tables will have difficulty flushing. See
http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_memtable_thruput_c.html.

This could also be a heap/memory configuration issue, or a GC tuning issue
(although unlikely if you've left those at default)

--Bryan

On Mon, Jul 3, 2017 at 7:51 AM, Karthick V karthick...@zohocorp.com wrote:
Hi,

Recently in my test cluster I faced outrageous GC activity which made the node
unreachable inside the cluster itself.

Scenario:
In a partition of 5 million rows we read the first 500 (by giving the starting
range) and delete the same 500 again. The same has been done recursively by
changing the start range alone. Initially I didn't see any difference in query
performance (up to 50,000), but later I observed a significant degradation in
performance; when we reached about 3.3 million, the read request failed and
the node went unreachable. After analysing my GC logs it is clear that 99% of
my old-gen memory space is occupied and there is no more space for allocation,
which caused the machine to stall.

My doubt here is: will all the deleted 3.3 million rows be loaded into my
on-heap memory? If not, what objects are occupying that memory?

PS: I am using C* 2.1.13 in the cluster.


Re: Node failure Due To Very high GC pause time

2017-07-03 Thread Bryan Cheng
This is a very antagonistic use case for Cassandra :P I assume you're
familiar with Cassandra and deletes? (eg.
http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html,
http://docs.datastax.com/en/cassandra/2.1/cassandra/dml/dml_about_deletes_c.html
)

That being said, are you giving enough time for your tables to flush to
disk? Deletes generate markers which can and will consume memory until they
have a chance to be flushed, after which they will impact query time and
performance (but should relieve memory pressure). If you're saturating the
capability of your nodes your tables will have difficulty flushing. See
http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_memtable_thruput_c.html
.
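
The related read-path guardrails live in cassandra.yaml. A sketch with the
usual 2.1-era defaults (verify against your own yaml); reads that scan this
many tombstones log a warning or are aborted, which is often the first visible
symptom of the pattern described in this thread:

tombstone_warn_threshold: 1000        # log a warning when a read scans this many tombstones
tombstone_failure_threshold: 100000   # abort the read beyond this many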

This could also be a heap/memory configuration issue, or a GC tuning issue
(although unlikely if you've left those at default)

--Bryan


On Mon, Jul 3, 2017 at 7:51 AM, Karthick V  wrote:

> Hi,
>
> Recently in my test cluster I faced outrageous GC activity which made the
> node unreachable inside the cluster itself.
>
> Scenario:
> In a partition of 5 million rows we read the first 500 (by giving the
> starting range) and delete the same 500 again. The same has been done
> recursively by changing the start range alone. Initially I didn't see any
> difference in query performance (up to 50,000), but later I observed a
> significant degradation in performance; when we reached about 3.3 million,
> the read request failed and the node went unreachable. After analysing my
> GC logs it is clear that 99% of my old-gen memory space is occupied and
> there is no more space for allocation, which caused the machine to stall.
>
> My doubt here is: will all the deleted 3.3 million rows be loaded into my
> on-heap memory? If not, what objects are occupying that memory?
>
> PS: I am using C* 2.1.13 in the cluster.
>
>
>
>
>


Node failure Due To Very high GC pause time

2017-07-03 Thread Karthick V
Hi,

Recently in my test cluster I faced outrageous GC activity which made the node
unreachable inside the cluster itself.

Scenario:
In a partition of 5 million rows we read the first 500 (by giving the starting
range) and delete the same 500 again. The same has been done recursively by
changing the start range alone. Initially I didn't see any difference in query
performance (up to 50,000), but later I observed a significant degradation in
performance; when we reached about 3.3 million, the read request failed and
the node went unreachable. After analysing my GC logs it is clear that 99% of
my old-gen memory space is occupied and there is no more space for allocation,
which caused the machine to stall.

My doubt here is: will all the deleted 3.3 million rows be loaded into my
on-heap memory? If not, what objects are occupying that memory?

PS: I am using C* 2.1.13 in the cluster.


False positive increasing

2017-07-03 Thread Jean Carlo
Hello

Lately I am observing that the false positives of one of my nodes are
increasing in a continuous way (1 per 5min)

Bloom filter false positives: 532
Bloom filter false ratio: 0.01449
Bloom filter space used: 1.34 MB
Bloom filter off heap memory used: 1.33 MB

At the same time I can see that the duration of GC has increased also

Is there a link between the increase in GC and the bloom filter?

Jean Carlo