Re: SELECT JSON timestamp lacks timezone information

2016-02-08 Thread Ralf Steppacher
Hi Alexandre.

I wrote to ‘user@cassandra.apache.org’.

Re the actual problem: I am aware that C* does not store (need not store) the 
timezone, as the value is persisted as a Unix epoch timestamp. Not delivering a 
timezone in the JSON text representation would be OK-ish if the text 
representation were guaranteed to be in UTC. But it is not. It is in some 
timezone determined by the locale of the server side or that of the client 
VM. That makes it a pain in two ways:

a) I have to add the timezone in a post-processing step to all timestamps in my 
JSON responses, and 
b) I also have to do some guesswork as to what the actual timezone might be.

If there is no way to control the formatting of JSON timestamps and to add the 
time zone information, then IMHO that is a bug. Is it not? Or am I missing 
something here?
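
Roughly, this is the kind of post-processing I mean (just a sketch; the zone 
assumed here is exactly the guesswork from point b, and the class/method names 
are illustrative only):

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class JsonTimestampFixer {

    // Assumption: the zone the server-side formatting happened to use (e.g. CET).
    private static final TimeZone ASSUMED_SERVER_ZONE = TimeZone.getTimeZone("Europe/Zurich");

    /** Re-emits a zone-less toJson() timestamp as ISO-8601 with an explicit UTC offset. */
    public static String addOffset(String jsonTimestamp) throws ParseException {
        SimpleDateFormat in = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
        in.setTimeZone(ASSUMED_SERVER_ZONE);
        Date parsed = in.parse(jsonTimestamp);

        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZ");
        out.setTimeZone(TimeZone.getTimeZone("UTC"));
        // "2016-01-04 16:05:47.123" parsed as CET becomes "2016-01-04T15:05:47.123+0000"
        return out.format(parsed);
    }
}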


Thanks!
Ralf


> On 08.02.2016, at 12:06, Alexandre Dutra  wrote:
> 
> Hello Ralf,
> 
> First of all, Cassandra stores timestamps without timezone information, so 
> it's not possible to retrieve the original timezone used when inserting the 
> value.
> 
> CQLSH uses the python driver behind the scenes, and my guess is that the 
> timestamp formatting is being done driver-side – hence the timezone – while 
> when you call toJson(), the formatting has to be done server-side.
> 
> That said, it does seem that Cassandra is using a format without timezone 
> when converting timestamps to JSON format:
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/serializers/TimestampSerializer.java#L52
>  
> 
> 
> I agree with you that a date format that would include the timezone would be 
> preferable here, but that is a question you should ask in the Cassandra Users 
> mailing list instead.
> 
> Hope that helps,
> 
> Alexandre
> 
> 
> 
> 
> On Mon, Feb 8, 2016 at 11:09 AM Ralf Steppacher wrote:
> Hello all,
> 
> When I select a timestamp as JSON from Cassandra, the string representation 
> lacks the timezone information, both via CQLSH and the Java Driver:
> 
> cqlsh:events> select toJson(created_at) AS created_at from 
> event_by_patient_timestamp ;
> 
>  created_at
> ---
>  "2016-01-04 16:05:47.123"
> 
> (1 rows)
> 
> vs.
> 
> cqlsh:events> select created_at FROM event_by_user_timestamp ;
> 
>  created_at
> --
>  2016-01-04 15:05:47+
> 
> (1 rows)
> cqlsh:events>
> 
> To make things even more complicated the JSON timestamp is not returned in 
> UTC. Is there a way to either tell the driver/C* to return the JSON date in 
> UTC or add the timezone information (much preferred) to the text 
> representation of the timestamp?
> 
> 
> Thanks!
> Ralf
> -- 
> Alexandre Dutra
> Driver & Tools Engineer @ DataStax



SELECT JSON timestamp lacks timezone information

2016-02-08 Thread Ralf Steppacher
Hello all,

When I select a timestamp as JSON from Cassandra, the string representation 
lacks the timezone information, both via CQLSH and the Java Driver:

cqlsh:events> select toJson(created_at) AS created_at from 
event_by_patient_timestamp ;

 created_at
---
 "2016-01-04 16:05:47.123"

(1 rows)

vs.

cqlsh:events> select created_at FROM event_by_user_timestamp ;

 created_at
--
 2016-01-04 15:05:47+

(1 rows)
cqlsh:events>

To make things even more complicated, the JSON timestamp is not returned in UTC. 
Is there a way either to tell the driver/C* to return the JSON date in UTC or to 
add the timezone information (much preferred) to the text representation of the 
timestamp?


Thanks!
Ralf

Re: SELECT JSON timestamp lacks timezone information

2016-02-08 Thread Alexandre Dutra
Hello Ralf,

First of all, Cassandra stores timestamps without timezone information, so
it's not possible to retrieve the original timezone used when inserting the
value.

CQLSH uses the Python driver behind the scenes, and my guess is that the
timestamp formatting is being done driver-side (hence the timezone), whereas
when you call toJson(), the formatting has to be done server-side.

That said, it does seem that Cassandra is using a format without timezone
when converting timestamps to JSON format:
https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/serializers/TimestampSerializer.java#L52
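
For illustration only (this is not the serializer's code; the real pattern is
in the linked source), the difference a zone designator makes with
java.text.SimpleDateFormat:

import java.text.SimpleDateFormat;
import java.util.Date;

public class PatternDemo {
    public static void main(String[] args) {
        Date now = new Date();
        // Without a zone designator the output is ambiguous: it depends on the JVM default zone.
        System.out.println(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS").format(now));
        // Appending 'Z' adds the numeric UTC offset (e.g. +0100) and removes the ambiguity.
        System.out.println(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSSZ").format(now));
    }
}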

I agree with you that a date format that would include the timezone would
be preferable here, but that is a question you should ask in the Cassandra
Users mailing list instead.

Hope that helps,

Alexandre




On Mon, Feb 8, 2016 at 11:09 AM Ralf Steppacher 
wrote:

> Hello all,
>
> When I select a timestamp as JSON from Cassandra, the string
> representation lacks the timezone information, both via CQLSH and the Java
> Driver:
>
> cqlsh:events> select toJson(created_at) AS created_at from
> event_by_patient_timestamp ;
>
>  created_at
> ---
>  "2016-01-04 16:05:47.123"
>
> (1 rows)
>
> vs.
>
> cqlsh:events> select created_at FROM event_by_user_timestamp ;
>
>  created_at
> --
>  2016-01-04 15:05:47+
>
> (1 rows)
> cqlsh:events>
>
> To make things even more complicated the JSON timestamp is not returned in
> UTC. Is there a way to either tell the driver/C* to return the JSON date in
> UTC or add the timezone information (much preferred) to the text
> representation of the timestamp?
>
>
> Thanks!
> Ralf

-- 
Alexandre Dutra
Driver & Tools Engineer @ DataStax


Re: CASSANDRA-8072

2016-02-08 Thread Stefania Alborghetti
Have you checked that you can telnet from one node to the other using the
same ip and the internode port?

I would put the public IP addresses of the seeds in the seed list and set
the listen address to the public IP address for each node.

There was a similar discussion recently that might help.

On Tue, Feb 9, 2016 at 8:48 AM, Ted Yu  wrote:

> Thanks for the help, Stefania.
> By using "127.0.0.1" , I was able to start Cassandra on that seed node
> (XX.YY).
> However, on other nodes, I pointed seed to XX.YY and observed the
> following ?
> What did I miss ?
>
>
> INFO  [main] 2016-02-08 16:44:56,607  OutboundTcpConnection.java:97 -
> OutboundTcpConnection using coalescing strategy DISABLED
> ERROR [main] 2016-02-08 16:45:27,626  CassandraDaemon.java:581 - Exception
> encountered during startup
> java.lang.RuntimeException: Unable to gossip with any seeds
> at
> org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1337)
> ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at
> org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:541)
> ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at
> org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:789)
> ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:721)
> ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:612)
> ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:389)
> ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:335)
> ~[dse-core-4.8.4.jar:4.8.4]
> at
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:564)
> ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at com.datastax.bdp.DseModule.main(DseModule.java:74)
> [dse-core-4.8.4.jar:4.8.4]
> INFO  [Thread-2] 2016-02-08 16:45:27,629  DseDaemon.java:418 - DSE
> shutting down...
>
> On Mon, Feb 8, 2016 at 4:25 PM, Stefania Alborghetti <
> stefania.alborghe...@datastax.com> wrote:
>
>> CASSANDRA-8072 is not going to help you because the code that fails
>> (checkForEndpointCollision()) should not execute for seeds.
>>
>> I think the problem is that there are no seeds in cassandra.yaml:
>>
>> - seeds: "XX.YY"
>>
>> If listen_address is localhost then try:
>>
>> - seeds: "127.0.0.1"
>>
>>
>> On Tue, Feb 9, 2016 at 5:58 AM, Ted Yu  wrote:
>>
>>> If I apply the fix from CASSANDRA-8072 onto a 2.1.12 cluster, which
>>> files should I replace ?
>>>
>>> Thanks
>>>
>>> On Mon, Feb 8, 2016 at 1:07 PM, Bhuvan Rawal 
>>> wrote:
>>>
 Your config looks fine to me,  i tried reproducing the scenario by
 setting localhost in listen_address,rpc_address and seed list, and it
 worked fine, I had earlier the node local ip in the 3 fields and it was
 working fine.

 Looks like there is some other issue here.

 On Tue, Feb 9, 2016 at 12:49 AM, Ted Yu  wrote:

> Here it is:
> http://pastebin.com/QEdjtAj6
>
> XX.YY is localhost in this case.
>
> On Mon, Feb 8, 2016 at 11:03 AM, Bhuvan Rawal 
> wrote:
>
>> could you paste your cassandra.yaml here, except for commented out
>> lines?
>>
>> On Tue, Feb 9, 2016 at 12:30 AM, Ted Yu  wrote:
>>
>>> The issue I described was observed on the seed node.
>>>
>>> Both rpc_address and listen_address point to localhost.
>>>
>>> bq. What addresses are there in the seed list?
>>>
>>> The IP of the seed node.
>>>
>>> I haven't come to starting non-seed node(s) yet.
>>>
>>> Thanks for the quick response.
>>>
>>> On Mon, Feb 8, 2016 at 10:50 AM, Bhuvan Rawal 
>>> wrote:
>>>
 Hi Ted,

 Have you specified the listen_address and rpc_address? What
 addresses are there in the seed list?

 Have you started seed first and after waiting for 30 seconds
 started other nodes?


 On Tue, Feb 9, 2016 at 12:14 AM, Ted Yu 
 wrote:

> Hi,
> I am trying to setup a cluster with DSE 4.8.4
>
> I added the following in resources/cassandra/conf/cassandra.yaml :
>
> cluster_name: 'cass'
>
> which resulted in:
>
> http://pastebin.com/27adxKTM
>
> This seems to be resolved by CASSANDRA-8072
>
> My question is whether there is a workaround ?

Re: Writing a large blob returns WriteTimeoutException

2016-02-08 Thread Giampaolo Trapasso
"If not exists" was an oversight of a previous test. Removing it solved the
problem. Thanks a lot, Jim!

giampaolo

2016-02-09 1:21 GMT+01:00 Jim Ancona :

> The "if not exists" in your INSERT means that you are incurring a
> performance hit by using Paxos. Do you need that? Have you tried your test
> without  it?
>
> Jim
>


Re: [RELEASE] Apache Cassandra 2.1.13 released

2016-02-08 Thread Jake Luciani
Apologies, I sent the wrong changelog and news links.

Here are the correct ones for 2.1.13

http://goo.gl/9ZPnNX (CHANGES.txt)
http://goo.gl/5cR7eh (NEWS.txt)



On Mon, Feb 8, 2016 at 9:19 AM, Jake Luciani  wrote:

> The Cassandra team is pleased to announce the release of Apache Cassandra
> version 2.1.13.
>
> Apache Cassandra is a fully distributed database. It is the right choice
> when you need scalability and high availability without compromising
> performance.
>
>  http://cassandra.apache.org/
>
> Downloads of source and binary distributions are listed in our download
> section:
>
>  http://cassandra.apache.org/download/
>
> This version is a bug fix release[1] on the 2.1 series. As always, please
> pay
> attention to the release notes[2] and Let us know[3] if you were to
> encounter
> any problem.
>
> Enjoy!
>
> [1]: http://goo.gl/lT2JXJ (CHANGES.txt)
> [2]: http://goo.gl/9m6hGQ (NEWS.txt)
> [3]: https://issues.apache.org/jira/browse/CASSANDRA
>
>


[RELEASE] Apache Cassandra 2.1.13 released

2016-02-08 Thread Jake Luciani
The Cassandra team is pleased to announce the release of Apache Cassandra
version 2.1.13.

Apache Cassandra is a fully distributed database. It is the right choice
when you need scalability and high availability without compromising
performance.

 http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download
section:

 http://cassandra.apache.org/download/

This version is a bug fix release[1] on the 2.1 series. As always, please pay
attention to the release notes[2] and let us know[3] if you encounter any
problems.

Enjoy!

[1]: http://goo.gl/lT2JXJ (CHANGES.txt)
[2]: http://goo.gl/9m6hGQ (NEWS.txt)
[3]: https://issues.apache.org/jira/browse/CASSANDRA


specifying listen_address

2016-02-08 Thread Ted Yu
Hi,
I downloaded and expanded DSE 4.8.4

When I specify the following in resources/dse/conf/dse.yaml :

listen_address: XX.YY

I got:
INFO  17:43:10  Loading settings from
file:/home/cassandra/dse-4.8.4/resources/dse/conf/dse.yaml
Exception in thread "main" java.lang.ExceptionInInitializerError
at com.datastax.bdp.DseCoreModule.(DseCoreModule.java:43)
at com.datastax.bdp.DseModule.getRequiredModules(DseModule.java:97)
at
com.datastax.bdp.server.AbstractDseModule.configure(AbstractDseModule.java:26)
...
Caused by: org.yaml.snakeyaml.error.YAMLException: Unable to find property
'listen_address' on class: com.datastax.bdp.config.Config
at
org.yaml.snakeyaml.introspector.PropertyUtils.getProperty(PropertyUtils.java:132)
at
org.yaml.snakeyaml.introspector.PropertyUtils.getProperty(PropertyUtils.java:121)

Any hint is appreciated.

If this is not the proper mailing list, please direct me to the proper one.

Thanks


CASSANDRA-8072

2016-02-08 Thread Ted Yu
Hi,
I am trying to setup a cluster with DSE 4.8.4

I added the following in resources/cassandra/conf/cassandra.yaml :

cluster_name: 'cass'

which resulted in:

http://pastebin.com/27adxKTM

This seems to be resolved by CASSANDRA-8072

My question is whether there is a workaround ?
If not, when can I expect the 2.1.13 release ?

Thanks


Re: CASSANDRA-8072

2016-02-08 Thread Bhuvan Rawal
Hi Ted,

Have you specified the listen_address and rpc_address? What addresses are
there in the seed list?

Have you started seed first and after waiting for 30 seconds started other
nodes?


On Tue, Feb 9, 2016 at 12:14 AM, Ted Yu  wrote:

> Hi,
> I am trying to setup a cluster with DSE 4.8.4
>
> I added the following in resources/cassandra/conf/cassandra.yaml :
>
> cluster_name: 'cass'
>
> which resulted in:
>
> http://pastebin.com/27adxKTM
>
> This seems to be resolved by CASSANDRA-8072
>
> My question is whether there is workaround ?
> If not, when can I expect 2.1.13 release ?
>
> Thanks
>


Writing a large blob returns WriteTimeoutException

2016-02-08 Thread Giampaolo Trapasso
Hi to all,

I'm trying to put a large binary file (> 500MB) on a C* cluster as fast as
I can but I get some (many) WriteTimeoutExceptions.

I created a small POC that isolates the problem I'm facing. Here you will
find the code: https://github.com/giampaolotrapasso/cassandratest,

*Main details about it:*

   - I try to write the file in chunks (the *data* field), each <= 1MB (1MB is
   the recommended max size for a single cell),


   - Chunks are grouped into buckets. Every bucket is a partition row,
   - Buckets are grouped by UUIDs.


   - Chunk size and bucket size are configurable from the app, so I can try
   different configurations and see what happens.


   - To maximize throughput I execute async insertions; however, to avoid too
   much pressure on the db, once a threshold is reached I wait for at least one
   insert to finish before adding another (this part is quite raw in my code,
   but I think it's not so important). This parameter is also configurable, to
   test different combinations.

This is the table on db:

CREATE TABLE blobtest.store (
uuid uuid,
bucket bigint,
start bigint,
data blob,
end bigint,
PRIMARY KEY ((uuid, bucket), start)
)

and this is the main code (Scala, but I hope it is generally readable):

val statement = client.session.prepare("INSERT INTO
blobTest.store(uuid, bucket, start, end, data) VALUES (?, ?, ?, ?, ?) if
not exists;")

val blob = new Array[Byte](MyConfig.blobSize)
scala.util.Random.nextBytes(blob)

write(client,
  numberOfRecords = MyConfig.recordNumber,
  bucketSize = MyConfig.bucketSize,
  maxConcurrentWrites = MyConfig.maxFutures,
  blob,
  statement)

where write is

def write(database: Database, numberOfRecords: Int, bucketSize: Int,
maxConcurrentWrites: Int,
blob: Array[Byte], statement: PreparedStatement): Unit = {

val uuid: UUID = UUID.randomUUID()
var count = 0;

//Javish loop
while (count < numberOfRecords) {
  val record = Record(
uuid = uuid,
bucket = count / bucketSize,
start = ((count % bucketSize)) * blob.length,
end = ((count % bucketSize) + 1) * blob.length,
bytes = blob
  )
  asynchWrite(database, maxConcurrentWrites, statement, record)
  count += 1
}

waitDbWrites()
  }

and asynchWrite just binds the record to the statement.
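
Roughly, the idea behind asynchWrite plus the threshold is the following (a
simplified Java sketch, not the actual code, which is in the repo; it assumes
the DataStax driver's executeAsync and a Semaphore capped at
maxConcurrentWrites):

import com.datastax.driver.core.*;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;

import java.nio.ByteBuffer;
import java.util.UUID;
import java.util.concurrent.Semaphore;

class ThrottledWriter {
    private final Semaphore inFlight;

    ThrottledWriter(int maxConcurrentWrites) {
        this.inFlight = new Semaphore(maxConcurrentWrites);
    }

    void asynchWrite(Session session, PreparedStatement stmt,
                     UUID uuid, long bucket, long start, long end, ByteBuffer data)
            throws InterruptedException {
        inFlight.acquire();  // blocks once maxConcurrentWrites inserts are outstanding
        ResultSetFuture f = session.executeAsync(stmt.bind(uuid, bucket, start, end, data));
        Futures.addCallback(f, new FutureCallback<ResultSet>() {
            public void onSuccess(ResultSet rs) { inFlight.release(); }
            public void onFailure(Throwable t)  { inFlight.release(); /* log and/or retry */ }
        });
    }
}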

*Problem*

The problem is that when I try to increase the chunk size, the number of async
inserts, or the size of the bucket (i.e. the number of chunks), the app becomes
unstable since the db starts throwing WriteTimeoutExceptions.

I've tested this with CCM (4 nodes) and on an EC2 cluster (5 nodes, 8GB heap).
The problem seems to be the same in both environments.

On my local cluster, I've tried these changes with respect to the default
configuration:

concurrent_writes: 128

write_request_timeout_in_ms: 20

other configurations are here:
https://gist.github.com/giampaolotrapasso/ca21a83befd339075e07

*Other*

The exceptions seem random; sometimes they occur right at the beginning of the write.

*Questions:*

1. Is my model wrong? Am I missing some important detail?

2. What is the important information to look at for this kind of problem?

3. Why are the exceptions so random?

4. Is there some other C* parameter I can set to ensure that
WriteTimeoutException does not occur?

I hope I provided enough information to get some help.

Thank you in advance for any reply.


Giampaolo


Re: specifying listen_address

2016-02-08 Thread Bhuvan Rawal
Hi Ted,

Are you sure the path to yaml is correct?
For me (DSE 4.8.4) it is /etc/dse/cassandra/cassandra.yaml

On Mon, Feb 8, 2016 at 11:22 PM, Ted Yu  wrote:

> Hi,
> I downloaded and expanded DSE 4.8.4
>
> When I specify the following in resources/dse/conf/dse.yaml :
>
> listen_address: XX.YY
>
> I got:
> INFO  17:43:10  Loading settings from
> file:/home/cassandra/dse-4.8.4/resources/dse/conf/dse.yaml
> Exception in thread "main" java.lang.ExceptionInInitializerError
> at com.datastax.bdp.DseCoreModule.(DseCoreModule.java:43)
> at com.datastax.bdp.DseModule.getRequiredModules(DseModule.java:97)
> at
> com.datastax.bdp.server.AbstractDseModule.configure(AbstractDseModule.java:26)
> ...
> Caused by: org.yaml.snakeyaml.error.YAMLException: Unable to find property
> 'listen_address' on class: com.datastax.bdp.config.Config
> at
> org.yaml.snakeyaml.introspector.PropertyUtils.getProperty(PropertyUtils.java:132)
> at
> org.yaml.snakeyaml.introspector.PropertyUtils.getProperty(PropertyUtils.java:121)
>
> Some hint is appreciated.
>
> If this is not the proper mailing list, please direct me to proper one.
>
> Thanks
>
>


Re: specifying listen_address

2016-02-08 Thread Ted Yu
I didn't start Cassandra as a service.

I am starting it as a stand-alone process. Is a multi-node setup not supported
in stand-alone mode?

Caused by: org.yaml.snakeyaml.error.YAMLException: Unable to find property
'cluster_name' on class: com.datastax.bdp.config.Config
at
org.yaml.snakeyaml.introspector.PropertyUtils.getProperty(PropertyUtils.java:132)
at
org.yaml.snakeyaml.introspector.PropertyUtils.getProperty(PropertyUtils.java:121)

Thanks

On Mon, Feb 8, 2016 at 10:04 AM, Bhuvan Rawal  wrote:

> Hi Ted,
>
> Are you sure the path to yaml is correct?
> For me(DSE 4.8.4) it is /etc/dse/cassandra/cassandra.yaml
>
> On Mon, Feb 8, 2016 at 11:22 PM, Ted Yu  wrote:
>
>> Hi,
>> I downloaded and expanded DSE 4.8.4
>>
>> When I specify the following in resources/dse/conf/dse.yaml :
>>
>> listen_address: XX.YY
>>
>> I got:
>> INFO  17:43:10  Loading settings from
>> file:/home/cassandra/dse-4.8.4/resources/dse/conf/dse.yaml
>> Exception in thread "main" java.lang.ExceptionInInitializerError
>> at com.datastax.bdp.DseCoreModule.(DseCoreModule.java:43)
>> at com.datastax.bdp.DseModule.getRequiredModules(DseModule.java:97)
>> at
>> com.datastax.bdp.server.AbstractDseModule.configure(AbstractDseModule.java:26)
>> ...
>> Caused by: org.yaml.snakeyaml.error.YAMLException: Unable to find
>> property 'listen_address' on class: com.datastax.bdp.config.Config
>> at
>> org.yaml.snakeyaml.introspector.PropertyUtils.getProperty(PropertyUtils.java:132)
>> at
>> org.yaml.snakeyaml.introspector.PropertyUtils.getProperty(PropertyUtils.java:121)
>>
>> Some hint is appreciated.
>>
>> If this is not the proper mailing list, please direct me to proper one.
>>
>> Thanks
>>
>>
>


Re: specifying listen_address

2016-02-08 Thread Bhuvan Rawal
In either case, these properties should be placed in the cassandra.yaml file
rather than in dse.yaml.

You can find it in the /resources/cassandra/conf directory.

On Mon, Feb 8, 2016 at 11:41 PM, Ted Yu  wrote:

> I didn't start cassandra as service.
>
> I am starting as stand-alone process. Is multiple node setup not supported
> in stand-alone mode ?
>
> Caused by: org.yaml.snakeyaml.error.YAMLException: Unable to find property
> 'cluster_name' on class: com.datastax.bdp.config.Config
> at
> org.yaml.snakeyaml.introspector.PropertyUtils.getProperty(PropertyUtils.java:132)
> at
> org.yaml.snakeyaml.introspector.PropertyUtils.getProperty(PropertyUtils.java:121)
>
> Thanks
>
> On Mon, Feb 8, 2016 at 10:04 AM, Bhuvan Rawal  wrote:
>
>> Hi Ted,
>>
>> Are you sure the path to yaml is correct?
>> For me(DSE 4.8.4) it is /etc/dse/cassandra/cassandra.yaml
>>
>> On Mon, Feb 8, 2016 at 11:22 PM, Ted Yu  wrote:
>>
>>> Hi,
>>> I downloaded and expanded DSE 4.8.4
>>>
>>> When I specify the following in resources/dse/conf/dse.yaml :
>>>
>>> listen_address: XX.YY
>>>
>>> I got:
>>> INFO  17:43:10  Loading settings from
>>> file:/home/cassandra/dse-4.8.4/resources/dse/conf/dse.yaml
>>> Exception in thread "main" java.lang.ExceptionInInitializerError
>>> at com.datastax.bdp.DseCoreModule.(DseCoreModule.java:43)
>>> at com.datastax.bdp.DseModule.getRequiredModules(DseModule.java:97)
>>> at
>>> com.datastax.bdp.server.AbstractDseModule.configure(AbstractDseModule.java:26)
>>> ...
>>> Caused by: org.yaml.snakeyaml.error.YAMLException: Unable to find
>>> property 'listen_address' on class: com.datastax.bdp.config.Config
>>> at
>>> org.yaml.snakeyaml.introspector.PropertyUtils.getProperty(PropertyUtils.java:132)
>>> at
>>> org.yaml.snakeyaml.introspector.PropertyUtils.getProperty(PropertyUtils.java:121)
>>>
>>> Some hint is appreciated.
>>>
>>> If this is not the proper mailing list, please direct me to proper one.
>>>
>>> Thanks
>>>
>>>
>>
>


Re: Can we set TTL on individual fields (columns) using the Datastax java-driver

2016-02-08 Thread DuyHai Doan
I think you should direct your request to the java driver mailing list:
https://groups.google.com/a/lists.datastax.com/forum/#!forum/java-driver-user

To answer your question, no, there is no @Ttl annotation on the
driver-mapping module, even in the latest release:
https://github.com/datastax/java-driver/tree/3.0/driver-mapping/src/main/java/com/datastax/driver/mapping/annotations

You'll need to handle the insertion with TTL yourself, or look at other
object mappers.
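
A minimal sketch of doing it by hand with the driver (the keyspace and table
names below are assumptions based on your mapping); note that an UPDATE with
USING TTL only applies the TTL to the columns it sets:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class TtlByHand {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            // Only pojo_temporary_guest gets the 10-second TTL; pojo_key is untouched.
            PreparedStatement ps = session.prepare(
                "UPDATE ks.a USING TTL ? SET pojo_temporary_guest = ? WHERE pojo_key = ?");
            session.execute(ps.bind(10, "ajay", 1));
        }
    }
}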


On Mon, Feb 8, 2016 at 8:27 PM, Ajay Garg  wrote:

> Something like ::
>
>
> ##
> class A {
>
>   @Id
>   @Column (name = "pojo_key")
>   int key;
>
>   @Ttl(10)
>   @Column (name = "pojo_temporary_guest")
>   String guest;
>
> }
> ##
>
>
> When I persist, let's say value "ajay" in guest-field
> (pojo_temporary_guest column), it stays forever, and does not become "null"
> after 10 seconds.
>
> Kindly point me what I am doing wrong.
> I will be grateful.
>
>
> Thanks and Regards,
> Ajay
>


Latest stable release

2016-02-08 Thread Ravi Krishna
We are starting a new project in Cassandra. Is 3.2 stable enough to be used
in production? If not, which is the most stable version in 2.x?

thanks.


Re: CASSANDRA-8072

2016-02-08 Thread Bhuvan Rawal
Your config looks fine to me. I tried reproducing the scenario by setting
localhost in listen_address, rpc_address and the seed list, and it worked fine.
Earlier I had the node's local IP in those 3 fields and it was working fine as well.

Looks like there is some other issue here.

On Tue, Feb 9, 2016 at 12:49 AM, Ted Yu  wrote:

> Here it is:
> http://pastebin.com/QEdjtAj6
>
> XX.YY is localhost in this case.
>
> On Mon, Feb 8, 2016 at 11:03 AM, Bhuvan Rawal  wrote:
>
>> could you paste your cassandra.yaml here, except for commented out lines?
>>
>> On Tue, Feb 9, 2016 at 12:30 AM, Ted Yu  wrote:
>>
>>> The issue I described was observed on the seed node.
>>>
>>> Both rpc_address and listen_address point to localhost.
>>>
>>> bq. What addresses are there in the seed list?
>>>
>>> The IP of the seed node.
>>>
>>> I haven't come to starting non-seed node(s) yet.
>>>
>>> Thanks for the quick response.
>>>
>>> On Mon, Feb 8, 2016 at 10:50 AM, Bhuvan Rawal 
>>> wrote:
>>>
 Hi Ted,

 Have you specified the listen_address and rpc_address? What addresses
 are there in the seed list?

 Have you started seed first and after waiting for 30 seconds started
 other nodes?


 On Tue, Feb 9, 2016 at 12:14 AM, Ted Yu  wrote:

> Hi,
> I am trying to setup a cluster with DSE 4.8.4
>
> I added the following in resources/cassandra/conf/cassandra.yaml :
>
> cluster_name: 'cass'
>
> which resulted in:
>
> http://pastebin.com/27adxKTM
>
> This seems to be resolved by CASSANDRA-8072
>
> My question is whether there is workaround ?
> If not, when can I expect 2.1.13 release ?
>
> Thanks
>


>>>
>>
>


Re: Latest stable release

2016-02-08 Thread Carlos Rolo
I honestly go with 2.1.13 unless you need the features on 2.2.x.

I would not recommend 3.x for now (unless you need the features).


Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Mobile: +351 91 891 81 00 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Mon, Feb 8, 2016 at 9:34 PM, Ravi Krishna 
wrote:

> We are starting a new project in Cassandra. Is 3.2 stable enough to be
> used in production. If not, which is the most stable version in 2.x.
>
> thanks.
>

-- 

Re: CASSANDRA-8072

2016-02-08 Thread Ted Yu
The issue I described was observed on the seed node.

Both rpc_address and listen_address point to localhost.

bq. What addresses are there in the seed list?

The IP of the seed node.

I haven't come to starting non-seed node(s) yet.

Thanks for the quick response.

On Mon, Feb 8, 2016 at 10:50 AM, Bhuvan Rawal  wrote:

> Hi Ted,
>
> Have you specified the listen_address and rpc_address? What addresses are
> there in the seed list?
>
> Have you started seed first and after waiting for 30 seconds started other
> nodes?
>
>
> On Tue, Feb 9, 2016 at 12:14 AM, Ted Yu  wrote:
>
>> Hi,
>> I am trying to setup a cluster with DSE 4.8.4
>>
>> I added the following in resources/cassandra/conf/cassandra.yaml :
>>
>> cluster_name: 'cass'
>>
>> which resulted in:
>>
>> http://pastebin.com/27adxKTM
>>
>> This seems to be resolved by CASSANDRA-8072
>>
>> My question is whether there is workaround ?
>> If not, when can I expect 2.1.13 release ?
>>
>> Thanks
>>
>
>


Re: CASSANDRA-8072

2016-02-08 Thread Bhuvan Rawal
could you paste your cassandra.yaml here, except for commented out lines?

On Tue, Feb 9, 2016 at 12:30 AM, Ted Yu  wrote:

> The issue I described was observed on the seed node.
>
> Both rpc_address and listen_address point to localhost.
>
> bq. What addresses are there in the seed list?
>
> The IP of the seed node.
>
> I haven't come to starting non-seed node(s) yet.
>
> Thanks for the quick response.
>
> On Mon, Feb 8, 2016 at 10:50 AM, Bhuvan Rawal  wrote:
>
>> Hi Ted,
>>
>> Have you specified the listen_address and rpc_address? What addresses are
>> there in the seed list?
>>
>> Have you started seed first and after waiting for 30 seconds started
>> other nodes?
>>
>>
>> On Tue, Feb 9, 2016 at 12:14 AM, Ted Yu  wrote:
>>
>>> Hi,
>>> I am trying to setup a cluster with DSE 4.8.4
>>>
>>> I added the following in resources/cassandra/conf/cassandra.yaml :
>>>
>>> cluster_name: 'cass'
>>>
>>> which resulted in:
>>>
>>> http://pastebin.com/27adxKTM
>>>
>>> This seems to be resolved by CASSANDRA-8072
>>>
>>> My question is whether there is workaround ?
>>> If not, when can I expect 2.1.13 release ?
>>>
>>> Thanks
>>>
>>
>>
>


Can we set TTL on individual fields (columns) using the Datastax java-driver

2016-02-08 Thread Ajay Garg
Something like ::


##
class A {

  @Id
  @Column (name = "pojo_key")
  int key;

  @Ttl(10)
  @Column (name = "pojo_temporary_guest")
  String guest;

}
##


When I persist, let's say, the value "ajay" in the guest field (pojo_temporary_guest
column), it stays forever and does not become "null" after 10 seconds.

Kindly point me what I am doing wrong.
I will be grateful.


Thanks and Regards,
Ajay


Re: CASSANDRA-8072

2016-02-08 Thread Ted Yu
If I apply the fix from CASSANDRA-8072 onto a 2.1.12 cluster, which files
should I replace ?

Thanks

On Mon, Feb 8, 2016 at 1:07 PM, Bhuvan Rawal  wrote:

> Your config looks fine to me,  i tried reproducing the scenario by setting
> localhost in listen_address,rpc_address and seed list, and it worked fine,
> I had earlier the node local ip in the 3 fields and it was working fine.
>
> Looks like there is some other issue here.
>
> On Tue, Feb 9, 2016 at 12:49 AM, Ted Yu  wrote:
>
>> Here it is:
>> http://pastebin.com/QEdjtAj6
>>
>> XX.YY is localhost in this case.
>>
>> On Mon, Feb 8, 2016 at 11:03 AM, Bhuvan Rawal 
>> wrote:
>>
>>> could you paste your cassandra.yaml here, except for commented out lines?
>>>
>>> On Tue, Feb 9, 2016 at 12:30 AM, Ted Yu  wrote:
>>>
 The issue I described was observed on the seed node.

 Both rpc_address and listen_address point to localhost.

 bq. What addresses are there in the seed list?

 The IP of the seed node.

 I haven't come to starting non-seed node(s) yet.

 Thanks for the quick response.

 On Mon, Feb 8, 2016 at 10:50 AM, Bhuvan Rawal 
 wrote:

> Hi Ted,
>
> Have you specified the listen_address and rpc_address? What addresses
> are there in the seed list?
>
> Have you started seed first and after waiting for 30 seconds started
> other nodes?
>
>
> On Tue, Feb 9, 2016 at 12:14 AM, Ted Yu  wrote:
>
>> Hi,
>> I am trying to setup a cluster with DSE 4.8.4
>>
>> I added the following in resources/cassandra/conf/cassandra.yaml :
>>
>> cluster_name: 'cass'
>>
>> which resulted in:
>>
>> http://pastebin.com/27adxKTM
>>
>> This seems to be resolved by CASSANDRA-8072
>>
>> My question is whether there is workaround ?
>> If not, when can I expect 2.1.13 release ?
>>
>> Thanks
>>
>
>

>>>
>>
>


Re: Latest stable release

2016-02-08 Thread Jack Krupansky
2.1.x and 2.2.x are certainly stable, but... they will only be supported
until November. The new tick-tock release strategy is designed to ensure
stability. 3.3, which is (non-critical) bug fixes to 3.2, should be out
shortly (vote in progress.) 3.4, with features such as SASI, will probably
be out a month or so later.

-- Jack Krupansky

On Mon, Feb 8, 2016 at 4:54 PM, Will Hayworth 
wrote:

> We're having good luck running 3.2.1 in production, but ours is a small
> cluster and we're very new at this. :)
>
> ___
> Will Hayworth
> Developer, Engagement Engine
> Atlassian
>
> My pronoun is "they". 
>
>
>
> On Mon, Feb 8, 2016 at 1:34 PM, Ravi Krishna 
> wrote:
>
>> We are starting a new project in Cassandra. Is 3.2 stable enough to be
>> used in production. If not, which is the most stable version in 2.x.
>>
>> thanks.
>>
>
>


Re: Writing a large blob returns WriteTimeoutException

2016-02-08 Thread Giampaolo Trapasso
At every step I write MyConfig.blobsize bytes, which I configured
to be from 10 to 100. This allows me to "simulate" the writing of a
600MB file, as configured on GitHub (
https://github.com/giampaolotrapasso/cassandratest/blob/master/src/main/resources/application.conf
).

Giampaolo

2016-02-08 23:25 GMT+01:00 Jack Krupansky :

> You appear to be writing the entire bob on each chunk rather than the
> slice of the blob.
>
> -- Jack Krupansky
>
> On Mon, Feb 8, 2016 at 1:45 PM, Giampaolo Trapasso <
> giampaolo.trapa...@radicalbit.io> wrote:
>
>> Hi to all,
>>
>> I'm trying to put a large binary file (> 500MB) on a C* cluster as fast
>> as I can but I get some (many) WriteTimeoutExceptions.
>>
>> I created a small POC that isolates the problem I'm facing. Here you will
>> find the code: https://github.com/giampaolotrapasso/cassandratest,
>>
>> *Main details about it:*
>>
>>- I try to write the file into chunks (*data* field) <= 1MB (1MB is
>>recommended max size for single cell),
>>
>>
>>- Chunks are grouped into buckets. Every bucket is a partition row,
>>- Buckets are grouped by UUIDs.
>>
>>
>>- Chunk size and bucket size are configurable from app so I can try
>>different configurations and see what happens.
>>
>>
>>- Trying to max throughput, I execute asynch insertions, however to
>>avoid too much pressure on the db, after a threshold, I wait at least for 
>> a
>>finished insert to add another (this part is quite raw in my code but I
>>think it's not so important). Also this parameter is configurable to test
>>different combinations.
>>
>> This is the table on db:
>>
>> CREATE TABLE blobtest.store (
>> uuid uuid,
>> bucket bigint,
>> start bigint,
>> data blob,
>> end bigint,
>> PRIMARY KEY ((uuid, bucket), start)
>> )
>>
>> and this is the main code (Scala, but I hope is be generally readable)
>>
>> val statement = client.session.prepare("INSERT INTO
>> blobTest.store(uuid, bucket, start, end, data) VALUES (?, ?, ?, ?, ?) if
>> not exists;")
>>
>> val blob = new Array[Byte](MyConfig.blobSize)
>> scala.util.Random.nextBytes(blob)
>>
>> write(client,
>>   numberOfRecords = MyConfig.recordNumber,
>>   bucketSize = MyConfig.bucketSize,
>>   maxConcurrentWrites = MyConfig.maxFutures,
>>   blob,
>>   statement)
>>
>> where write is
>>
>> def write(database: Database, numberOfRecords: Int, bucketSize: Int,
>> maxConcurrentWrites: Int,
>> blob: Array[Byte], statement: PreparedStatement): Unit = {
>>
>> val uuid: UUID = UUID.randomUUID()
>> var count = 0;
>>
>> //Javish loop
>> while (count < numberOfRecords) {
>>   val record = Record(
>> uuid = uuid,
>> bucket = count / bucketSize,
>> start = ((count % bucketSize)) * blob.length,
>> end = ((count % bucketSize) + 1) * blob.length,
>> bytes = blob
>>   )
>>   asynchWrite(database, maxConcurrentWrites, statement, record)
>>   count += 1
>> }
>>
>> waitDbWrites()
>>   }
>>
>> and asynchWrite is just binding to statement
>>
>> *Problem*
>>
>> The problem is that when I try to increase the chunck size, or the number
>> of asynch insert or the size of the bucket (ie number of chuncks), app
>> become unstable since the db starts throwing WriteTimeoutException.
>>
>> I've tested the stuff on the CCM (4 nodes) and on a EC2 cluster (5 nodes,
>> 8GB Heap). Problem seems the same on both enviroments.
>>
>> On my local cluster, I've tried to change respect to default
>> configuration:
>>
>> concurrent_writes: 128
>>
>> write_request_timeout_in_ms: 20
>>
>> other configurations are here:
>> https://gist.github.com/giampaolotrapasso/ca21a83befd339075e07
>>
>> *Other*
>>
>> Exceptions seems random, sometimes are at the beginning of the write
>>
>> *Questions:*
>>
>> 1. Is my model wrong? Am I missing some important detail?
>>
>> 2. What are the important information to look at for this kind of problem?
>>
>> 3. Why exceptions are so random?
>>
>> 4. There is some other C* parameter I can set to assure that
>> WriteTimeoutException does not occur?
>>
>> I hope I provided enough information to get some help.
>>
>> Thank you in advance for any reply.
>>
>>
>> Giampaolo
>>
>>
>>
>>
>>
>>
>>
>>
>


Re: Cassandra Collections performance issue

2016-02-08 Thread Agrawal, Pratik
Hello all,

Recently we changed one of our table fields to a Map in Cassandra 2.1.11. 
Currently we read every field from the map and overwrite the map values. The map 
is of size 3. We saw that writes are 30-40% slower while reads are 70-80% slower. 
Please find below some metrics that may help.

My question is: are there any known issues with Cassandra map performance? As I 
understand it, each CQL3 map entry maps to a column in Cassandra, so with that 
assumption we are just creating 3 columns, right? Any insight into this issue 
would be helpful.
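
For reference, a sketch of the two ways the map can be written, with
hypothetical keyspace/table/column names; a full overwrite replaces (and first
deletes) the previous collection, while a per-entry update writes only that
cell:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class MapWriteShapes {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            // Full overwrite: replaces the whole collection (the old one is deleted first).
            session.execute(
                "UPDATE ks.table1 SET attrs = {'a':'1','b':'2','c':'3'} WHERE id = ?", 42);
            // Per-entry update: writes just one cell of the map.
            session.execute(
                "UPDATE ks.table1 SET attrs['a'] = '1' WHERE id = ?", 42);
        }
    }
}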

Datastax Java Driver 2.1.6.
Machine: Amazon C3 2x large
CPU – pretty much same as before (around 30%)
Memory – max around 4.8 GB

CFSTATS:

Keyspace: Keyspace
Read Count: 28359044
Read Latency: 2.847392469259542 ms.
Write Count: 1152765
Write Latency: 0.14778018590085576 ms.
Pending Flushes: 0
Table: table1
SSTable count: 1
SSTables in each level: [1, 0, 0, 0, 0, 0, 0, 0, 0]
Space used (live): 4119699
Space used (total): 4119699
Space used by snapshots (total): 90323640
Off heap memory used (total): 2278
SSTable Compression Ratio: 0.23172161124142604
Number of keys (estimate): 14
Memtable cell count: 6437
Memtable data size: 872912
Memtable off heap memory used: 0
Memtable switch count: 7626
Local read count: 27754634
Local read latency: 1.921 ms
Local write count: 1113668
Local write latency: 0.142 ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.0
Bloom filter space used: 96
Bloom filter off heap memory used: 88
Index summary off heap memory used: 46
Compression metadata off heap memory used: 2144
Compacted partition minimum bytes: 315853
Compacted partition maximum bytes: 4055269
Compacted partition mean bytes: 2444011
Average live cells per slice (last five minutes): 17.536775249005437
Maximum live cells per slice (last five minutes): 1225.0
Average tombstones per slice (last five minutes): 34.99979575985972
Maximum tombstones per slice (last five minutes): 3430.0

Table: table2
SSTable count: 1
SSTables in each level: [1, 0, 0, 0, 0, 0, 0, 0, 0]
Space used (live): 869900
Space used (total): 869900
Space used by snapshots (total): 17279824
Off heap memory used (total): 387
SSTable Compression Ratio: 0.3999013540551859
Number of keys (estimate): 2
Memtable cell count: 1958
Memtable data size: 8
Memtable off heap memory used: 0
Memtable switch count: 7484
Local read count: 604412
Local read latency: 45.421 ms
Local write count: 39097
Local write latency: 0.337 ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.0
Bloom filter space used: 96
Bloom filter off heap memory used: 88
Index summary off heap memory used: 35
Compression metadata off heap memory used: 264
Compacted partition minimum bytes: 1955667
Compacted partition maximum bytes: 2346799
Compacted partition mean bytes: 2346799
Average live cells per slice (last five minutes): 1963.0632242863855
Maximum live cells per slice (last five minutes): 5001.0
Average tombstones per slice (last five minutes): 0.0
Maximum tombstones per slice (last five minutes): 0.0

NETSTATS:
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 2853996
Mismatch (Blocking): 67386
Mismatch (Background): 9233
Pool Name     Active   Pending   Completed
Commands      n/a            0    33953165
Responses     n/a            0      370301

IOSTAT
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
  15.200.830.560.100.04   83.27

Device:tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
xvda  2.79 0.4769.86 553719   82619304
xvdb 14.49 3.39   775.564009600  917227536
xvdc 15.13 2.98   819.933522250  969708944
dm-0 49.67 6.36  1595.497525858 1886936320

TPSTAT:
Pool Name               Active   Pending   Completed   Blocked   All time blocked
MutationStage                0         0     1199683         0                  0
ReadStage                    0         0    28449207         0                  0
RequestResponseStage         0         0    33983356         0                  0
ReadRepairStage              0         0     2865749         0                  0
CounterMutationStage         0         0           0         0                  0
MiscStage                    0         0           0         0                  0
HintedHandoff                0         0           2         0                  0
GossipStage                  0         0      270364         0                  0
CacheCleanupExecutor         0         0           0         0                  0
InternalResponseStage        0         0           0         0                  0
CommitLogArchiver            0         0           0         0                  0

Re: Writing a large blob returns WriteTimeoutException

2016-02-08 Thread Jack Krupansky
You appear to be writing the entire blob for each chunk rather than the slice
of the blob.

-- Jack Krupansky

On Mon, Feb 8, 2016 at 1:45 PM, Giampaolo Trapasso <
giampaolo.trapa...@radicalbit.io> wrote:

> Hi to all,
>
> I'm trying to put a large binary file (> 500MB) on a C* cluster as fast as
> I can but I get some (many) WriteTimeoutExceptions.
>
> I created a small POC that isolates the problem I'm facing. Here you will
> find the code: https://github.com/giampaolotrapasso/cassandratest,
>
> *Main details about it:*
>
>- I try to write the file into chunks (*data* field) <= 1MB (1MB is
>recommended max size for single cell),
>
>
>- Chunks are grouped into buckets. Every bucket is a partition row,
>- Buckets are grouped by UUIDs.
>
>
>- Chunk size and bucket size are configurable from app so I can try
>different configurations and see what happens.
>
>
>- Trying to max throughput, I execute asynch insertions, however to
>avoid too much pressure on the db, after a threshold, I wait at least for a
>finished insert to add another (this part is quite raw in my code but I
>think it's not so important). Also this parameter is configurable to test
>different combinations.
>
> This is the table on db:
>
> CREATE TABLE blobtest.store (
> uuid uuid,
> bucket bigint,
> start bigint,
> data blob,
> end bigint,
> PRIMARY KEY ((uuid, bucket), start)
> )
>
> and this is the main code (Scala, but I hope is be generally readable)
>
> val statement = client.session.prepare("INSERT INTO
> blobTest.store(uuid, bucket, start, end, data) VALUES (?, ?, ?, ?, ?) if
> not exists;")
>
> val blob = new Array[Byte](MyConfig.blobSize)
> scala.util.Random.nextBytes(blob)
>
> write(client,
>   numberOfRecords = MyConfig.recordNumber,
>   bucketSize = MyConfig.bucketSize,
>   maxConcurrentWrites = MyConfig.maxFutures,
>   blob,
>   statement)
>
> where write is
>
> def write(database: Database, numberOfRecords: Int, bucketSize: Int,
> maxConcurrentWrites: Int,
> blob: Array[Byte], statement: PreparedStatement): Unit = {
>
> val uuid: UUID = UUID.randomUUID()
> var count = 0;
>
> //Javish loop
> while (count < numberOfRecords) {
>   val record = Record(
> uuid = uuid,
> bucket = count / bucketSize,
> start = ((count % bucketSize)) * blob.length,
> end = ((count % bucketSize) + 1) * blob.length,
> bytes = blob
>   )
>   asynchWrite(database, maxConcurrentWrites, statement, record)
>   count += 1
> }
>
> waitDbWrites()
>   }
>
> and asynchWrite is just binding to statement
>
> *Problem*
>
> The problem is that when I try to increase the chunck size, or the number
> of asynch insert or the size of the bucket (ie number of chuncks), app
> become unstable since the db starts throwing WriteTimeoutException.
>
> I've tested the stuff on the CCM (4 nodes) and on a EC2 cluster (5 nodes,
> 8GB Heap). Problem seems the same on both enviroments.
>
> On my local cluster, I've tried to change respect to default
> configuration:
>
> concurrent_writes: 128
>
> write_request_timeout_in_ms: 20
>
> other configurations are here:
> https://gist.github.com/giampaolotrapasso/ca21a83befd339075e07
>
> *Other*
>
> Exceptions seems random, sometimes are at the beginning of the write
>
> *Questions:*
>
> 1. Is my model wrong? Am I missing some important detail?
>
> 2. What are the important information to look at for this kind of problem?
>
> 3. Why exceptions are so random?
>
> 4. There is some other C* parameter I can set to assure that
> WriteTimeoutException does not occur?
>
> I hope I provided enough information to get some help.
>
> Thank you in advance for any reply.
>
>
> Giampaolo
>
>
>
>
>
>
>
>


Re: Latest stable release

2016-02-08 Thread Will Hayworth
We're having good luck running 3.2.1 in production, but ours is a small
cluster and we're very new at this. :)

___
Will Hayworth
Developer, Engagement Engine
Atlassian

My pronoun is "they". 



On Mon, Feb 8, 2016 at 1:34 PM, Ravi Krishna 
wrote:

> We are starting a new project in Cassandra. Is 3.2 stable enough to be
> used in production. If not, which is the most stable version in 2.x.
>
> thanks.
>


Re: Writing a large blob returns WriteTimeoutException

2016-02-08 Thread Jim Ancona
The "if not exists" in your INSERT means that you are incurring a
performance hit by using Paxos. Do you need that? Have you tried your test
without  it?
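
For comparison, dropping the clause from the prepared statement in your POC
would look like this:

// The same statement as in the POC, minus the conditional clause that triggers Paxos.
PreparedStatement statement = session.prepare(
    "INSERT INTO blobTest.store(uuid, bucket, start, end, data) VALUES (?, ?, ?, ?, ?);");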

Jim


Re: CASSANDRA-8072

2016-02-08 Thread Stefania Alborghetti
CASSANDRA-8072 is not going to help you because the code that fails
(checkForEndpointCollision()) should not execute for seeds.

I think the problem is that there are no seeds in cassandra.yaml:

- seeds: "XX.YY"

If listen_address is localhost then try:

- seeds: "127.0.0.1"


On Tue, Feb 9, 2016 at 5:58 AM, Ted Yu  wrote:

> If I apply the fix from CASSANDRA-8072 onto a 2.1.12 cluster, which files
> should I replace ?
>
> Thanks
>
> On Mon, Feb 8, 2016 at 1:07 PM, Bhuvan Rawal  wrote:
>
>> Your config looks fine to me,  i tried reproducing the scenario by
>> setting localhost in listen_address,rpc_address and seed list, and it
>> worked fine, I had earlier the node local ip in the 3 fields and it was
>> working fine.
>>
>> Looks like there is some other issue here.
>>
>> On Tue, Feb 9, 2016 at 12:49 AM, Ted Yu  wrote:
>>
>>> Here it is:
>>> http://pastebin.com/QEdjtAj6
>>>
>>> XX.YY is localhost in this case.
>>>
>>> On Mon, Feb 8, 2016 at 11:03 AM, Bhuvan Rawal 
>>> wrote:
>>>
 could you paste your cassandra.yaml here, except for commented out
 lines?

 On Tue, Feb 9, 2016 at 12:30 AM, Ted Yu  wrote:

> The issue I described was observed on the seed node.
>
> Both rpc_address and listen_address point to localhost.
>
> bq. What addresses are there in the seed list?
>
> The IP of the seed node.
>
> I haven't come to starting non-seed node(s) yet.
>
> Thanks for the quick response.
>
> On Mon, Feb 8, 2016 at 10:50 AM, Bhuvan Rawal 
> wrote:
>
>> Hi Ted,
>>
>> Have you specified the listen_address and rpc_address? What addresses
>> are there in the seed list?
>>
>> Have you started seed first and after waiting for 30 seconds started
>> other nodes?
>>
>>
>> On Tue, Feb 9, 2016 at 12:14 AM, Ted Yu  wrote:
>>
>>> Hi,
>>> I am trying to setup a cluster with DSE 4.8.4
>>>
>>> I added the following in resources/cassandra/conf/cassandra.yaml :
>>>
>>> cluster_name: 'cass'
>>>
>>> which resulted in:
>>>
>>> http://pastebin.com/27adxKTM
>>>
>>> This seems to be resolved by CASSANDRA-8072
>>>
>>> My question is whether there is workaround ?
>>> If not, when can I expect 2.1.13 release ?
>>>
>>> Thanks
>>>
>>
>>
>

>>>
>>
>


-- 


[image: datastax_logo.png] 

Stefania Alborghetti

Apache Cassandra Software Engineer

|+852 6114 9265| stefania.alborghe...@datastax.com


Re: Writing a large blob returns WriteTimeoutException

2016-02-08 Thread Jack Krupansky
I'm a little lost now. Where are you specifying chunk size, which is what
should be varying, as opposed to blob size? And what exactly is the number
of records? Seems like you should be computing number of chunks from blob
size divided by chunk size. And it still seems like you are writing the
same data for each chunk.
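
In other words, something along these lines (illustrative numbers only, using
the 600 MB file and 1 MB chunks from your description):

// Illustrative arithmetic only: the chunk count follows from blob size and chunk size.
long blobSize  = 600L * 1024 * 1024;   // ~600 MB file
long chunkSize =   1L * 1024 * 1024;   // 1 MB chunks
long numberOfChunks = (blobSize + chunkSize - 1) / chunkSize;   // ceiling division = 600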

-- Jack Krupansky

On Mon, Feb 8, 2016 at 5:34 PM, Giampaolo Trapasso <
giampaolo.trapa...@radicalbit.io> wrote:

> I write at every step MyConfig.blobsize number of bytes, that I configured
> to be from 10 to 100. This allows me to "simulate" the writing of a
> 600Mb file, as configuration on github (
> https://github.com/giampaolotrapasso/cassandratest/blob/master/src/main/resources/application.conf
>
>
> *)*
>  Giampaolo
>
> 2016-02-08 23:25 GMT+01:00 Jack Krupansky :
>
>> You appear to be writing the entire bob on each chunk rather than the
>> slice of the blob.
>>
>> -- Jack Krupansky
>>
>> On Mon, Feb 8, 2016 at 1:45 PM, Giampaolo Trapasso <
>> giampaolo.trapa...@radicalbit.io> wrote:
>>
>>> Hi to all,
>>>
>>> I'm trying to put a large binary file (> 500MB) on a C* cluster as fast
>>> as I can but I get some (many) WriteTimeoutExceptions.
>>>
>>> I created a small POC that isolates the problem I'm facing. Here you
>>> will find the code: https://github.com/giampaolotrapasso/cassandratest,
>>>
>>> *Main details about it:*
>>>
>>>- I try to write the file into chunks (*data* field) <= 1MB (1MB is
>>>recommended max size for single cell),
>>>
>>>
>>>- Chunks are grouped into buckets. Every bucket is a partition row,
>>>- Buckets are grouped by UUIDs.
>>>
>>>
>>>- Chunk size and bucket size are configurable from app so I can try
>>>different configurations and see what happens.
>>>
>>>
>>>- Trying to max throughput, I execute asynch insertions, however to
>>>avoid too much pressure on the db, after a threshold, I wait at least 
>>> for a
>>>finished insert to add another (this part is quite raw in my code but I
>>>think it's not so important). Also this parameter is configurable to test
>>>different combinations.
>>>
>>> This is the table on db:
>>>
>>> CREATE TABLE blobtest.store (
>>> uuid uuid,
>>> bucket bigint,
>>> start bigint,
>>> data blob,
>>> end bigint,
>>> PRIMARY KEY ((uuid, bucket), start)
>>> )
>>>
>>> and this is the main code (Scala, but I hope is be generally readable)
>>>
>>> val statement = client.session.prepare("INSERT INTO
>>> blobTest.store(uuid, bucket, start, end, data) VALUES (?, ?, ?, ?, ?) if
>>> not exists;")
>>>
>>> val blob = new Array[Byte](MyConfig.blobSize)
>>> scala.util.Random.nextBytes(blob)
>>>
>>> write(client,
>>>   numberOfRecords = MyConfig.recordNumber,
>>>   bucketSize = MyConfig.bucketSize,
>>>   maxConcurrentWrites = MyConfig.maxFutures,
>>>   blob,
>>>   statement)
>>>
>>> where write is
>>>
>>> def write(database: Database, numberOfRecords: Int, bucketSize: Int,
>>> maxConcurrentWrites: Int,
>>> blob: Array[Byte], statement: PreparedStatement): Unit = {
>>>
>>> val uuid: UUID = UUID.randomUUID()
>>> var count = 0;
>>>
>>> //Javish loop
>>> while (count < numberOfRecords) {
>>>   val record = Record(
>>> uuid = uuid,
>>> bucket = count / bucketSize,
>>> start = ((count % bucketSize)) * blob.length,
>>> end = ((count % bucketSize) + 1) * blob.length,
>>> bytes = blob
>>>   )
>>>   asynchWrite(database, maxConcurrentWrites, statement, record)
>>>   count += 1
>>> }
>>>
>>> waitDbWrites()
>>>   }
>>>
>>> and asynchWrite is just binding to statement
>>>
>>> *Problem*
>>>
>>> The problem is that when I try to increase the chunck size, or the
>>> number of asynch insert or the size of the bucket (ie number of chuncks),
>>> app become unstable since the db starts throwing WriteTimeoutException.
>>>
>>> I've tested the stuff on the CCM (4 nodes) and on a EC2 cluster (5
>>> nodes, 8GB Heap). Problem seems the same on both enviroments.
>>>
>>> On my local cluster, I've tried to change respect to default
>>> configuration:
>>>
>>> concurrent_writes: 128
>>>
>>> write_request_timeout_in_ms: 20
>>>
>>> other configurations are here:
>>> https://gist.github.com/giampaolotrapasso/ca21a83befd339075e07
>>>
>>> *Other*
>>>
>>> Exceptions seems random, sometimes are at the beginning of the write
>>>
>>> *Questions:*
>>>
>>> 1. Is my model wrong? Am I missing some important detail?
>>>
>>> 2. What are the important information to look at for this kind of
>>> problem?
>>>
>>> 3. Why exceptions are so random?
>>>
>>> 4. There is some other C* parameter I can set to assure that
>>> WriteTimeoutException does not occur?
>>>
>>> I hope I provided enough information to get some help.
>>>
>>> Thank you in advance for any reply.
>>>
>>>
>>> Giampaolo
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>


Re: Writing a large blob returns WriteTimeoutException

2016-02-08 Thread Giampaolo Trapasso
Sorry, Jack, for my poor description.
I write the same array of 1M bytes 600 times to make my life easier.
This allows me to simulate a 600MB file. It's just a simplification:
instead of generating a 600MB random array (or reading a real 600MB file)
and dividing it into 600 chunks, I write the same random array 600 times.
Every chunk corresponds to the data field in the table. I realize that the blob
parameter of the write method can lead to confusion (going to update it on
GitHub at least).

I think the content of the file is not important for the test itself;
I just need 1MB of data to be written. Let me know if there are any other
unclear spots.

giampaolo


2016-02-09 1:28 GMT+01:00 Jack Krupansky :

> I'm a little lost now. Where are you specifying chunk size, which is what
> should be varying, as opposed to blob size? And what exactly is the number
> of records? Seems like you should be computing number of chunks from blob
> size divided by chunk size. And it still seems like you are writing the
> same data for each chunk.
>
> -- Jack Krupansky
>
> On Mon, Feb 8, 2016 at 5:34 PM, Giampaolo Trapasso <
> giampaolo.trapa...@radicalbit.io> wrote:
>
>> I write at every step MyConfig.blobsize number of bytes, that I
>> configured to be from 10 to 100. This allows me to "simulate" the
>> writing of a 600Mb file, as configuration on github (
>> https://github.com/giampaolotrapasso/cassandratest/blob/master/src/main/resources/application.conf
>>
>>
>> *)*
>>  Giampaolo
>>
>> 2016-02-08 23:25 GMT+01:00 Jack Krupansky :
>>
>>> You appear to be writing the entire bob on each chunk rather than the
>>> slice of the blob.
>>>
>>> -- Jack Krupansky
>>>
>>> On Mon, Feb 8, 2016 at 1:45 PM, Giampaolo Trapasso <
>>> giampaolo.trapa...@radicalbit.io> wrote:
>>>
 Hi to all,

 I'm trying to put a large binary file (> 500MB) on a C* cluster as fast
 as I can but I get some (many) WriteTimeoutExceptions.

 I created a small POC that isolates the problem I'm facing. Here you
 will find the code: https://github.com/giampaolotrapasso/cassandratest,


 *Main details about it:*

- I try to write the file into chunks (*data* field) <= 1MB (1MB is
recommended max size for single cell),


- Chunks are grouped into buckets. Every bucket is a partition row,
- Buckets are grouped by UUIDs.


- Chunk size and bucket size are configurable from app so I can try
different configurations and see what happens.


- Trying to max throughput, I execute asynch insertions, however to
avoid too much pressure on the db, after a threshold, I wait at least 
 for a
finished insert to add another (this part is quite raw in my code but I
think it's not so important). Also this parameter is configurable to 
 test
different combinations.

 This is the table on db:

 CREATE TABLE blobtest.store (
 uuid uuid,
 bucket bigint,
 start bigint,
 data blob,
 end bigint,
 PRIMARY KEY ((uuid, bucket), start)
 )

 and this is the main code (Scala, but I hope is be generally readable)

 val statement = client.session.prepare("INSERT INTO
 blobTest.store(uuid, bucket, start, end, data) VALUES (?, ?, ?, ?, ?) if
 not exists;")

 val blob = new Array[Byte](MyConfig.blobSize)
 scala.util.Random.nextBytes(blob)

 write(client,
   numberOfRecords = MyConfig.recordNumber,
   bucketSize = MyConfig.bucketSize,
   maxConcurrentWrites = MyConfig.maxFutures,
   blob,
   statement)

 where write is

 def write(database: Database, numberOfRecords: Int, bucketSize: Int,
 maxConcurrentWrites: Int,
 blob: Array[Byte], statement: PreparedStatement): Unit = {

 val uuid: UUID = UUID.randomUUID()
 var count = 0;

 //Javish loop
 while (count < numberOfRecords) {
   val record = Record(
 uuid = uuid,
 bucket = count / bucketSize,
 start = ((count % bucketSize)) * blob.length,
 end = ((count % bucketSize) + 1) * blob.length,
 bytes = blob
   )
   asynchWrite(database, maxConcurrentWrites, statement, record)
   count += 1
 }

 waitDbWrites()
   }

 and asynchWrite is just binding to statement

 *Problem*

 The problem is that when I try to increase the chunck size, or the
 number of asynch insert or the size of the bucket (ie number of chuncks),
 app become unstable since the db starts throwing WriteTimeoutException.

 I've tested the stuff on the CCM (4 nodes) and on a EC2 cluster (5
 nodes, 8GB Heap). Problem seems the same on both enviroments.

 On my local cluster, I've 

Re: CASSANDRA-8072

2016-02-08 Thread Ted Yu
Thanks for the help, Stefania.
By using "127.0.0.1" , I was able to start Cassandra on that seed node
(XX.YY).
However, on other nodes, I pointed seed to XX.YY and observed the following
?
What did I miss ?


INFO  [main] 2016-02-08 16:44:56,607  OutboundTcpConnection.java:97 -
OutboundTcpConnection using coalescing strategy DISABLED
ERROR [main] 2016-02-08 16:45:27,626  CassandraDaemon.java:581 - Exception
encountered during startup
java.lang.RuntimeException: Unable to gossip with any seeds
at
org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1337)
~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
at
org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:541)
~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
at
org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:789)
~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
at
org.apache.cassandra.service.StorageService.initServer(StorageService.java:721)
~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
at
org.apache.cassandra.service.StorageService.initServer(StorageService.java:612)
~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
at
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:389)
~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:335)
~[dse-core-4.8.4.jar:4.8.4]
at
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:564)
~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
at com.datastax.bdp.DseModule.main(DseModule.java:74)
[dse-core-4.8.4.jar:4.8.4]
INFO  [Thread-2] 2016-02-08 16:45:27,629  DseDaemon.java:418 - DSE shutting
down...
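
For context, the cassandra.yaml on a non-seed node in a setup like this
would typically contain something like the sketch below (addresses are
placeholders, not the actual values from this cluster):

cluster_name: 'cass'
listen_address: 10.0.0.2        # this node's own address, reachable by the others
rpc_address: 10.0.0.2
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "XX.YY"      # the seed node's reachable address, not 127.0.0.1
# Note: gossip can only succeed if the seed node itself listens on an
# address the other nodes can reach.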

On Mon, Feb 8, 2016 at 4:25 PM, Stefania Alborghetti <
stefania.alborghe...@datastax.com> wrote:

> CASSANDRA-8072 is not going to help you because the code that fails
> (checkForEndpointCollision()) should not execute for seeds.
>
> I think the problem is that there are no seeds in cassandra.yaml:
>
> - seeds: "XX.YY"
>
> If listen_address is localhost then try:
>
> - seeds: "127.0.0.1"
>
>
> On Tue, Feb 9, 2016 at 5:58 AM, Ted Yu  wrote:
>
>> If I apply the fix from CASSANDRA-8072 onto a 2.1.12 cluster, which
>> files should I replace ?
>>
>> Thanks
>>
>> On Mon, Feb 8, 2016 at 1:07 PM, Bhuvan Rawal  wrote:
>>
>>> Your config looks fine to me,  i tried reproducing the scenario by
>>> setting localhost in listen_address,rpc_address and seed list, and it
>>> worked fine, I had earlier the node local ip in the 3 fields and it was
>>> working fine.
>>>
>>> Looks like there is some other issue here.
>>>
>>> On Tue, Feb 9, 2016 at 12:49 AM, Ted Yu  wrote:
>>>
 Here it is:
 http://pastebin.com/QEdjtAj6

 XX.YY is localhost in this case.

 On Mon, Feb 8, 2016 at 11:03 AM, Bhuvan Rawal 
 wrote:

> could you paste your cassandra.yaml here, except for commented out
> lines?
>
> On Tue, Feb 9, 2016 at 12:30 AM, Ted Yu  wrote:
>
>> The issue I described was observed on the seed node.
>>
>> Both rpc_address and listen_address point to localhost.
>>
>> bq. What addresses are there in the seed list?
>>
>> The IP of the seed node.
>>
>> I haven't come to starting non-seed node(s) yet.
>>
>> Thanks for the quick response.
>>
>> On Mon, Feb 8, 2016 at 10:50 AM, Bhuvan Rawal 
>> wrote:
>>
>>> Hi Ted,
>>>
>>> Have you specified the listen_address and rpc_address? What
>>> addresses are there in the seed list?
>>>
>>> Have you started seed first and after waiting for 30 seconds started
>>> other nodes?
>>>
>>>
>>> On Tue, Feb 9, 2016 at 12:14 AM, Ted Yu  wrote:
>>>
 Hi,
 I am trying to setup a cluster with DSE 4.8.4

 I added the following in resources/cassandra/conf/cassandra.yaml :

 cluster_name: 'cass'

 which resulted in:

 http://pastebin.com/27adxKTM

 This seems to be resolved by CASSANDRA-8072

 My question is whether there is workaround ?
 If not, when can I expect 2.1.13 release ?

 Thanks

>>>
>>>
>>
>

>>>
>>
>
>
> --
>
>
>
> Stefania Alborghetti
>
> Apache Cassandra Software Engineer
>
> |+852 6114 9265| stefania.alborghe...@datastax.com
>
>
>
>


Re: Writing a large blob returns WriteTimeoutException

2016-02-08 Thread Jack Krupansky
Bucket size is not disclosed. My recommendation is that partitions be no
more than about 10 MB (some people say 100 MB or 50 MB).

I think I'd recommend a smaller chunk size, like 128K or 256K. I would note
that Mongo's GridFS uses 256K chunks.
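
For a sense of scale, a quick back-of-the-envelope with those numbers against
the ~600 MB file from the original post (figures are illustrative only):

val fileSize        = 600L * 1024 * 1024            // ~600 MB, as in the POC
val chunkSize       = 256L * 1024                   // 256 KB chunks
val partitionCap    = 10L * 1024 * 1024             // ~10 MB per partition
val chunksPerBucket = partitionCap / chunkSize      // = 40 chunks per bucket
val totalChunks     = fileSize / chunkSize          // = 2400 chunks
val buckets         = totalChunks / chunksPerBucket // = 60 buckets (partitions)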

I don't know enough about the finer nuances of Cassandra internal row
management to know whether your chunks should be a little less than some
power of 2 so that a single row is not just over a power of 2 in size.

You may need more heap as well. Maybe you are hitting a high rate of GC,
which can cause timeouts.
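
If heap does turn out to be the issue, the usual knobs are in
conf/cassandra-env.sh; the values below are purely illustrative and would
need tuning for the actual hardware:

# cassandra-env.sh -- illustrative values only
MAX_HEAP_SIZE="12G"
HEAP_NEWSIZE="1200M"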

-- Jack Krupansky

On Mon, Feb 8, 2016 at 7:46 PM, Giampaolo Trapasso <
giampaolo.trapa...@radicalbit.io> wrote:

> Sorry Jack for my poor description,
> I write 600 times the same array of 1M of bytes to make my life easier.
> This allows me to simulate a 600Mb file. It's just a simplification.
> Instead of generating 600Mb random array (or reading a real 600Mb file),
> and dividing it into 600 chunks, I write the same random array 600 times.
> Every chunk corresponds to data field in the table. I realize that blob
> parameter of write method can lead to confusion (going to update on github
> at least)
>
> I think that the content of the file is not important for the test itself,
> I just need 1MB of data to be written. Let me know if there are some other
> unclear spots.
>
> giampaolo
>
>
> 2016-02-09 1:28 GMT+01:00 Jack Krupansky :
>
>> I'm a little lost now. Where are you specifying chunk size, which is what
>> should be varying, as opposed to blob size? And what exactly is the number
>> of records? Seems like you should be computing number of chunks from blob
>> size divided by chunk size. And it still seems like you are writing the
>> same data for each chunk.
>>
>> -- Jack Krupansky
>>
>> On Mon, Feb 8, 2016 at 5:34 PM, Giampaolo Trapasso <
>> giampaolo.trapa...@radicalbit.io> wrote:
>>
>>> I write at every step MyConfig.blobsize number of bytes, that I
>>> configured to be from 10 to 100. This allows me to "simulate" the
>>> writing of a 600Mb file, as configuration on github (
>>> https://github.com/giampaolotrapasso/cassandratest/blob/master/src/main/resources/application.conf
>>>
>>>
>>> *)*
>>>  Giampaolo
>>>
>>> 2016-02-08 23:25 GMT+01:00 Jack Krupansky :
>>>
 You appear to be writing the entire blob on each chunk rather than the
 slice of the blob.

 -- Jack Krupansky

 On Mon, Feb 8, 2016 at 1:45 PM, Giampaolo Trapasso <
 giampaolo.trapa...@radicalbit.io> wrote:

> Hi to all,
>
> I'm trying to put a large binary file (> 500MB) on a C* cluster as
> fast as I can but I get some (many) WriteTimeoutExceptions.
>
> I created a small POC that isolates the problem I'm facing. Here you
> will find the code: https://github.com/giampaolotrapasso/cassandratest,
>
>
> *Main details about it:*
>
>- I try to write the file into chunks (*data* field) <= 1MB (1MB
>is recommended max size for single cell),
>
>
>- Chunks are grouped into buckets. Every bucket is a partition row,
>- Buckets are grouped by UUIDs.
>
>
>- Chunk size and bucket size are configurable from app so I can
>try different configurations and see what happens.
>
>
>- Trying to max throughput, I execute asynch insertions, however
>to avoid too much pressure on the db, after a threshold, I wait at 
> least
>for a finished insert to add another (this part is quite raw in my 
> code but
>I think it's not so important). Also this parameter is configurable to 
> test
>different combinations.
>
> This is the table on db:
>
> CREATE TABLE blobtest.store (
> uuid uuid,
> bucket bigint,
> start bigint,
> data blob,
> end bigint,
> PRIMARY KEY ((uuid, bucket), start)
> )
>
> and this is the main code (Scala, but I hope is be generally readable)
>
> val statement = client.session.prepare("INSERT INTO
> blobTest.store(uuid, bucket, start, end, data) VALUES (?, ?, ?, ?, ?) if
> not exists;")
>
> val blob = new Array[Byte](MyConfig.blobSize)
> scala.util.Random.nextBytes(blob)
>
> write(client,
>   numberOfRecords = MyConfig.recordNumber,
>   bucketSize = MyConfig.bucketSize,
>   maxConcurrentWrites = MyConfig.maxFutures,
>   blob,
>   statement)
>
> where write is
>
> def write(database: Database, numberOfRecords: Int, bucketSize: Int,
> maxConcurrentWrites: Int,
> blob: Array[Byte], statement: PreparedStatement): Unit = {
>
> val uuid: UUID = UUID.randomUUID()
> var count = 0;
>
> //Javish loop
> while (count < numberOfRecords) {
>   val record = Record(
> uuid = uuid,
> bucket = 

Re: SELECT JSON timestamp lacks timezone information

2016-02-08 Thread Stefania Alborghetti
It's cqlsh that converts timestamps to UTC and adds the timezone, but for
JSON it can't do that because the conversion to JSON is done server-side by
Cassandra.

I've filed https://issues.apache.org/jira/browse/CASSANDRA-11137 to discuss
further.
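
Until that's resolved, the client-side post-processing Ralf mentions would
look roughly like the sketch below (assumptions: the timestamp shape is the
"yyyy-MM-dd HH:mm:ss.SSS" seen in the example above, and the server's JVM
default zone — Europe/Zurich here — is known, which in practice has to be
configured or guessed):

import java.time.{LocalDateTime, ZoneId, ZoneOffset}
import java.time.format.DateTimeFormatter

val serverZone = ZoneId.of("Europe/Zurich") // assumption, not discoverable from the JSON
val jsonFormat = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS")

def toUtcIso(jsonTimestamp: String): String =
  LocalDateTime.parse(jsonTimestamp, jsonFormat)
    .atZone(serverZone)
    .withZoneSameInstant(ZoneOffset.UTC)
    .format(DateTimeFormatter.ISO_OFFSET_DATE_TIME)

// toUtcIso("2016-01-04 16:05:47.123") == "2016-01-04T15:05:47.123Z"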

On Mon, Feb 8, 2016 at 7:53 PM, Alexandre Dutra <
alexandre.du...@datastax.com> wrote:

> Sorry,
>
> I mistakenly thought that we were on the Java driver mailing list, my
> apologies. I also think you should definitely file a Jira ticket and ask
> JSON timestamps generated server-side to be 1) formatted with a format that
> mentions the timezone and 2) formatted preferably with UTC, not the JVM
> default timezone.
>
> Alexandre
>
> On Mon, Feb 8, 2016 at 12:23 PM Ralf Steppacher 
> wrote:
>
>> Hi Alexandre.
>>
>> I wrote to ‘user@cassandra.apache.org’.
>>
>> Re the actual problem: I am aware of the fact that C* does not store
>> (need not store) the timezone as it is persisted as a Unix epoche
>> timestamp. Not delivering a timezone in the JSON text representation would
>> be OKish if the text representation would be guaranteed to be in UTC. But
>> it is not. It is in some timezone determined by the locale of the server
>> side or that of the client VM. That way it is a pain in two ways as
>>
>> a) I have to add the timezone in a post-processing step to all timestamps
>> in my JSON responses and
>> b) I also have to do some guesswork at what the actual timezone might be
>>
>> If there is no way to control the formatting of JSON timestamps and to
>> add the time zone information, then IMHO that is bug. Is it not? Or am I
>> missing something here?
>>
>>
>> Thanks!
>> Ralf
>>
>>
>> On 08.02.2016, at 12:06, Alexandre Dutra 
>> wrote:
>>
>> Hello Ralf,
>>
>> First of all, Cassandra stores timestamps without timezone information,
>> so it's not possible to retrieve the original timezone used when inserting
>> the value.
>>
>> CQLSH uses the python driver behind the scenes, and my guess is that the
>> timestamp formatting is being done driver-side – hence the timezone – while
>> when you call toJson(), the formatting has to be done server-side.
>>
>> That said, it does seem that Cassandra is using a format without timezone
>> when converting timestamps to JSON format:
>>
>> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/serializers/TimestampSerializer.java#L52
>>
>> I agree with you that a date format that would include the timezone would
>> be preferable here, but that is a question you should ask in the Cassandra
>> Users mailing list instead.
>>
>> Hope that helps,
>>
>> Alexandre
>>
>>
>>
>>
>> On Mon, Feb 8, 2016 at 11:09 AM Ralf Steppacher 
>> wrote:
>>
>>> Hello all,
>>>
>>> When I select a timestamp as JSON from Cassandra, the string
>>> representation lacks the timezone information, both via CQLSH and the Java
>>> Driver:
>>>
>>> cqlsh:events> select toJson(created_at) AS created_at from
>>> event_by_patient_timestamp ;
>>>
>>>  created_at
>>> ---
>>>  "2016-01-04 16:05:47.123"
>>>
>>> (1 rows)
>>>
>>> vs.
>>>
>>> cqlsh:events> select created_at FROM event_by_user_timestamp ;
>>>
>>>  created_at
>>> --
>>>  2016-01-04 15:05:47+
>>>
>>> (1 rows)
>>> cqlsh:events>
>>>
>>> To make things even more complicated the JSON timestamp is not returned
>>> in UTC. Is there a way to either tell the driver/C* to return the JSON date
>>> in UTC or add the timezone information (much preferred) to the text
>>> representation of the timestamp?
>>>
>>>
>>> Thanks!
>>> Ralf
>>
>> --
>> Alexandre Dutra
>> Driver & Tools Engineer @ DataStax
>>
>>
>> --
> Alexandre Dutra
> Driver & Tools Engineer @ DataStax
>



-- 



Stefania Alborghetti

Apache Cassandra Software Engineer

|+852 6114 9265| stefania.alborghe...@datastax.com


Re: Cassandra Collections performance issue

2016-02-08 Thread Robert Coli
On Mon, Feb 8, 2016 at 2:10 PM, Agrawal, Pratik  wrote:

> Recently we added one of the table fields as a Map in *Cassandra
> 2.1.11*. Currently we read every field from the Map and overwrite map values.
> 2.1.11*. Currently we read every field from Map and overwrite map values.
> Map is of size 3. We saw that writes are 30-40% slower while reads are
> 70-80% slower. Please find below some metrics that can help.
>
> My question is, Are there any known issues in Cassandra map performance?
> As I understand it each of the CQL3 Map entry, maps to a column in
> cassandra, with that assumption we are just creating 3 columns right? Any
> insight on this issue would be helpful.
>

I have previously heard reports along similar lines, but in the other
direction.

eg - "I moved from a collection to a TEXT column with JSON in it, and my
reads and writes both became much faster!"
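
As a concrete (hypothetical) illustration of the two models being compared —
keyspace and column names below are made up, not from the reporter's schema:

CREATE TABLE ks.with_map (
    id    uuid PRIMARY KEY,
    attrs map<text, text>   -- every map entry is its own internal cell
);

CREATE TABLE ks.with_json (
    id    uuid PRIMARY KEY,
    attrs text              -- the whole map serialized as one JSON string
);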

I'm not sure if the issue has been raised as an Apache Cassandra Jira, iow
if it is a known and expected limitation as opposed to just a performance
issue.

If I were you, I would consider filing a repro case as a Jira ticket, and
responding to this thread with its URL. :D

=Rob