Re: Creating 'Put' requests
There's also Achilles: https://github.com/doanduyhai/Achilles

On Fri, Apr 24, 2015 at 1:21 PM Jens Rantil jens.ran...@tink.se wrote:

> Matthew,
>
> Maybe this could also be of interest: http://projects.spring.io/spring-data-cassandra/
>
> Cheers,
> Jens
Re: Creating 'Put' requests
To add to Phil's point, there's no circumstance in which I would use an unlogged batch. Under load, I have yet to hear of it doing anything other than increasing GC pauses.

On Fri, Apr 24, 2015 at 11:50 AM Phil Yang ud1...@gmail.com wrote:

> 2015-04-23 22:16 GMT+08:00 Matthew Johnson matt.john...@algomi.com:
>
>> In HBase, we do something like:
>>
>>     Put put = new Put(id);
>>     put.add(myPojo.getTimestamp(), myPojo.getValue());
>>     put.add(myPojo.getMySecondTimestamp(), myPojo.getSecondValue());
>>     server.put(put);
>>
>> Is there any similar mechanism in Cassandra Java driver for creating these inserts programmatically? Or, can the 'session.execute' take a list of commands so that each column can be inserted as its own insert statement but without the overhead of multiple calls to the server?
>
> For your first question, do you mean the object-mapping API? http://docs.datastax.com/en/developer/java-driver/2.1/java-driver/reference/crudOperations.html
>
> For the second question, C* can execute several commands in an unlogged batch; however, because of the distributed nature of Cassandra, there is a better solution, see https://medium.com/@foundev/cassandra-batch-loading-without-the-batch-keyword-40f00e35e23e
>
> --
> Thanks,
> Phil Yang
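One common alternative to an unlogged batch is to send each statement on its own, asynchronously, while capping how many requests are outstanding at once. The following is a minimal sketch of that pattern only, using nothing but the JDK. The `executeAsync` parameter is a hypothetical stand-in for the driver's asynchronous call (the DataStax driver's `Session.executeAsync` returns its own future type, not a `CompletableFuture`), and the class name `AsyncWrites` and the in-flight limit are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

public class AsyncWrites {

    /**
     * Sends every statement individually, keeping at most maxInFlight
     * requests outstanding, then waits for all of them to finish.
     * executeAsync is a hypothetical stand-in for a driver's async call.
     */
    public static <S> int writeAll(List<S> statements,
                                   Function<S, CompletableFuture<Void>> executeAsync,
                                   int maxInFlight) {
        Semaphore permits = new Semaphore(maxInFlight);
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (S statement : statements) {
            // Block here while too many requests are still outstanding.
            permits.acquireUninterruptibly();
            CompletableFuture<Void> result = executeAsync.apply(statement)
                    .whenComplete((ok, error) -> permits.release());
            pending.add(result);
        }
        // Wait for the remaining requests; join() rethrows a failure, if any.
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        return pending.size();
    }

    public static void main(String[] args) {
        List<String> statements = Arrays.asList("insert-1", "insert-2", "insert-3");
        // Fake backend that completes each write on another thread.
        int sent = writeAll(statements, s -> CompletableFuture.runAsync(() -> { }), 2);
        System.out.println("completed writes: " + sent);
    }
}
```

With a real driver, each statement can then be routed to a replica that owns its partition, instead of one coordinator handling the whole batch.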
RE: Creating 'Put' requests
The object-mapping API is very interesting, I’ll check that out, thanks.

I believe I have found what I was looking for in terms of programmatically inserting data using the following syntax:

    Insert builder = QueryBuilder.insertInto("simplex", "mytable1");
    builder = builder.value("id", "myid2");
    builder = builder.value("title", "mytitle2");
    session.execute(builder);

Many thanks for all the valuable help so far!

Cheers,
Matt
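To make explicit what a builder-style insert does conceptually, here is a minimal JDK-only sketch: it collects column/value pairs (much like an HBase `Put`) and renders a parameterized CQL string plus the matching bind values. This is not the driver's `QueryBuilder` implementation; the class `InsertSketch` and its methods are hypothetical names chosen for illustration. Unlike HBase, every column named here must already exist in the table's schema.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

public class InsertSketch {
    private final String keyspace;
    private final String table;
    // LinkedHashMap keeps columns in insertion order so names and values line up.
    private final Map<String, Object> columns = new LinkedHashMap<>();

    public InsertSketch(String keyspace, String table) {
        this.keyspace = keyspace;
        this.table = table;
    }

    /** Adds (or overwrites) one column/value pair; returns this for chaining. */
    public InsertSketch value(String column, Object value) {
        columns.put(column, value);
        return this;
    }

    /** Renders the INSERT with one bind marker per collected column. */
    public String cql() {
        StringJoiner names = new StringJoiner(", ");
        StringJoiner markers = new StringJoiner(", ");
        for (String name : columns.keySet()) {
            names.add(name);
            markers.add("?");
        }
        return "INSERT INTO " + keyspace + "." + table
                + " (" + names + ") VALUES (" + markers + ")";
    }

    /** Bind values in the same order as the column names in cql(). */
    public List<Object> values() {
        return new ArrayList<>(columns.values());
    }

    public static void main(String[] args) {
        InsertSketch insert = new InsertSketch("simplex", "mytable1").value("id", "myid2");
        String title = "mytitle2";
        if (title != null) {            // columns can be added conditionally
            insert.value("title", title);
        }
        System.out.println(insert.cql());
        System.out.println(insert.values());
    }
}
```

Keeping the values out of the query text means the same CQL string can be prepared once and reused for every row with that set of columns.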
RE: Creating 'Put' requests
Hi Jim,

I think I have found what I was looking for here: https://gist.github.com/yangzhe1991/10349122

I would end up with code that looks something like this:

    public void createSchema() {
        System.out.println("CREATING SCHEMA");
        Create createTable = SchemaBuilder.createTable("simplex", "mytable1");
        createTable = createTable.ifNotExists();
        createTable = createTable.addPartitionKey("id", DataType.text());
        createTable = createTable.addColumn("title", DataType.text());
        createTable = createTable.addColumn("author", DataType.text());
        session.execute(createTable);
        System.out.println("SCHEMA CREATED");
    }

    public void loadData() {
        System.out.println("LOADING DATA");
        Insert builder = QueryBuilder.insertInto("simplex", "mytable1");
        builder = builder.value("id", "myid2");
        builder = builder.value("title", "mytitle2");
        builder = builder.value("author", "myauthor2");
        // author2 is not defined in createSchema(), so Cassandra rejects this insert
        builder = builder.value("author2", "myauthor2_2");
        session.execute(builder);
        System.out.println("DATA LOADED");
    }

But do let me know if you know of any problems (performance or otherwise) with this approach. I am using a relatively new version of the DataStax connector (cassandra-driver-core-2.1.5) and none of these methods are deprecated, so I am assuming they are OK to use in conjunction with CQL3.

Unfortunately it seems that I was misinformed on the “dynamically creating timeseries columns” feature, and that this WAS deprecated in CQL3 – in order to dynamically create columns I would have to issue an ‘ALTER TABLE’ statement for every new column. I read one suggestion, which is to use collections instead: basically have a single pre-defined column which is a Map, say, and then add ‘timestamp : value’ into that map instead of a new column for every timestamp. Would you say this is an acceptable approach?

Many thanks,
Matt

PS apologies for the noobness!!
Re: Creating 'Put' requests
On Thu, Apr 23, 2015 at 8:50 AM, Matthew Johnson matt.john...@algomi.com wrote:

> Unfortunately it seems that I was misinformed on the “dynamically creating timeseries columns” feature, and that this WAS deprecated in CQL3 – in order to dynamically create columns I would have to issue an ‘ALTER TABLE’ statement for every new column. I read one suggestion, which is to use collections instead: basically have a single pre-defined column which is a Map, say, and then add ‘timestamp : value’ into that map instead of a new column for every timestamp. Would you say this is an acceptable approach?

Depending on the data model and the queries your application will use, you'll be using either clustering columns or collections (or a combination). If you need help modeling, you could start a new thread with the relevant details and I'm pretty sure you'll get some good suggestions here.

--
Bests,
Alex Popescu | @al3xandru
Sen. Product Manager @ DataStax
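As a rough illustration of the two options mentioned here, the sketch below holds CQL for a time-series table keyed by a clustering column, and for a map-column alternative, then turns HBase-style timestamp/value pairs into one set of bind values per CQL row. The table names `simplex.readings` and `simplex.readings_by_id`, the column names, and the class `TimeSeriesSketch` are all hypothetical; nothing here talks to a cluster, and the right model depends on the queries the application needs.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TimeSeriesSketch {
    // Option 1 (hypothetical table): ts is a clustering column, so each
    // timestamp becomes its own CQL row within the partition for id.
    public static final String CREATE_TABLE =
            "CREATE TABLE IF NOT EXISTS simplex.readings ("
          + "id text, ts timestamp, value text, PRIMARY KEY (id, ts))";

    public static final String INSERT_READING =
            "INSERT INTO simplex.readings (id, ts, value) VALUES (?, ?, ?)";

    // Option 2 (hypothetical table with a map<timestamp, text> column
    // named readings): each timestamp becomes one entry in the map.
    public static final String PUT_INTO_MAP =
            "UPDATE simplex.readings_by_id SET readings[?] = ? WHERE id = ?";

    /**
     * Converts "row id + timestamp/value pairs" into bind values for
     * INSERT_READING, one Object[] per CQL row, ordered by timestamp.
     */
    public static List<Object[]> bindValues(String id, Map<Long, String> valuesByTimestamp) {
        List<Object[]> rows = new ArrayList<>();
        for (Map.Entry<Long, String> e : new TreeMap<>(valuesByTimestamp).entrySet()) {
            rows.add(new Object[] { id, new Date(e.getKey()), e.getValue() });
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<Long, String> pojoValues = new HashMap<>();
        pojoValues.put(1429800000000L, "first value");
        pojoValues.put(1429800060000L, "second value");
        for (Object[] row : bindValues("row1", pojoValues)) {
            System.out.println(INSERT_READING + " <- " + row[0] + ", " + row[1] + ", " + row[2]);
        }
    }
}
```

With the clustering-column layout, adding a new timestamp is an ordinary insert and never needs an `ALTER TABLE`.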
RE: Creating 'Put' requests
Hi Jim,

This would still involve either having a fixed(ish) schema, with a handful of pre-written prepared statements that I fill the values into, or some rather horrific StringBuilder that generates the statement based on some logic. Prepared statements work great, for example, for inserting users where the columns are known, e.g. 'firstname, lastname, postcode', but what about when you want to add timeseries data with the timestamp as the column? I would have to do something like (ignore incorrect syntax for now):

    String myQuery = "INSERT INTO myKeyspace.myTable (id, " + myPojo.getTimestamp() + ", "
            + myPojo.getMySecondTimestamp() + ") VALUES (?, ?, ?);";
    session.execute(session.prepare(myQuery).bind("row1", myPojo.getValue(), myPojo.getSecondValue()));

Which is already a bit ugly, but when you start talking about a handful or a few dozen columns, it will become unmanageable.

In HBase, we do something like:

    Put put = new Put(id);
    put.add(myPojo.getTimestamp(), myPojo.getValue());
    put.add(myPojo.getMySecondTimestamp(), myPojo.getSecondValue());
    server.put(put);

Is there any similar mechanism in Cassandra Java driver for creating these inserts programmatically? Or, can the 'session.execute' take a list of commands so that each column can be inserted as its own insert statement but without the overhead of multiple calls to the server?

Thanks!
Matt
Re: Creating 'Put' requests
Are prepared statements what you're looking for?

http://docs.datastax.com/en/developer/java-driver/2.1/java-driver/quick_start/qsSimpleClientBoundStatements_t.html

Jim Witschey
Software Engineer in Test | jim.witsc...@datastax.com

On Thu, Apr 23, 2015 at 9:28 AM, Matthew Johnson matt.john...@algomi.com wrote:

> Hi all,
>
> Currently looking at switching from HBase to Cassandra, and one big difference so far is that in HBase, we create a ‘Put’ object, add to it a set of column/value pairs, and send the Put to the server. So far in Cassandra 2.1.4 the tutorials seem to suggest using CQL3, which I really like for prototyping, e.g.:
>
>     session.execute("INSERT INTO simplex.playlists (id, song_id, title, album, artist) VALUES (1, 1, 'La Petite Tonkinoise', 'Bye Bye Blackbird', 'Joséphine Baker');");
>
> But for more complicated code this will quickly become unmanageable, and doesn’t lend itself well to dynamically creating row data based on various conditions.
>
> Is there a way to send a Java object, populated with the desired column/value pairs, to the server instead of executing an insert statement? Would this require some other library, or does the DataStax Java driver support this already?
>
> Thanks in advance,
> Matt