[ https://issues.apache.org/jira/browse/CASSANDRA-9773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aleksey Yeschenko resolved CASSANDRA-9773.
------------------------------------------
    Resolution: Won't Fix
 Fix Version/s:     (was: 2.2.x)

> Hadoop Cassandra integration - cannot output to table with only primary key columns
> -----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-9773
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9773
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Cassandra 2.0.13, Hadoop 1.0.4
>            Reporter: fuggy_yama
>
> I have the following table in Cassandra:
> {code:sql}CREATE TABLE IF NOT EXISTS summary
> (
>     it int,
>     id int,
>     x float,
>     y float,
>     PRIMARY KEY (it, id, x, y)
> ) WITH compact storage{code}
> In the Hadoop job definition I set the output/update query:
> {code:java}String outputQuery = "UPDATE " + params.get("output_keyspace") +
>     "." + params.get("output_column_family") + " SET x=?, y=?";
> CqlConfigHelper.setOutputCql(job.getConfiguration(), outputQuery);{code}
> When the Hadoop job writes the reducer results to Cassandra, I get this exception:
> {code:java}java.io.IOException: java.lang.RuntimeException: failed to prepare cql query UPDATE kmeans_out_cs.summary SET x=?, y=? WHERE "it" = ? AND "id" = ? AND "x" = ? AND "y" = ?
>     at org.apache.cassandra.hadoop.cql3.CqlRecordWriter$RangeClient.run(CqlRecordWriter.java:256)
> Caused by: java.lang.RuntimeException: failed to prepare cql query UPDATE kmeans_out_cs.summary SET x=?, y=? WHERE "it" = ? AND "id" = ? AND "x" = ? AND "y" = ?
>     at org.apache.cassandra.hadoop.cql3.CqlRecordWriter$RangeClient.preparedStatement(CqlRecordWriter.java:300)
>     at org.apache.cassandra.hadoop.cql3.CqlRecordWriter$RangeClient.run(CqlRecordWriter.java:237)
> Caused by: InvalidRequestException(why:PRIMARY KEY part x found in SET part)
>     at org.apache.cassandra.thrift.Cassandra$prepare_cql3_query_result$prepare_cql3_query_resultStandardScheme.read(Cassandra.java:51017)
>     at org.apache.cassandra.thrift.Cassandra$prepare_cql3_query_result$prepare_cql3_query_resultStandardScheme.read(Cassandra.java:50994)
>     at org.apache.cassandra.thrift.Cassandra$prepare_cql3_query_result.read(Cassandra.java:50933)
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>     at org.apache.cassandra.thrift.Cassandra$Client.recv_prepare_cql3_query(Cassandra.java:1756)
>     at org.apache.cassandra.thrift.Cassandra$Client.prepare_cql3_query(Cassandra.java:1742)
>     at org.apache.cassandra.hadoop.cql3.CqlRecordWriter$RangeClient.preparedStatement(CqlRecordWriter.java:296)
>     ... 1 more{code}
> When we want to insert/update columns that are part of the PK definition, there is a conflict in the generated CQL query (the x and y columns appear in both the SET and WHERE clauses):
> *UPDATE kmeans_out_cs.summary SET x=?, y=? WHERE "it" = ? AND "id" = ? AND "x" = ? AND "y" = ?*
> *Can a Hadoop job write data to a Cassandra table that has only PRIMARY KEY columns?*
> *UPDATE1*
> I checked the source code and noticed that the output CQL query has to be an UPDATE statement (not an INSERT).
> UPDATE syntax requires a non-empty "SET a=b" clause, so there is no way to avoid duplicating column names in the final update query.
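
For illustration, the conflict described in the report can be reproduced without a cluster. The sketch below is not the CqlRecordWriter source. It assumes, based only on the query text in the stack trace, that the writer appends a WHERE clause covering every primary key column to the configured SET clause. The class and method names (PkOnlyUpdateSketch, appendKeyWhereClause, conflictingColumns) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is NOT the CqlRecordWriter implementation.
final class PkOnlyUpdateSketch {

    // Assumption: a `"col" = ?` condition is appended for every primary key
    // column, which matches the query text shown in the stack trace.
    static String appendKeyWhereClause(String userQuery, List<String> primaryKeyColumns) {
        StringBuilder sb = new StringBuilder(userQuery).append(" WHERE ");
        for (int i = 0; i < primaryKeyColumns.size(); i++) {
            if (i > 0) {
                sb.append(" AND ");
            }
            sb.append('"').append(primaryKeyColumns.get(i)).append("\" = ?");
        }
        return sb.toString();
    }

    // Returns the columns that are assigned in SET and are also primary key columns.
    static List<String> conflictingColumns(List<String> setColumns, List<String> primaryKeyColumns) {
        List<String> conflicts = new ArrayList<>();
        for (String column : setColumns) {
            if (primaryKeyColumns.contains(column)) {
                conflicts.add(column);
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        // Every column of the "summary" table is part of the primary key.
        List<String> primaryKey = Arrays.asList("it", "id", "x", "y");
        String query = appendKeyWhereClause("UPDATE kmeans_out_cs.summary SET x=?, y=?", primaryKey);
        System.out.println(query);
        System.out.println("Columns in both SET and WHERE: "
                + conflictingColumns(Arrays.asList("x", "y"), primaryKey));
    }
}
```

Because every column of the table is a primary key column, any non-empty SET clause names a column that also appears in the WHERE clause, which is the condition the InvalidRequestException in the report complains about.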