[jira] [Commented] (CASSANDRA-4450) CQL3: Allow preparing the consistency level, timestamp and ttl

2013-05-27 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667515#comment-13667515
 ] 

Sylvain Lebresne commented on CASSANDRA-4450:
-

bq. I'd rather see surrogate ks/cf names here

I hesitated. But yeah, you're probably right that it's better. I've pushed an 
additional commit on the branch with that.

 CQL3: Allow preparing the consistency level, timestamp and ttl
 --

 Key: CASSANDRA-4450
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4450
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
  Labels: cql3
 Fix For: 2.0


 It could be useful to allow the preparation of the consistency level, the 
 timestamp and the ttl. I.e. to allow:
 {noformat}
 UPDATE foo SET .. USING CONSISTENCY ? AND TIMESTAMP ? AND TTL ? 
 {noformat}
 A slight concern is that when preparing a statement we return the names of 
 the prepared variables, but none of timestamp, ttl and consistency are 
 reserved names currently, so returning those as names could conflict with a 
 column name. We can either:
 * make these reserved identifiers (I have to add that I'm not a fan because at 
 least for timestamp, I think that's a potentially useful and common column 
 name).
 * use some specific special character to indicate those are not column names, 
 like returning [timestamp], [ttl], [consistency].
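 The bracketed-name option above can be sketched as follows. This is a
 hypothetical illustration in Python, not Cassandra's implementation; the
 function name and shapes are invented for the example. The point is that
 `[` is not legal in a CQL identifier, so a surrogate name like `[ttl]` can
 never collide with a real column name:
 {noformat}
 # Hypothetical sketch: return the bind-variable names of a prepared
 # statement, wrapping USING-option markers in brackets so they cannot
 # collide with column names ('[' is not valid in a CQL identifier).
 def bind_variable_names(column_markers, option_markers):
     """column_markers: columns bound with '?';
     option_markers: USING options bound with '?' (ttl, timestamp, ...)."""
     names = list(column_markers)
     # '[ttl]' and '[timestamp]' are distinct from any column name.
     names += ["[%s]" % opt for opt in option_markers]
     return names

 # A real column named 'timestamp' and the prepared TIMESTAMP option
 # remain distinguishable in the returned metadata:
 names = bind_variable_names(["foo", "timestamp"], ["timestamp", "ttl"])
 {noformat}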

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Git Push Summary

2013-05-27 Thread slebresne
Updated Tags:  refs/tags/1.1.12-tentative [deleted] 2dd73d171


Git Push Summary

2013-05-27 Thread slebresne
Updated Tags:  refs/tags/cassandra-1.1.12 [created] b35284a5f


[jira] [Commented] (CASSANDRA-4693) CQL Protocol should allow multiple PreparedStatements to be atomically executed

2013-05-27 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667621#comment-13667621
 ] 

Sylvain Lebresne commented on CASSANDRA-4693:
-

Committed with the point above fixed. Thanks!

 CQL Protocol should allow multiple PreparedStatements to be atomically 
 executed
 ---

 Key: CASSANDRA-4693
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4693
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Michaël Figuière
Assignee: Sylvain Lebresne
  Labels: cql, protocol
 Fix For: 2.0

 Attachments: 
 0001-Binary-protocol-adds-message-to-batch-prepared-or-not-.txt


 Currently the only way to insert multiple records on the same partition key, 
 atomically and using PreparedStatements, is to use a CQL BATCH command. 
 Unfortunately, when doing so, the number of records to be inserted must be 
 known before preparing the statement, which is rarely the case. Thus the only 
 workaround, if one wants to keep atomicity, is currently to use unprepared 
 statements, which sends a bulk of CQL strings and is fairly inefficient.
 Therefore the CQL protocol should allow clients to send multiple 
 PreparedStatements to be executed with guarantees and semantics similar to 
 the CQL BATCH command.
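 The idea can be sketched as follows. This is an illustrative Python model
 only, not the actual binary protocol: a BATCH-style message carries
 previously prepared statement ids plus bound values per entry, so the
 number of records need not be known at prepare time. All names here are
 invented for the example.
 {noformat}
 prepared = {}  # statement id -> CQL string, a stand-in for server state

 def prepare(cql):
     # Prepare once; the returned id is reused for any number of rows.
     stmt_id = len(prepared) + 1
     prepared[stmt_id] = cql
     return stmt_id

 def execute_batch(entries):
     """entries: list of (stmt_id, bound_values). In the real protocol
     these would execute atomically, like a CQL BATCH."""
     return [(prepared[sid], values) for sid, values in entries]

 ins = prepare("INSERT INTO foo (k, v) VALUES (?, ?)")
 # One prepared statement, bound three times in a single batch:
 batch = execute_batch([(ins, (1, "a")), (ins, (2, "b")), (ins, (3, "c"))])
 {noformat}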

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


svn commit: r1486531 - in /cassandra/site: publish/download/index.html src/settings.py

2013-05-27 Thread slebresne
Author: slebresne
Date: Mon May 27 07:55:50 2013
New Revision: 1486531

URL: http://svn.apache.org/r1486531
Log:
Update website for 1.1.12 release

Modified:
cassandra/site/publish/download/index.html
cassandra/site/src/settings.py

Modified: cassandra/site/publish/download/index.html
URL: 
http://svn.apache.org/viewvc/cassandra/site/publish/download/index.html?rev=1486531&r1=1486530&r2=1486531&view=diff
==
--- cassandra/site/publish/download/index.html (original)
+++ cassandra/site/publish/download/index.html Mon May 27 07:55:50 2013
@@ -111,16 +111,16 @@
   <p>
   Previous stable branches of Cassandra continue to see periodic maintenance
   for some time after a new major release is made. The lastest release on the
-  1.1 branch is 1.1.11 (released on
-  2013-04-19).
+  1.1 branch is 1.1.12 (released on
+  2013-05-27).
   </p>
 
   <ul>
 <li>
-<a class="filename" href="http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.1.11/apache-cassandra-1.1.11-bin.tar.gz">apache-cassandra-1.1.11-bin.tar.gz</a>
-[<a href="http://www.apache.org/dist/cassandra/1.1.11/apache-cassandra-1.1.11-bin.tar.gz.asc">PGP</a>]
-[<a href="http://www.apache.org/dist/cassandra/1.1.11/apache-cassandra-1.1.11-bin.tar.gz.md5">MD5</a>]
-[<a href="http://www.apache.org/dist/cassandra/1.1.11/apache-cassandra-1.1.11-bin.tar.gz.sha1">SHA1</a>]
+<a class="filename" href="http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.1.12/apache-cassandra-1.1.12-bin.tar.gz">apache-cassandra-1.1.12-bin.tar.gz</a>
+[<a href="http://www.apache.org/dist/cassandra/1.1.12/apache-cassandra-1.1.12-bin.tar.gz.asc">PGP</a>]
+[<a href="http://www.apache.org/dist/cassandra/1.1.12/apache-cassandra-1.1.12-bin.tar.gz.md5">MD5</a>]
+[<a href="http://www.apache.org/dist/cassandra/1.1.12/apache-cassandra-1.1.12-bin.tar.gz.sha1">SHA1</a>]
 </li>
   </ul>
   
@@ -163,10 +163,10 @@
 </li>
   
 <li>
-<a class="filename" href="http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.1.11/apache-cassandra-1.1.11-src.tar.gz">apache-cassandra-1.1.11-src.tar.gz</a>
-[<a href="http://www.apache.org/dist/cassandra/1.1.11/apache-cassandra-1.1.11-src.tar.gz.asc">PGP</a>]
-[<a href="http://www.apache.org/dist/cassandra/1.1.11/apache-cassandra-1.1.11-src.tar.gz.md5">MD5</a>]
-[<a href="http://www.apache.org/dist/cassandra/1.1.11/apache-cassandra-1.1.11-src.tar.gz.sha1">SHA1</a>]
+<a class="filename" href="http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.1.12/apache-cassandra-1.1.12-src.tar.gz">apache-cassandra-1.1.12-src.tar.gz</a>
+[<a href="http://www.apache.org/dist/cassandra/1.1.12/apache-cassandra-1.1.12-src.tar.gz.asc">PGP</a>]
+[<a href="http://www.apache.org/dist/cassandra/1.1.12/apache-cassandra-1.1.12-src.tar.gz.md5">MD5</a>]
+[<a href="http://www.apache.org/dist/cassandra/1.1.12/apache-cassandra-1.1.12-src.tar.gz.sha1">SHA1</a>]
 </li>
   
   

Modified: cassandra/site/src/settings.py
URL: 
http://svn.apache.org/viewvc/cassandra/site/src/settings.py?rev=1486531&r1=1486530&r2=1486531&view=diff
==
--- cassandra/site/src/settings.py (original)
+++ cassandra/site/src/settings.py Mon May 27 07:55:50 2013
@@ -92,8 +92,8 @@ SITE_POST_PROCESSORS = {
 }
 
 class CassandraDef(object):
-    oldstable_version = '1.1.11'
-    oldstable_release_date = '2013-04-19'
+    oldstable_version = '1.1.12'
+    oldstable_release_date = '2013-05-27'
     oldstable_exists = True
     veryoldstable_version = '1.0.12'
     veryoldstable_release_date = '2012-10-04'




git commit: Fix changelog

2013-05-27 Thread slebresne
Updated Branches:
  refs/heads/trunk 6d04ef038 - 4e2d76b8c


Fix changelog


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4e2d76b8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4e2d76b8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4e2d76b8

Branch: refs/heads/trunk
Commit: 4e2d76b8c3040ba437da9efee443158e1688f295
Parents: 6d04ef0
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Mon May 27 09:55:17 2013 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Mon May 27 09:55:17 2013 +0200

--
 CHANGES.txt |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e2d76b8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index e6079a6..47f6f0a 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -52,6 +52,7 @@
  * cqlsh: drop CQL2/CQL3-beta support (CASSANDRA-5585)
  * Track max/min column names in sstables to be able to optimize slice
queries (CASSANDRA-5514)
+ * Binary protocol: allow batching already prepared statements (CASSANDRA-4693)
 
 1.2.6
  * Fix dealing with ridiculously large max sstable sizes in LCS 
(CASSANDRA-5589)



[jira] [Commented] (CASSANDRA-5544) Hadoop jobs assigns only one mapper in task

2013-05-27 Thread Cyril Scetbon (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667633#comment-13667633
 ] 

Cyril Scetbon commented on CASSANDRA-5544:
--

So something goes wrong with 1.2.x version

 Hadoop jobs assigns only one mapper in task 
 

 Key: CASSANDRA-5544
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5544
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.1
 Environment: Red hat linux 5.4, Hadoop 1.0.3, pig 0.11.1
Reporter: Shamim Ahmed
Assignee: Brandon Williams
 Attachments: Screen Shot 2013-05-26 at 4.49.48 PM.png


 We have got very strange behaviour of the hadoop cluster after upgrading 
 Cassandra from 1.1.5 to Cassandra 1.2.1. We have a 5-node cluster of 
 Cassandra, where three of them are hadoop slaves. Now when we submit a 
 job through a Pig script, only one map task is assigned, running on one of 
 the hadoop slaves regardless of 
 the volume of data (already tried with more than a million rows).
 Pig is configured as follows:
 export PIG_HOME=/oracle/pig-0.10.0
 export PIG_CONF_DIR=${HADOOP_HOME}/conf
 export PIG_INITIAL_ADDRESS=192.168.157.103
 export PIG_RPC_PORT=9160
 export PIG_PARTITIONER=org.apache.cassandra.dht.Murmur3Partitioner
 Also we have the following properties in hadoop:
  <property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>10</value>
  </property>
  <property>
  <name>mapred.map.tasks</name>
  <value>4</value>
  </property>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (CASSANDRA-4905) Repair should exclude gcable tombstones from merkle-tree computation

2013-05-27 Thread Michael Theroux (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667363#comment-13667363
 ] 

Michael Theroux edited comment on CASSANDRA-4905 at 5/27/13 12:55 PM:
--

The long compactions before the fix are, I think, a byproduct of leveled 
compaction.  I've seen a number of people mention this on the users list.  
Basically, leveled compaction in 1.1 is a single-threaded process, and 
increasing the compaction throughput doesn't help its rate.  Leveled compaction 
is very slow to compact.

Leveled compaction should be better than Size Tiered, unless you are doing 
something like major compactions (we are on some tables).

CASSANDRA-5398 looks interesting.  We rolled this fix + 1.1.11 into production 
this weekend.  The last repair was a thing of beauty... finished in under 3 
hours, very little streaming and compaction... as it should be if you 
don't have any, or very few, inconsistencies in your data. Given it's running so 
well, I'll leave well enough alone and not apply 5398.

We are using RF 3 and the repair was using -pr.

 Repair should exclude gcable tombstones from merkle-tree computation
 

 Key: CASSANDRA-4905
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4905
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Christian Spriegel
Assignee: Sylvain Lebresne
 Fix For: 1.2.0 beta 3

 Attachments: 4905.txt


 Currently gcable tombstones get repaired if some replicas compacted already, 
 but some are not compacted.
 This could be avoided by ignoring all gcable tombstones during merkle tree 
 calculation.
 This was discussed with Sylvain on the mailing list:
 http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/repair-compaction-and-tombstone-rows-td7583481.html
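 The mechanism can be sketched as follows. This is a hedged Python model of
 the idea, not Cassandra's code; names and cell shapes are invented for the
 example. Skipping tombstones older than gc_grace when hashing rows means a
 tombstone already compacted away on one replica, but not yet on another,
 no longer shows up as a difference in the merkle trees:
 {noformat}
 # Sketch: filter the cells fed into the merkle-tree digest, excluding
 # gcable tombstones (deleted longer ago than gc_grace_seconds).
 def digest_input(cells, now, gc_grace_seconds):
     out = []
     for name, value, deletion_time in cells:
         is_tombstone = value is None
         if is_tombstone and now - deletion_time > gc_grace_seconds:
             continue  # gcable tombstone: ignore during tree computation
         out.append((name, value, deletion_time))
     return out

 cells = [("a", 1, None),      # live cell
          ("b", None, 100),    # old tombstone, past gc_grace
          ("c", None, 990)]    # recent tombstone, still relevant
 kept = digest_input(cells, now=1000, gc_grace_seconds=600)
 {noformat}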

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-4905) Repair should exclude gcable tombstones from merkle-tree computation

2013-05-27 Thread Michael Theroux (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667738#comment-13667738
 ] 

Michael Theroux commented on CASSANDRA-4905:


One additional thought.  Is it possible that LeveledCompaction could make this 
issue worse because it is more efficient at deleting tombstones?  It is more 
efficient, but it's not 100%, so is the chance that a tombstone was deleted on 
one node, and not deleted on the other two nodes (in the case of RF 3), 
actually greater than with SizeTiered?  I guess it depends on a lot... just a 
thought.

 Repair should exclude gcable tombstones from merkle-tree computation
 

 Key: CASSANDRA-4905
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4905
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Christian Spriegel
Assignee: Sylvain Lebresne
 Fix For: 1.2.0 beta 3

 Attachments: 4905.txt


 Currently gcable tombstones get repaired if some replicas compacted already, 
 but some are not compacted.
 This could be avoided by ignoring all gcable tombstones during merkle tree 
 calculation.
 This was discussed with Sylvain on the mailing list:
 http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/repair-compaction-and-tombstone-rows-td7583481.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5587) BulkLoader fails with NoSuchElementException

2013-05-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5587:
--

Reviewer: dbrosius

 BulkLoader fails with NoSuchElementException
 

 Key: CASSANDRA-5587
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5587
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 1.2.4, 1.2.5
Reporter: Julien Aymé
  Labels: patch
 Attachments: cassandra-1.2-5587.txt

   Original Estimate: 4h
  Remaining Estimate: 4h

 When using the BulkLoader tool (sstableloader command) to transfer data from 
 one cluster to another, 
 a java.util.NoSuchElementException is thrown whenever the directory contains 
 a snapshot subdirectory,
 and the bulk load fails.
 The fix should be quite simple:
 catch any NoSuchElementException thrown in {{SSTableLoader#openSSTables()}}.
 The directory structure:
 {noformat}
 user@cassandrasrv01:~$ ls /var/lib/cassandra/data/Keyspace1/CF1/
 Keyspace1-CF1-ib-1872-CompressionInfo.db
 Keyspace1-CF1-ib-1872-Data.db
 Keyspace1-CF1-ib-1872-Filter.db
 Keyspace1-CF1-ib-1872-Index.db
 Keyspace1-CF1-ib-1872-Statistics.db
 Keyspace1-CF1-ib-1872-Summary.db
 Keyspace1-CF1-ib-1872-TOC.txt
 Keyspace1-CF1-ib-2166-CompressionInfo.db
 Keyspace1-CF1-ib-2166-Data.db
 Keyspace1-CF1-ib-2166-Filter.db
 Keyspace1-CF1-ib-2166-Index.db
 Keyspace1-CF1-ib-2166-Statistics.db
 Keyspace1-CF1-ib-2166-Summary.db
 Keyspace1-CF1-ib-2166-TOC.txt
 Keyspace1-CF1-ib-5-CompressionInfo.db
 Keyspace1-CF1-ib-5-Data.db
 Keyspace1-CF1-ib-5-Filter.db
 Keyspace1-CF1-ib-5-Index.db
 Keyspace1-CF1-ib-5-Statistics.db
 Keyspace1-CF1-ib-5-Summary.db
 Keyspace1-CF1-ib-5-TOC.txt
 ...
 snapshots
 {noformat}
 The stacktrace: 
 {noformat}
 user@cassandrasrv01:~$ ./cassandra/bin/sstableloader -v --debug -d 
 cassandrabck01 /var/lib/cassandra/data/Keyspace1/CF1/
 null
 java.util.NoSuchElementException
 at java.util.StringTokenizer.nextToken(StringTokenizer.java:349)
 at 
 org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:265)
 at 
 org.apache.cassandra.io.sstable.Component.fromFilename(Component.java:122)
 at 
 org.apache.cassandra.io.sstable.SSTable.tryComponentFromFilename(SSTable.java:194)
 at 
 org.apache.cassandra.io.sstable.SSTableLoader$1.accept(SSTableLoader.java:71)
 at java.io.File.list(File.java:1087)
 at 
 org.apache.cassandra.io.sstable.SSTableLoader.openSSTables(SSTableLoader.java:67)
 at 
 org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:119)
 at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:67)
 {noformat}
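 The failure mode can be sketched as follows. This is an illustrative Python
 model only, not the actual patch: directory entries (such as the
 {{snapshots}} subdirectory) whose names do not parse as sstable components
 should simply be ignored, rather than letting the tokenizer run out of
 tokens and abort the whole load. The parsing rule below is a simplified
 stand-in for {{Descriptor.fromFilename}}:
 {noformat}
 # Sketch: skip listing entries that are not sstable component files.
 def try_component_from_filename(name):
     # Real sstable files look like Keyspace-CF-version-generation-Component;
     # anything with fewer parts (e.g. 'snapshots') is not a component.
     parts = name.split("-")
     if len(parts) < 5:
         return None
     return parts[-1]

 listing = ["Keyspace1-CF1-ib-5-Data.db", "snapshots"]
 components = [c for c in (try_component_from_filename(f) for f in listing)
               if c is not None]
 {noformat}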

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5587) BulkLoader fails with NoSuchElementException

2013-05-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5587:
--

Assignee: Julien Aymé

 BulkLoader fails with NoSuchElementException
 

 Key: CASSANDRA-5587
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5587
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 1.2.4, 1.2.5
Reporter: Julien Aymé
Assignee: Julien Aymé
  Labels: patch
 Attachments: cassandra-1.2-5587.txt

   Original Estimate: 4h
  Remaining Estimate: 4h

 When using the BulkLoader tool (sstableloader command) to transfer data from 
 one cluster to another, 
 a java.util.NoSuchElementException is thrown whenever the directory contains 
 a snapshot subdirectory,
 and the bulk load fails.
 The fix should be quite simple:
 catch any NoSuchElementException thrown in {{SSTableLoader#openSSTables()}}.
 The directory structure:
 {noformat}
 user@cassandrasrv01:~$ ls /var/lib/cassandra/data/Keyspace1/CF1/
 Keyspace1-CF1-ib-1872-CompressionInfo.db
 Keyspace1-CF1-ib-1872-Data.db
 Keyspace1-CF1-ib-1872-Filter.db
 Keyspace1-CF1-ib-1872-Index.db
 Keyspace1-CF1-ib-1872-Statistics.db
 Keyspace1-CF1-ib-1872-Summary.db
 Keyspace1-CF1-ib-1872-TOC.txt
 Keyspace1-CF1-ib-2166-CompressionInfo.db
 Keyspace1-CF1-ib-2166-Data.db
 Keyspace1-CF1-ib-2166-Filter.db
 Keyspace1-CF1-ib-2166-Index.db
 Keyspace1-CF1-ib-2166-Statistics.db
 Keyspace1-CF1-ib-2166-Summary.db
 Keyspace1-CF1-ib-2166-TOC.txt
 Keyspace1-CF1-ib-5-CompressionInfo.db
 Keyspace1-CF1-ib-5-Data.db
 Keyspace1-CF1-ib-5-Filter.db
 Keyspace1-CF1-ib-5-Index.db
 Keyspace1-CF1-ib-5-Statistics.db
 Keyspace1-CF1-ib-5-Summary.db
 Keyspace1-CF1-ib-5-TOC.txt
 ...
 snapshots
 {noformat}
 The stacktrace: 
 {noformat}
 user@cassandrasrv01:~$ ./cassandra/bin/sstableloader -v --debug -d 
 cassandrabck01 /var/lib/cassandra/data/Keyspace1/CF1/
 null
 java.util.NoSuchElementException
 at java.util.StringTokenizer.nextToken(StringTokenizer.java:349)
 at 
 org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:265)
 at 
 org.apache.cassandra.io.sstable.Component.fromFilename(Component.java:122)
 at 
 org.apache.cassandra.io.sstable.SSTable.tryComponentFromFilename(SSTable.java:194)
 at 
 org.apache.cassandra.io.sstable.SSTableLoader$1.accept(SSTableLoader.java:71)
 at java.io.File.list(File.java:1087)
 at 
 org.apache.cassandra.io.sstable.SSTableLoader.openSSTables(SSTableLoader.java:67)
 at 
 org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:119)
 at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:67)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-4450) CQL3: Allow preparing the consistency level, timestamp and ttl

2013-05-27 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667771#comment-13667771
 ] 

Aleksey Yeschenko commented on CASSANDRA-4450:
--

+1

(there are some leftovers in BatchStatement.Prepared.prepare():
{noformat}
// We use the first statement keyspace and cf. Which may not be 
correct, but batch on the same CF is probably is most common case
// anyway and I suppose it's ok if the info is not perfect in some 
cases.
assert !parsedStatements.isEmpty();
ModificationStatement.Parsed first = parsedStatements.get(0);
{noformat}
)

 CQL3: Allow preparing the consistency level, timestamp and ttl
 --

 Key: CASSANDRA-4450
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4450
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
  Labels: cql3
 Fix For: 2.0


 It could be useful to allow the preparation of the consistency level, the 
 timestamp and the ttl. I.e. to allow:
 {noformat}
 UPDATE foo SET .. USING CONSISTENCY ? AND TIMESTAMP ? AND TTL ? 
 {noformat}
 A slight concern is that when preparing a statement we return the names of 
 the prepared variables, but none of timestamp, ttl and consistency are 
 reserved names currently, so returning those as names could conflict with a 
 column name. We can either:
 * make these reserved identifiers (I have to add that I'm not a fan because at 
 least for timestamp, I think that's a potentially useful and common column 
 name).
 * use some specific special character to indicate those are not column names, 
 like returning [timestamp], [ttl], [consistency].

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5455) Remove PBSPredictor

2013-05-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667804#comment-13667804
 ] 

Jonathan Ellis commented on CASSANDRA-5455:
---

bq. I've already written an example external module to do RTT/2 predictions

Do we need any core changes at all, then?  (Under the #3 for now plan.)

 Remove PBSPredictor
 ---

 Key: CASSANDRA-5455
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5455
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 2.0

 Attachments: 5455.txt


 It was a fun experiment, but it's unmaintained and the bar to understanding 
 what is going on is high.  Case in point: PBSTest has been failing 
 intermittently for some time now, possibly even since it was created.  Or 
 possibly not and it was a regression from a refactoring we did.  Who knows?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5388) Unit tests fail due to ant/junit problem

2013-05-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667809#comment-13667809
 ] 

Jonathan Ellis commented on CASSANDRA-5388:
---

Ant 1.9.1 was released last week.

 Unit tests fail due to ant/junit problem
 

 Key: CASSANDRA-5388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5388
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 2.0
 Environment: Windows 7 or Linux
 java 1.7.0_17
 ant 1.9.0
Reporter: Ryan McGuire

 Intermittently, but more often than not I get the following error when 
 running 'ant test' on Windows 7 (also encountered on Linux now):
 {code}
 BUILD FAILED
 c:\Users\Ryan\git\cassandra3\build.xml:1121: The following error occurred 
 while executing this line:
 c:\Users\Ryan\git\cassandra3\build.xml:1064: Using loader 
 AntClassLoader[C:\Program 
 Files\Java\apache-ant-1.9.0\lib\ant-launcher.jar;c:\Program 
 Files\Java\apache-ant-1.9.0\lib\ant.jar;c:\Program 
 Files\Java\apache-ant-1.9.0\lib\ant-junit.jar;c:\Program 
 

[jira] [Commented] (CASSANDRA-5543) Ant issues when building gen-cql2-grammar

2013-05-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667811#comment-13667811
 ] 

Jonathan Ellis commented on CASSANDRA-5543:
---

Ping [~brandon.williams]

 Ant issues when building gen-cql2-grammar
 -

 Key: CASSANDRA-5543
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5543
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.2.3
Reporter: Joaquin Casares
Priority: Trivial

 Below are the commands and outputs that were returned.
 The first `ant` command fails on gen-cql2-grammar, but if I don't run `ant 
 realclean` then it works fine after a second pass.
 {CODE}
 ubuntu@ip-10-196-153-29:~/.ccm/repository/1.2.3$ ant realclean
 Buildfile: /home/ubuntu/.ccm/repository/1.2.3/build.xml
 clean:
[delete] Deleting directory /home/ubuntu/.ccm/repository/1.2.3/build/test
[delete] Deleting directory 
 /home/ubuntu/.ccm/repository/1.2.3/build/classes
[delete] Deleting directory /home/ubuntu/.ccm/repository/1.2.3/src/gen-java
[delete] Deleting: /home/ubuntu/.ccm/repository/1.2.3/build/internode.avpr
 realclean:
[delete] Deleting directory /home/ubuntu/.ccm/repository/1.2.3/build
 BUILD SUCCESSFUL
 Total time: 0 seconds
 {CODE}
 {CODE}
 ubuntu@ip-10-196-153-29:~/.ccm/repository/1.2.3$ ant
 Buildfile: /home/ubuntu/.ccm/repository/1.2.3/build.xml
 maven-ant-tasks-localrepo:
 maven-ant-tasks-download:
  [echo] Downloading Maven ANT Tasks...
 [mkdir] Created dir: /home/ubuntu/.ccm/repository/1.2.3/build
   [get] Getting: 
 http://repo2.maven.org/maven2/org/apache/maven/maven-ant-tasks/2.1.3/maven-ant-tasks-2.1.3.jar
   [get] To: 
 /home/ubuntu/.ccm/repository/1.2.3/build/maven-ant-tasks-2.1.3.jar
 maven-ant-tasks-init:
 [mkdir] Created dir: /home/ubuntu/.ccm/repository/1.2.3/build/lib
 maven-declare-dependencies:
 maven-ant-tasks-retrieve-build:
 [artifact:dependencies] Downloading: asm/asm/3.2/asm-3.2-sources.jar from 
 repository central at http://repo1.maven.org/maven2
 
 [artifact:dependencies] [INFO] Unable to find resource 
 'hsqldb:hsqldb:java-source:sources:1.8.0.10' in repository java.net2 
 (http://download.java.net/maven/2)
 [artifact:dependencies] Building ant file: 
 /home/ubuntu/.ccm/repository/1.2.3/build/build-dependencies.xml
  [copy] Copying 45 files to 
 /home/ubuntu/.ccm/repository/1.2.3/build/lib/jars
  [copy] Copying 35 files to 
 /home/ubuntu/.ccm/repository/1.2.3/build/lib/sources
 init:
 [mkdir] Created dir: /home/ubuntu/.ccm/repository/1.2.3/build/classes/main
 [mkdir] Created dir: 
 /home/ubuntu/.ccm/repository/1.2.3/build/classes/thrift
 [mkdir] Created dir: /home/ubuntu/.ccm/repository/1.2.3/build/test/lib
 [mkdir] Created dir: /home/ubuntu/.ccm/repository/1.2.3/build/test/classes
 [mkdir] Created dir: /home/ubuntu/.ccm/repository/1.2.3/src/gen-java
 check-avro-generate:
 avro-interface-generate-internode:
  [echo] Generating Avro internode code...
 avro-generate:
 build-subprojects:
 check-gen-cli-grammar:
 gen-cli-grammar:
  [echo] Building Grammar 
 /home/ubuntu/.ccm/repository/1.2.3/src/java/org/apache/cassandra/cli/Cli.g  
 
 check-gen-cql2-grammar:
 gen-cql2-grammar:
  [echo] Building Grammar 
 /home/ubuntu/.ccm/repository/1.2.3/src/java/org/apache/cassandra/cql/Cql.g  
 ...
  [java] warning(200): 
 /home/ubuntu/.ccm/repository/1.2.3/src/java/org/apache/cassandra/cql/Cql.g:479:20:
  Decision can match input such as IDENT using multiple alternatives: 1, 2
  [java] As a result, alternative(s) 2 were disabled for that input
  [java] warning(200): 
 /home/ubuntu/.ccm/repository/1.2.3/src/java/org/apache/cassandra/cql/Cql.g:479:20:
  Decision can match input such as K_KEY using multiple alternatives: 1, 2
  [java] As a result, alternative(s) 2 were disabled for that input
  [java] warning(200): 
 /home/ubuntu/.ccm/repository/1.2.3/src/java/org/apache/cassandra/cql/Cql.g:479:20:
  Decision can match input such as QMARK using multiple alternatives: 1, 2
  [java] As a result, alternative(s) 2 were disabled for that input
  [java] warning(200): 
 /home/ubuntu/.ccm/repository/1.2.3/src/java/org/apache/cassandra/cql/Cql.g:479:20:
  Decision can match input such as FLOAT using multiple alternatives: 1, 2
  [java] As a result, alternative(s) 2 were disabled for that input
  [java] warning(200): 
 /home/ubuntu/.ccm/repository/1.2.3/src/java/org/apache/cassandra/cql/Cql.g:479:20:
  Decision can match input such as STRING_LITERAL using multiple 
 alternatives: 1, 2
  [java] As a result, alternative(s) 2 were disabled for that input
  [java] warning(200): 
 /home/ubuntu/.ccm/repository/1.2.3/src/java/org/apache/cassandra/cql/Cql.g:479:20:
  Decision can match input such as INTEGER using 

[jira] [Updated] (CASSANDRA-5536) ColumnFamilyInputFormat demands OrderPreservingPartitioner when specifying InputRange with tokens

2013-05-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5536:
--

Reviewer: alexliu68
Assignee: Jonathan Ellis

 ColumnFamilyInputFormat demands OrderPreservingPartitioner when specifying 
 InputRange with tokens
 -

 Key: CASSANDRA-5536
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5536
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.3
Reporter: Lanny Ripple
Assignee: Jonathan Ellis
 Attachments: 5536-v2.txt, cassandra-1.2.3-5536.txt


 When ColumnFamilyInputFormat starts getting splits (via getSplits(...) 
 [ColumnFamilyInputFormat.java:101]) it checks to see if a `jobKeyRange` has 
 been set.  If it has been set it attempts to set the `jobRange`.  However the 
 if block (ColumnFamilyInputFormat.java:124) looks to see if the `jobKeyRange` 
 has tokens but asserts that the OrderPreservingPartitioner must be in use.
 This if block should be looking for keys (not tokens).  Code further down 
 (ColumnFamilyInputFormat.java:147) already manages the range if tokens are 
 used but can never be reached.
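 The logic problem being reported can be sketched as follows. This is a
 hedged Python illustration of the condition, not the actual patch; the
 function and parameter names are invented. The partitioner assertion should
 apply only when the configured job range is expressed as keys, since
 token-based ranges work with any partitioner and are handled by the later
 code path:
 {noformat}
 # Sketch: validate a configured job range before computing splits.
 def validate_job_range(range_uses_keys, partitioner):
     # Only key-based ranges require key order to match token order.
     if range_uses_keys and partitioner != "OrderPreservingPartitioner":
         raise ValueError(
             "key-based input ranges need an order-preserving partitioner")
     return True

 # A token-based range should be accepted under Murmur3Partitioner:
 ok = validate_job_range(range_uses_keys=False,
                         partitioner="Murmur3Partitioner")
 {noformat}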

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5536) ColumnFamilyInputFormat demands OrderPreservingPartitioner when specifying InputRange with tokens

2013-05-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667816#comment-13667816
 ] 

Jonathan Ellis commented on CASSANDRA-5536:
---

WDYT [~alexliu68]?

 ColumnFamilyInputFormat demands OrderPreservingPartitioner when specifying 
 InputRange with tokens
 -

 Key: CASSANDRA-5536
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5536
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.3
Reporter: Lanny Ripple
Assignee: Jonathan Ellis
 Attachments: 5536-v2.txt, cassandra-1.2.3-5536.txt


 When ColumnFamilyInputFormat starts getting splits (via getSplits(...) 
 [ColumnFamilyInputFormat.java:101]) it checks whether a `jobKeyRange` has 
 been set.  If it has, it attempts to set the `jobRange`.  However, the 
 if block (ColumnFamilyInputFormat.java:124) checks whether the `jobKeyRange` 
 has tokens but asserts that the OrderPreservingPartitioner must be in use.
 This if block should be looking for keys (not tokens).  Code further down 
 (ColumnFamilyInputFormat.java:147) already handles the range when tokens are 
 used, but it can never be reached.



[jira] [Commented] (CASSANDRA-5239) Fully Async Server Transport (StorageProxy Layer)

2013-05-27 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667818#comment-13667818
 ] 

Sylvain Lebresne commented on CASSANDRA-5239:
-

bq. Inclined to push this to 2.1, we're getting close to freeze for 2.0.

Probably wise indeed. I, at least, won't have much time to benchmark/profile 
this until the 2.0 freeze.

 Fully Async Server Transport (StorageProxy Layer)
 -

 Key: CASSANDRA-5239
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5239
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 2.0
Reporter: Vijay
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 2.0


 Problem Statement: 
 Currently we have rpc_min_threads/rpc_max_threads and 
 native_transport_min_threads/native_transport_max_threads; all of the 
 threads in the TPE are blocking and take resources, and the threads are 
 mostly sleeping, which increases context-switch costs.
 Details: 
 We should change the StorageProxy methods to take a callback that carries 
 the location where the results have to be written. When the response arrives, 
 the StorageProxy callback can write the results directly into the connection. 
 Timeouts can be handled in the same way.
 Fixing Netty should be trivial with some refactoring of the storage proxy 
 (currently it is one method call that sends the request and waits); we need a 
 callback.
 Fixing Thrift may be harder because Thrift calls the method and expects a 
 return value. We might need to write a custom codec on Netty for Thrift 
 support, which can potentially do callbacks (a custom codec may be similar to 
 http://engineering.twitter.com/2011/04/twitter-search-is-now-3x-faster_1656.html
  but we don't know the details). Another option is to update Thrift to 
 have a callback.
 FYI, the motivation for this ticket is another project I am 
 working on with a similar proxy (blocking Netty transport); making it async 
 gave us a 2x throughput improvement.
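The callback shape proposed above can be sketched as below. This is a hedged illustration only: it uses `CompletableFuture` as the callback carrier, and the names `readAsync`/`readBlocking` are invented for the sketch, not the actual StorageProxy API.

```java
import java.util.concurrent.CompletableFuture;

public class AsyncProxySketch
{
    // Callback style: the proxy returns immediately; whoever holds the future
    // (e.g. the transport) registers what to do when the response arrives.
    static CompletableFuture<String> readAsync(String key)
    {
        CompletableFuture<String> response = new CompletableFuture<>();
        // In the real system a replica-response handler would complete this
        // future later; here we complete it inline for illustration.
        response.complete("value-for-" + key);
        return response;
    }

    // Blocking style (today's shape): one method call that sends the request
    // and waits for the result, pinning a thread for the whole round trip.
    static String readBlocking(String key)
    {
        return readAsync(key).join();
    }

    public static void main(String[] args)
    {
        // The calling thread is free as soon as readAsync returns:
        readAsync("k1").thenAccept(v -> System.out.println("write " + v + " to connection"));
        System.out.println(readBlocking("k2"));
    }
}
```

The throughput win comes from the first shape: no thread sleeps while waiting for replicas, so fewer threads (and fewer context switches) serve the same number of in-flight requests.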



[jira] [Commented] (CASSANDRA-5239) Fully Async Server Transport (StorageProxy Layer)

2013-05-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667813#comment-13667813
 ] 

Jonathan Ellis commented on CASSANDRA-5239:
---

Inclined to push this to 2.1, we're getting close to freeze for 2.0.

 Fully Async Server Transport (StorageProxy Layer)
 -

 Key: CASSANDRA-5239
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5239
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 2.0
Reporter: Vijay
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 2.0


 Problem Statement: 
 Currently we have rpc_min_threads/rpc_max_threads and 
 native_transport_min_threads/native_transport_max_threads; all of the 
 threads in the TPE are blocking and take resources, and the threads are 
 mostly sleeping, which increases context-switch costs.
 Details: 
 We should change the StorageProxy methods to take a callback that carries 
 the location where the results have to be written. When the response arrives, 
 the StorageProxy callback can write the results directly into the connection. 
 Timeouts can be handled in the same way.
 Fixing Netty should be trivial with some refactoring of the storage proxy 
 (currently it is one method call that sends the request and waits); we need a 
 callback.
 Fixing Thrift may be harder because Thrift calls the method and expects a 
 return value. We might need to write a custom codec on Netty for Thrift 
 support, which can potentially do callbacks (a custom codec may be similar to 
 http://engineering.twitter.com/2011/04/twitter-search-is-now-3x-faster_1656.html
  but we don't know the details). Another option is to update Thrift to 
 have a callback.
 FYI, the motivation for this ticket is another project I am 
 working on with a similar proxy (blocking Netty transport); making it async 
 gave us a 2x throughput improvement.



[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replicas.

2013-05-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667820#comment-13667820
 ] 

Jonathan Ellis commented on CASSANDRA-2388:
---

bq. The feasible approach hopefully is still T Jake Luciani's above

Okay.  Referring back to Jake's comments,

bq. The biggest problem is [avoiding endpoints in a different DC]. Maybe the 
way todo this is change getSplits logic to never return replicas in another DC. 
I think this would require adding DC info to the describe_ring call

I note that we expose node snitch location in system.peers.  So at worst we 
could join against that manually.

 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replicas.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.6
Reporter: Eldon Stegall
Assignee: Mck SembWever
Priority: Minor
  Labels: hadoop, inputformat
 Fix For: 1.2.6

 Attachments: 0002_On_TException_try_next_split.patch, 
 CASSANDRA-2388-addition1.patch, CASSANDRA-2388-extended.patch, 
 CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, 
 CASSANDRA-2388.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.



[jira] [Updated] (CASSANDRA-5239) Fully Async Server Transport (StorageProxy Layer)

2013-05-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5239:
--

Fix Version/s: (was: 2.0)
   2.1

 Fully Async Server Transport (StorageProxy Layer)
 -

 Key: CASSANDRA-5239
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5239
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 2.0
Reporter: Vijay
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 2.1


 Problem Statement: 
 Currently we have rpc_min_threads/rpc_max_threads and 
 native_transport_min_threads/native_transport_max_threads; all of the 
 threads in the TPE are blocking and take resources, and the threads are 
 mostly sleeping, which increases context-switch costs.
 Details: 
 We should change the StorageProxy methods to take a callback that carries 
 the location where the results have to be written. When the response arrives, 
 the StorageProxy callback can write the results directly into the connection. 
 Timeouts can be handled in the same way.
 Fixing Netty should be trivial with some refactoring of the storage proxy 
 (currently it is one method call that sends the request and waits); we need a 
 callback.
 Fixing Thrift may be harder because Thrift calls the method and expects a 
 return value. We might need to write a custom codec on Netty for Thrift 
 support, which can potentially do callbacks (a custom codec may be similar to 
 http://engineering.twitter.com/2011/04/twitter-search-is-now-3x-faster_1656.html
  but we don't know the details). Another option is to update Thrift to 
 have a callback.
 FYI, the motivation for this ticket is another project I am 
 working on with a similar proxy (blocking Netty transport); making it async 
 gave us a 2x throughput improvement.



[jira] [Commented] (CASSANDRA-5019) Still too much object allocation on reads

2013-05-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667825#comment-13667825
 ] 

Jonathan Ellis commented on CASSANDRA-5019:
---

What makes this a bitch is that we ask our Iterators for Column objects, which 
we then add to the ColumnFamily.  So we can't just drop in a FlyweightColumns 
CF subclass; the damage is already done by then.

We'd have to push the CF into OnDiskAtomIterator and have it call add(name, 
value, timestamp) to avoid the Column construction overhead.
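The two shapes being compared — handing a fully constructed Column to the container versus pushing the raw fields into it — can be sketched with tiny stand-in classes (these are illustrative, not the real Cassandra Column/ColumnFamily):

```java
import java.util.ArrayList;
import java.util.List;

public class FlyweightAddSketch
{
    static class Column
    {
        final String name, value;
        final long timestamp;
        Column(String name, String value, long timestamp)
        { this.name = name; this.value = value; this.timestamp = timestamp; }
    }

    static class ColumnFamily
    {
        final List<Column> columns = new ArrayList<>();

        // Today's shape: the iterator allocates a Column, then hands it over.
        // By the time the CF sees it, the allocation has already happened.
        void add(Column c) { columns.add(c); }

        // Proposed shape: the iterator pushes fields and the container decides
        // how to store them (here it still allocates, but a flyweight CF could
        // store the fields in flat/primitive form with no per-cell object).
        void add(String name, String value, long timestamp)
        { columns.add(new Column(name, value, timestamp)); }
    }

    public static void main(String[] args)
    {
        ColumnFamily cf = new ColumnFamily();
        cf.add("c1", "v1", 1L);                  // fields pushed, container allocates (or not)
        cf.add(new Column("c2", "v2", 2L));      // object created upstream
        System.out.println(cf.columns.size());   // 2
    }
}
```

This is why the comment says the CF must be pushed into OnDiskAtomIterator: only the field-based `add` gives the container a chance to avoid the per-cell allocation.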

 Still too much object allocation on reads
 -

 Key: CASSANDRA-5019
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5019
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
 Fix For: 2.1


 ArrayBackedSortedColumns was a step in the right direction but it's still 
 relatively heavyweight thanks to allocating individual Columns.



[jira] [Commented] (CASSANDRA-5536) ColumnFamilyInputFormat demands OrderPreservingPartitioner when specifying InputRange with tokens

2013-05-27 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667826#comment-13667826
 ] 

Alex Liu commented on CASSANDRA-5536:
-

It looks good.

 ColumnFamilyInputFormat demands OrderPreservingPartitioner when specifying 
 InputRange with tokens
 -

 Key: CASSANDRA-5536
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5536
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.3
Reporter: Lanny Ripple
Assignee: Jonathan Ellis
 Attachments: 5536-v2.txt, cassandra-1.2.3-5536.txt


 When ColumnFamilyInputFormat starts getting splits (via getSplits(...) 
 [ColumnFamilyInputFormat.java:101]) it checks whether a `jobKeyRange` has 
 been set.  If it has, it attempts to set the `jobRange`.  However, the 
 if block (ColumnFamilyInputFormat.java:124) checks whether the `jobKeyRange` 
 has tokens but asserts that the OrderPreservingPartitioner must be in use.
 This if block should be looking for keys (not tokens).  Code further down 
 (ColumnFamilyInputFormat.java:147) already handles the range when tokens are 
 used, but it can never be reached.



[2/3] git commit: Fix InputKeyRange in CFIF patch by Lanny Ripple and jbellis; reviewed by Alex Liu for CASSANDRA-5536

2013-05-27 Thread jbellis
Fix InputKeyRange in CFIF
patch by Lanny Ripple and jbellis; reviewed by Alex Liu for CASSANDRA-5536


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/aaf18bd0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/aaf18bd0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/aaf18bd0

Branch: refs/heads/trunk
Commit: aaf18bd08af50bbaae0954d78d5e6cbb684aded9
Parents: e771b07
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon May 27 11:27:52 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon May 27 11:27:52 2013 -0500

--
 CHANGES.txt|1 +
 .../cassandra/hadoop/ColumnFamilyInputFormat.java  |   24 ++
 2 files changed, 18 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/aaf18bd0/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index f49a6f7..34e5b52 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 1.2.6
+ * (Hadoop) Fix InputKeyRange in CFIF (CASSANDRA-5536)
  * Fix dealing with ridiculously large max sstable sizes in LCS 
(CASSANDRA-5589)
  * Ignore pre-truncate hints (CASSANDRA-4655)
  * Move System.exit on OOM into a separate thread (CASSANDRA-5273)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/aaf18bd0/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
--
diff --git a/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java 
b/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
index 057d46a..e95e7ad 100644
--- a/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
+++ b/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
@@ -121,14 +121,24 @@ public class ColumnFamilyInputFormat extends InputFormat<ByteBuffer, SortedMap<B
         List<Future<List<InputSplit>>> splitfutures = new ArrayList<Future<List<InputSplit>>>();
         KeyRange jobKeyRange = ConfigHelper.getInputKeyRange(conf);
         Range<Token> jobRange = null;
-        if (jobKeyRange != null && jobKeyRange.start_token != null)
+        if (jobKeyRange != null)
         {
-            assert partitioner.preservesOrder() : "ConfigHelper.setInputKeyRange(..) can only be used with a order preserving paritioner";
-            assert jobKeyRange.start_key == null : "only start_token supported";
-            assert jobKeyRange.end_key == null : "only end_token supported";
-            jobRange = new Range<Token>(partitioner.getTokenFactory().fromString(jobKeyRange.start_token),
-                                        partitioner.getTokenFactory().fromString(jobKeyRange.end_token),
-                                        partitioner);
+            if (jobKeyRange.start_key == null)
+            {
+                logger.warn("ignoring jobKeyRange specified without start_key");
+            }
+            else
+            {
+                if (!partitioner.preservesOrder())
+                    throw new UnsupportedOperationException("KeyRange based on keys can only be used with a order preserving paritioner");
+                if (jobKeyRange.start_token != null)
+                    throw new IllegalArgumentException("only start_key supported");
+                if (jobKeyRange.end_token != null)
+                    throw new IllegalArgumentException("only start_key supported");
+                jobRange = new Range<Token>(partitioner.getToken(jobKeyRange.start_key),
+                                            partitioner.getToken(jobKeyRange.end_key),
+                                            partitioner);
+            }
         }
 
         for (TokenRange range : masterRangeNodes)



[3/3] git commit: Merge branch 'cassandra-1.2' into trunk

2013-05-27 Thread jbellis
Merge branch 'cassandra-1.2' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0387cf58
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0387cf58
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0387cf58

Branch: refs/heads/trunk
Commit: 0387cf587965e2c5f04218c7ec1555fd136ba393
Parents: 4e2d76b aaf18bd
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon May 27 11:27:59 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon May 27 11:27:59 2013 -0500

--
 CHANGES.txt|1 +
 .../cassandra/hadoop/ColumnFamilyInputFormat.java  |   24 ++
 2 files changed, 18 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/0387cf58/CHANGES.txt
--
diff --cc CHANGES.txt
index 47f6f0a,34e5b52..0ce9e63
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,60 -1,5 +1,61 @@@
 +2.0
 + * Removed on-heap row cache (CASSANDRA-5348)
 + * use nanotime consistently for node-local timeouts (CASSANDRA-5581)
 + * Avoid unnecessary second pass on name-based queries (CASSANDRA-5577)
 + * Experimental triggers (CASSANDRA-1311)
 + * JEMalloc support for off-heap allocation (CASSANDRA-3997)
 + * Single-pass compaction (CASSANDRA-4180)
 + * Removed token range bisection (CASSANDRA-5518)
 + * Removed compatibility with pre-1.2.5 sstables and network messages
 +   (CASSANDRA-5511)
 + * removed PBSPredictor (CASSANDRA-5455)
 + * CAS support (CASSANDRA-5062, 5441, 5443)
 + * Leveled compaction performs size-tiered compactions in L0 
 +   (CASSANDRA-5371, 5439)
 + * Add yaml network topology snitch for mixed ec2/other envs (CASSANDRA-5339)
 + * Log when a node is down longer than the hint window (CASSANDRA-4554)
 + * Optimize tombstone creation for ExpiringColumns (CASSANDRA-4917)
 + * Improve LeveledScanner work estimation (CASSANDRA-5250, 5407)
 + * Replace compaction lock with runWithCompactionsDisabled (CASSANDRA-3430)
 + * Change Message IDs to ints (CASSANDRA-5307)
 + * Move sstable level information into the Stats component, removing the
 +   need for a separate Manifest file (CASSANDRA-4872)
 + * avoid serializing to byte[] on commitlog append (CASSANDRA-5199)
 + * make index_interval configurable per columnfamily (CASSANDRA-3961)
 + * add default_time_to_live (CASSANDRA-3974)
 + * add memtable_flush_period_in_ms (CASSANDRA-4237)
 + * replace supercolumns internally by composites (CASSANDRA-3237, 5123)
 + * upgrade thrift to 0.9.0 (CASSANDRA-3719)
 + * drop unnecessary keyspace parameter from user-defined compaction API 
 +   (CASSANDRA-5139)
 + * more robust solution to incomplete compactions + counters (CASSANDRA-5151)
 + * Change order of directory searching for c*.in.sh (CASSANDRA-3983)
 + * Add tool to reset SSTable compaction level for LCS (CASSANDRA-5271)
 + * Allow custom configuration loader (CASSANDRA-5045)
 + * Remove memory emergency pressure valve logic (CASSANDRA-3534)
 + * Reduce request latency with eager retry (CASSANDRA-4705)
 + * cqlsh: Remove ASSUME command (CASSANDRA-5331)
 + * Rebuild BF when loading sstables if bloom_filter_fp_chance
 +   has changed since compaction (CASSANDRA-5015)
 + * remove row-level bloom filters (CASSANDRA-4885)
 + * Change Kernel Page Cache skipping into row preheating (disabled by default)
 +   (CASSANDRA-4937)
 + * Improve repair by deciding on a gcBefore before sending
 +   out TreeRequests (CASSANDRA-4932)
 + * Add an official way to disable compactions (CASSANDRA-5074)
 + * Reenable ALTER TABLE DROP with new semantics (CASSANDRA-3919)
 + * Add binary protocol versioning (CASSANDRA-5436)
 + * Swap THshaServer for TThreadedSelectorServer (CASSANDRA-5530)
 + * Add alias support to SELECT statement (CASSANDRA-5075)
 + * Don't create empty RowMutations in CommitLogReplayer (CASSANDRA-5541)
 + * Use range tombstones when dropping cfs/columns from schema (CASSANDRA-5579)
 + * cqlsh: drop CQL2/CQL3-beta support (CASSANDRA-5585)
 + * Track max/min column names in sstables to be able to optimize slice
 +   queries (CASSANDRA-5514)
 + * Binary protocol: allow batching already prepared statements 
(CASSANDRA-4693)
 +
  1.2.6
+  * (Hadoop) Fix InputKeyRange in CFIF (CASSANDRA-5536)
   * Fix dealing with ridiculously large max sstable sizes in LCS 
(CASSANDRA-5589)
   * Ignore pre-truncate hints (CASSANDRA-4655)
   * Move System.exit on OOM into a separate thread (CASSANDRA-5273)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0387cf58/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
--



[1/3] git commit: Fix InputKeyRange in CFIF patch by Lanny Ripple and jbellis; reviewed by Alex Liu for CASSANDRA-5536

2013-05-27 Thread jbellis
Updated Branches:
  refs/heads/cassandra-1.2 e771b0795 -> aaf18bd08
  refs/heads/trunk 4e2d76b8c -> 0387cf587


Fix InputKeyRange in CFIF
patch by Lanny Ripple and jbellis; reviewed by Alex Liu for CASSANDRA-5536


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/aaf18bd0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/aaf18bd0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/aaf18bd0

Branch: refs/heads/cassandra-1.2
Commit: aaf18bd08af50bbaae0954d78d5e6cbb684aded9
Parents: e771b07
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon May 27 11:27:52 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon May 27 11:27:52 2013 -0500

--
 CHANGES.txt|1 +
 .../cassandra/hadoop/ColumnFamilyInputFormat.java  |   24 ++
 2 files changed, 18 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/aaf18bd0/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index f49a6f7..34e5b52 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 1.2.6
+ * (Hadoop) Fix InputKeyRange in CFIF (CASSANDRA-5536)
  * Fix dealing with ridiculously large max sstable sizes in LCS 
(CASSANDRA-5589)
  * Ignore pre-truncate hints (CASSANDRA-4655)
  * Move System.exit on OOM into a separate thread (CASSANDRA-5273)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/aaf18bd0/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
--
diff --git a/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java 
b/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
index 057d46a..e95e7ad 100644
--- a/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
+++ b/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
@@ -121,14 +121,24 @@ public class ColumnFamilyInputFormat extends InputFormat<ByteBuffer, SortedMap<B
         List<Future<List<InputSplit>>> splitfutures = new ArrayList<Future<List<InputSplit>>>();
         KeyRange jobKeyRange = ConfigHelper.getInputKeyRange(conf);
         Range<Token> jobRange = null;
-        if (jobKeyRange != null && jobKeyRange.start_token != null)
+        if (jobKeyRange != null)
         {
-            assert partitioner.preservesOrder() : "ConfigHelper.setInputKeyRange(..) can only be used with a order preserving paritioner";
-            assert jobKeyRange.start_key == null : "only start_token supported";
-            assert jobKeyRange.end_key == null : "only end_token supported";
-            jobRange = new Range<Token>(partitioner.getTokenFactory().fromString(jobKeyRange.start_token),
-                                        partitioner.getTokenFactory().fromString(jobKeyRange.end_token),
-                                        partitioner);
+            if (jobKeyRange.start_key == null)
+            {
+                logger.warn("ignoring jobKeyRange specified without start_key");
+            }
+            else
+            {
+                if (!partitioner.preservesOrder())
+                    throw new UnsupportedOperationException("KeyRange based on keys can only be used with a order preserving paritioner");
+                if (jobKeyRange.start_token != null)
+                    throw new IllegalArgumentException("only start_key supported");
+                if (jobKeyRange.end_token != null)
+                    throw new IllegalArgumentException("only start_key supported");
+                jobRange = new Range<Token>(partitioner.getToken(jobKeyRange.start_key),
+                                            partitioner.getToken(jobKeyRange.end_key),
+                                            partitioner);
+            }
         }
 
         for (TokenRange range : masterRangeNodes)



[jira] [Updated] (CASSANDRA-5536) ColumnFamilyInputFormat demands OrderPreservingPartitioner when specifying InputRange with tokens

2013-05-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5536:
--

Labels: hadoop  (was: )

 ColumnFamilyInputFormat demands OrderPreservingPartitioner when specifying 
 InputRange with tokens
 -

 Key: CASSANDRA-5536
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5536
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.3
Reporter: Lanny Ripple
Assignee: Jonathan Ellis
  Labels: hadoop
 Fix For: 1.2.6

 Attachments: 5536-v2.txt, cassandra-1.2.3-5536.txt


 When ColumnFamilyInputFormat starts getting splits (via getSplits(...) 
 [ColumnFamilyInputFormat.java:101]) it checks whether a `jobKeyRange` has 
 been set.  If it has, it attempts to set the `jobRange`.  However, the 
 if block (ColumnFamilyInputFormat.java:124) checks whether the `jobKeyRange` 
 has tokens but asserts that the OrderPreservingPartitioner must be in use.
 This if block should be looking for keys (not tokens).  Code further down 
 (ColumnFamilyInputFormat.java:147) already handles the range when tokens are 
 used, but it can never be reached.



[jira] [Resolved] (CASSANDRA-5388) Unit tests fail due to ant/junit problem

2013-05-27 Thread Ryan McGuire (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McGuire resolved CASSANDRA-5388.
-

Resolution: Fixed

Tested with ant 1.9.1 - issue resolved!

 Unit tests fail due to ant/junit problem
 

 Key: CASSANDRA-5388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5388
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 2.0
 Environment: Windows 7 or Linux
 java 1.7.0_17
 ant 1.9.0
Reporter: Ryan McGuire

 Intermittently, but more often than not, I get the following error when 
 running 'ant test' on Windows 7 (it has now also been encountered on Linux):
 {code}
 BUILD FAILED
 c:\Users\Ryan\git\cassandra3\build.xml:1121: The following error occurred 
 while executing this line:
 c:\Users\Ryan\git\cassandra3\build.xml:1064: Using loader 
 AntClassLoader[C:\Program 
 Files\Java\apache-ant-1.9.0\lib\ant-launcher.jar;c:\Program 
 Files\Java\apache-ant-1.9.0\lib\ant.jar;c:\Program 
 Files\Java\apache-ant-1.9.0\lib\ant-junit.jar;c:\Program 
 

[jira] [Assigned] (CASSANDRA-5543) Ant issues when building gen-cql2-grammar

2013-05-27 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reassigned CASSANDRA-5543:
---

Assignee: Dave Brosius

Can you take a look, Dave?

 Ant issues when building gen-cql2-grammar
 -

 Key: CASSANDRA-5543
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5543
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.2.3
Reporter: Joaquin Casares
Assignee: Dave Brosius
Priority: Trivial

 Below are the commands and outputs that were returned.
 The first `ant` command fails on gen-cql2-grammar, but if I don't run `ant 
 realclean` then it works fine after a second pass.
 {CODE}
 ubuntu@ip-10-196-153-29:~/.ccm/repository/1.2.3$ ant realclean
 Buildfile: /home/ubuntu/.ccm/repository/1.2.3/build.xml
 clean:
[delete] Deleting directory /home/ubuntu/.ccm/repository/1.2.3/build/test
[delete] Deleting directory 
 /home/ubuntu/.ccm/repository/1.2.3/build/classes
[delete] Deleting directory /home/ubuntu/.ccm/repository/1.2.3/src/gen-java
[delete] Deleting: /home/ubuntu/.ccm/repository/1.2.3/build/internode.avpr
 realclean:
[delete] Deleting directory /home/ubuntu/.ccm/repository/1.2.3/build
 BUILD SUCCESSFUL
 Total time: 0 seconds
 {CODE}
 {CODE}
 ubuntu@ip-10-196-153-29:~/.ccm/repository/1.2.3$ ant
 Buildfile: /home/ubuntu/.ccm/repository/1.2.3/build.xml
 maven-ant-tasks-localrepo:
 maven-ant-tasks-download:
  [echo] Downloading Maven ANT Tasks...
 [mkdir] Created dir: /home/ubuntu/.ccm/repository/1.2.3/build
   [get] Getting: 
 http://repo2.maven.org/maven2/org/apache/maven/maven-ant-tasks/2.1.3/maven-ant-tasks-2.1.3.jar
   [get] To: 
 /home/ubuntu/.ccm/repository/1.2.3/build/maven-ant-tasks-2.1.3.jar
 maven-ant-tasks-init:
 [mkdir] Created dir: /home/ubuntu/.ccm/repository/1.2.3/build/lib
 maven-declare-dependencies:
 maven-ant-tasks-retrieve-build:
 [artifact:dependencies] Downloading: asm/asm/3.2/asm-3.2-sources.jar from 
 repository central at http://repo1.maven.org/maven2
 
 [artifact:dependencies] [INFO] Unable to find resource 
 'hsqldb:hsqldb:java-source:sources:1.8.0.10' in repository java.net2 
 (http://download.java.net/maven/2)
 [artifact:dependencies] Building ant file: 
 /home/ubuntu/.ccm/repository/1.2.3/build/build-dependencies.xml
  [copy] Copying 45 files to 
 /home/ubuntu/.ccm/repository/1.2.3/build/lib/jars
  [copy] Copying 35 files to 
 /home/ubuntu/.ccm/repository/1.2.3/build/lib/sources
 init:
 [mkdir] Created dir: /home/ubuntu/.ccm/repository/1.2.3/build/classes/main
 [mkdir] Created dir: 
 /home/ubuntu/.ccm/repository/1.2.3/build/classes/thrift
 [mkdir] Created dir: /home/ubuntu/.ccm/repository/1.2.3/build/test/lib
 [mkdir] Created dir: /home/ubuntu/.ccm/repository/1.2.3/build/test/classes
 [mkdir] Created dir: /home/ubuntu/.ccm/repository/1.2.3/src/gen-java
 check-avro-generate:
 avro-interface-generate-internode:
  [echo] Generating Avro internode code...
 avro-generate:
 build-subprojects:
 check-gen-cli-grammar:
 gen-cli-grammar:
  [echo] Building Grammar 
 /home/ubuntu/.ccm/repository/1.2.3/src/java/org/apache/cassandra/cli/Cli.g  
 
 check-gen-cql2-grammar:
 gen-cql2-grammar:
  [echo] Building Grammar 
 /home/ubuntu/.ccm/repository/1.2.3/src/java/org/apache/cassandra/cql/Cql.g  
 ...
  [java] warning(200): 
 /home/ubuntu/.ccm/repository/1.2.3/src/java/org/apache/cassandra/cql/Cql.g:479:20:
  Decision can match input such as IDENT using multiple alternatives: 1, 2
  [java] As a result, alternative(s) 2 were disabled for that input
  [java] warning(200): 
 /home/ubuntu/.ccm/repository/1.2.3/src/java/org/apache/cassandra/cql/Cql.g:479:20:
  Decision can match input such as K_KEY using multiple alternatives: 1, 2
  [java] As a result, alternative(s) 2 were disabled for that input
  [java] warning(200): 
 /home/ubuntu/.ccm/repository/1.2.3/src/java/org/apache/cassandra/cql/Cql.g:479:20:
  Decision can match input such as QMARK using multiple alternatives: 1, 2
  [java] As a result, alternative(s) 2 were disabled for that input
  [java] warning(200): 
 /home/ubuntu/.ccm/repository/1.2.3/src/java/org/apache/cassandra/cql/Cql.g:479:20:
  Decision can match input such as FLOAT using multiple alternatives: 1, 2
  [java] As a result, alternative(s) 2 were disabled for that input
  [java] warning(200): 
 /home/ubuntu/.ccm/repository/1.2.3/src/java/org/apache/cassandra/cql/Cql.g:479:20:
  Decision can match input such as STRING_LITERAL using multiple 
 alternatives: 1, 2
  [java] As a result, alternative(s) 2 were disabled for that input
  [java] warning(200): 
 /home/ubuntu/.ccm/repository/1.2.3/src/java/org/apache/cassandra/cql/Cql.g:479:20:
  Decision can match input such as 

[jira] [Assigned] (CASSANDRA-5544) Hadoop jobs assign only one mapper per task

2013-05-27 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reassigned CASSANDRA-5544:
---

Assignee: Alex Liu  (was: Brandon Williams)

Can you take a look, Alex?  Nothing changed in pig as far as I know.

 Hadoop jobs assign only one mapper per task 
 

 Key: CASSANDRA-5544
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5544
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.1
 Environment: Red hat linux 5.4, Hadoop 1.0.3, pig 0.11.1
Reporter: Shamim Ahmed
Assignee: Alex Liu
 Attachments: Screen Shot 2013-05-26 at 4.49.48 PM.png


 We have got very strange behaviour of the hadoop cluster after upgrading 
 Cassandra from 1.1.5 to Cassandra 1.2.1. We have a 5-node Cassandra cluster, 
 where three of the nodes are hadoop slaves. Now when we submit a 
 job through a Pig script, only one map task is assigned, running on one of the 
 hadoop slaves regardless of 
 the volume of data (already tried with more than a million rows).
 Pig is configured as follows:
 export PIG_HOME=/oracle/pig-0.10.0
 export PIG_CONF_DIR=${HADOOP_HOME}/conf
 export PIG_INITIAL_ADDRESS=192.168.157.103
 export PIG_RPC_PORT=9160
 export PIG_PARTITIONER=org.apache.cassandra.dht.Murmur3Partitioner
 Also we have the following properties in hadoop:
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>10</value>
  </property>
  <property>
    <name>mapred.map.tasks</name>
    <value>4</value>
  </property>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2013-05-27 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667902#comment-13667902
 ] 

Mck SembWever commented on CASSANDRA-2388:
--

{quote}The biggest problem is [avoiding endpoints in a different DC]. Maybe the 
way todo this is change getSplits logic to never return replicas in another DC. 
I think this would require adding DC info to the describe_ring call{quote}

Tasktrackers may have access to a set of datacenters, so this DC info needs to 
contain a list of DCs.

For example, our setup separates datacenters by physical datacenter and 
hadoop-usage, like:{noformat}DC1 Production + Hadoop
  c*01 c*03
DC2 Production + Hadoop
  c*02 c*04
DC3 Production
  c*05
DC4 Production
  c*06{noformat}

So here we'd pass to getSplits() a DC info like DC1,DC2.
But the problem remains: given a task executing on c*01 that fails to connect 
to localhost, although we can now prevent a connection to DC3 or DC4, we can't 
favour a connection to any other split in DC1 over anything in DC2. Is this 
solvable? 
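To make the idea concrete, here is a minimal, self-contained Java sketch (hypothetical names; plain collections standing in for the real describe_ring/TokenRange metadata) of DC-aware endpoint filtering that also prefers the task's local DC:

```java
import java.util.*;

public class DcFilter {
    /** Keep only replica endpoints whose datacenter is in the allowed set,
     *  ordering the local DC's endpoints first so they are preferred.
     *  (Hypothetical sketch -- not actual ColumnFamilyInputFormat code.) */
    static List<String> filterByDc(List<String> endpoints,
                                   Map<String, String> endpointDc,
                                   Set<String> allowedDcs,
                                   String localDc) {
        List<String> result = new ArrayList<>();
        for (String e : endpoints)
            if (localDc.equals(endpointDc.get(e)))
                result.add(e);                      // local DC first
        for (String e : endpoints)
            if (!localDc.equals(endpointDc.get(e))
                    && allowedDcs.contains(endpointDc.get(e)))
                result.add(e);                      // other allowed DCs next
        return result;                              // e.g. DC3/DC4 replicas dropped
    }

    public static void main(String[] args) {
        Map<String, String> dcOf = new HashMap<>();
        dcOf.put("c*01", "DC1"); dcOf.put("c*02", "DC2");
        dcOf.put("c*03", "DC1"); dcOf.put("c*05", "DC3");
        List<String> filtered = filterByDc(
                Arrays.asList("c*05", "c*02", "c*03"),
                dcOf, new HashSet<>(Arrays.asList("DC1", "DC2")), "DC1");
        System.out.println(filtered); // [c*03, c*02] -- c*05 (DC3) is dropped
    }
}
```

This only expresses the filtering and ordering; it does not solve the second part of the question (favouring other DC1 replicas over DC2 ones requires the DC of each candidate endpoint, which is exactly the extra info describe_ring would have to expose).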

 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replica.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.6
Reporter: Eldon Stegall
Assignee: Mck SembWever
Priority: Minor
  Labels: hadoop, inputformat
 Fix For: 1.2.6

 Attachments: 0002_On_TException_try_next_split.patch, 
 CASSANDRA-2388-addition1.patch, CASSANDRA-2388-extended.patch, 
 CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, 
 CASSANDRA-2388.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.



[jira] [Updated] (CASSANDRA-5442) Add support for specifying CAS commit CL

2013-05-27 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-5442:
-

Attachment: 5442-rebased.txt

 Add support for specifying CAS commit CL
 

 Key: CASSANDRA-5442
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5442
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 2.0

 Attachments: 5442-rebased.txt






[jira] [Commented] (CASSANDRA-5442) Add support for specifying CAS commit CL

2013-05-27 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667910#comment-13667910
 ] 

Aleksey Yeschenko commented on CASSANDRA-5442:
--

LGTM (attached the rebased patch, too).

 Add support for specifying CAS commit CL
 

 Key: CASSANDRA-5442
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5442
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 2.0

 Attachments: 5442-rebased.txt






git commit: Add support for specifying CAS commit ConsistencyLevel patch by jbellis; reviewed by aleksey for CASSANDRA-5442

2013-05-27 Thread jbellis
Updated Branches:
  refs/heads/trunk 0387cf587 - bc3597d35


Add support for specifying CAS commit ConsistencyLevel
patch by jbellis; reviewed by aleksey for CASSANDRA-5442


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bc3597d3
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bc3597d3
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bc3597d3

Branch: refs/heads/trunk
Commit: bc3597d3549850997fd137cc8b74700c62cebf64
Parents: 0387cf5
Author: Jonathan Ellis jbel...@apache.org
Authored: Mon May 27 14:31:44 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon May 27 14:31:44 2013 -0500

--
 CHANGES.txt|2 +-
 interface/cassandra.thrift |   14 +-
 .../org/apache/cassandra/thrift/Cassandra.java |  147 +-
 .../apache/cassandra/thrift/TimedOutException.java |  136 +-
 .../cql3/statements/ModificationStatement.java |2 +-
 .../org/apache/cassandra/db/ConsistencyLevel.java  |   13 ++
 .../org/apache/cassandra/db/WriteResponse.java |4 +-
 .../org/apache/cassandra/service/StorageProxy.java |   54 +-
 .../cassandra/service/paxos/CommitVerbHandler.java |7 +
 .../apache/cassandra/thrift/CassandraServer.java   |4 +-
 .../apache/cassandra/thrift/ThriftConversion.java  |2 +
 test/system/test_thrift_server.py  |9 +-
 12 files changed, 354 insertions(+), 40 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/bc3597d3/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 0ce9e63..e233ba0 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -9,7 +9,7 @@
  * Removed compatibility with pre-1.2.5 sstables and network messages
(CASSANDRA-5511)
  * removed PBSPredictor (CASSANDRA-5455)
- * CAS support (CASSANDRA-5062, 5441, 5443)
+ * CAS support (CASSANDRA-5062, 5441, 5442, 5443)
  * Leveled compaction performs size-tiered compactions in L0 
(CASSANDRA-5371, 5439)
  * Add yaml network topology snitch for mixed ec2/other envs (CASSANDRA-5339)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/bc3597d3/interface/cassandra.thrift
--
diff --git a/interface/cassandra.thrift b/interface/cassandra.thrift
index 1e78d51..cfdaf88 100644
--- a/interface/cassandra.thrift
+++ b/interface/cassandra.thrift
@@ -148,10 +148,17 @@ exception TimedOutException {
  */
 1: optional i32 acknowledged_by
 
-/**
- * in case of atomic_batch_mutate method this field tells if the batch was 
written to the batchlog.
+/** 
+ * in case of atomic_batch_mutate method this field tells if the batch 
+ * was written to the batchlog.  
  */
 2: optional bool acknowledged_by_batchlog
+
+/** 
+ * for the CAS method, this field tells if we timed out during the paxos
+ * protocol, as opposed to during the commit of our update
+ */
+3: optional bool paxos_in_progress
 }
 
 /** invalid authentication request (invalid keyspace, user does not exist, or 
credentials invalid) */
@@ -643,7 +650,8 @@ service Cassandra {
   bool cas(1:required binary key, 
2:required string column_family,
3:list<Column> expected,
-   4:list<Column> updates)
+   4:list<Column> updates,
+   5:required ConsistencyLevel 
consistency_level=ConsistencyLevel.QUORUM)
throws (1:InvalidRequestException ire, 2:UnavailableException ue, 
3:TimedOutException te),
 
   /**

http://git-wip-us.apache.org/repos/asf/cassandra/blob/bc3597d3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
diff --git 
a/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java 
b/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
index 34aec98..a2761ca 100644
--- a/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
+++ b/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
@@ -170,8 +170,9 @@ public class Cassandra {
  * @param column_family
  * @param expected
  * @param updates
+ * @param consistency_level
  */
-public boolean cas(ByteBuffer key, String column_family, List<Column> 
expected, List<Column> updates) throws InvalidRequestException, 
UnavailableException, TimedOutException, org.apache.thrift.TException;
+public boolean cas(ByteBuffer key, String column_family, List<Column> 
expected, List<Column> updates, ConsistencyLevel consistency_level) throws 
InvalidRequestException, UnavailableException, TimedOutException, 
org.apache.thrift.TException;
 
 /**
  * Remove data from the row 

[jira] [Commented] (CASSANDRA-5442) Add support for specifying CAS commit CL

2013-05-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667919#comment-13667919
 ] 

Jonathan Ellis commented on CASSANDRA-5442:
---

committed, thanks!

 Add support for specifying CAS commit CL
 

 Key: CASSANDRA-5442
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5442
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 2.0

 Attachments: 5442-rebased.txt






[jira] [Commented] (CASSANDRA-5544) Hadoop jobs assigns only one mapper in task

2013-05-27 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667971#comment-13667971
 ] 

Alex Liu commented on CASSANDRA-5544:
-

[~shamim] How many splits do you get for each hadoop node? You can set 
ConfigHelper.setInputSplitSize to a smaller number to get more mappers for your 
pig job. The existing CassandraStorage class doesn't set it, so it uses the 
default value of 64k. So if your nodes have fewer than 64k rows each, each will 
have only one mapper.
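The arithmetic behind that can be illustrated with a small sketch (not actual Cassandra code; `estimatedMappers` is a hypothetical helper): with the default split size of 64k rows, a node holding fewer rows yields a single split, hence one mapper, while a smaller split size raises the count.

```java
public class SplitMath {
    /** Rough estimate of the number of input splits (hence mappers) a node
     *  yields: ceil(rows / splitSize), with a minimum of one split.
     *  (Hypothetical helper for illustration only.) */
    static long estimatedMappers(long rowsOnNode, long inputSplitSize) {
        return Math.max(1, (rowsOnNode + inputSplitSize - 1) / inputSplitSize);
    }

    public static void main(String[] args) {
        System.out.println(estimatedMappers(50_000, 65_536)); // 1 (under the 64k default)
        System.out.println(estimatedMappers(50_000, 8_192));  // 7 (smaller split size)
    }
}
```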

 Hadoop jobs assigns only one mapper in task 
 

 Key: CASSANDRA-5544
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5544
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.1
 Environment: Red hat linux 5.4, Hadoop 1.0.3, pig 0.11.1
Reporter: Shamim Ahmed
Assignee: Alex Liu
 Attachments: Screen Shot 2013-05-26 at 4.49.48 PM.png





[jira] [Commented] (CASSANDRA-5544) Hadoop jobs assigns only one mapper in task

2013-05-27 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667973#comment-13667973
 ] 

Alex Liu commented on CASSANDRA-5544:
-

Some changes have been made to the CassandraColumnInputFormat class since 1.1.5,
e.g.:
add describe_splits_ex providing improved split size estimate
patch by Piotr Kolaczkowski; reviewed by jbellis for CASSANDRA-4803

 Hadoop jobs assigns only one mapper in task 
 

 Key: CASSANDRA-5544
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5544
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.1
 Environment: Red hat linux 5.4, Hadoop 1.0.3, pig 0.11.1
Reporter: Shamim Ahmed
Assignee: Alex Liu
 Attachments: Screen Shot 2013-05-26 at 4.49.48 PM.png





[jira] [Commented] (CASSANDRA-5544) Hadoop jobs assigns only one mapper in task

2013-05-27 Thread Cyril Scetbon (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667978#comment-13667978
 ] 

Cyril Scetbon commented on CASSANDRA-5544:
--

[~alexliu68] I did some tests with more than 64k rows and had only one mapper 
for the whole cluster. Even if we have fewer than 64k rows, why don't we have at 
least one mapper per node (in my case replication_factor=1) to work on rows 
using data locality? Vnodes are enabled on my cluster; could there be a relation 
with this option?

 Hadoop jobs assigns only one mapper in task 
 

 Key: CASSANDRA-5544
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5544
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.1
 Environment: Red hat linux 5.4, Hadoop 1.0.3, pig 0.11.1
Reporter: Shamim Ahmed
Assignee: Alex Liu
 Attachments: Screen Shot 2013-05-26 at 4.49.48 PM.png





[jira] [Commented] (CASSANDRA-5544) Hadoop jobs assigns only one mapper in task

2013-05-27 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667980#comment-13667980
 ] 

Alex Liu commented on CASSANDRA-5544:
-

Yes, if vnodes are enabled, a lot of smaller splits are created (which is not 
preferred; we will fix the too-many-small-splits issue for Hadoop with vnodes 
later), so can you test it with vnodes disabled?

 Hadoop jobs assigns only one mapper in task 
 

 Key: CASSANDRA-5544
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5544
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.1
 Environment: Red hat linux 5.4, Hadoop 1.0.3, pig 0.11.1
Reporter: Shamim Ahmed
Assignee: Alex Liu
 Attachments: Screen Shot 2013-05-26 at 4.49.48 PM.png





[jira] [Commented] (CASSANDRA-5544) Hadoop jobs assigns only one mapper in task

2013-05-27 Thread Cyril Scetbon (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667982#comment-13667982
 ] 

Cyril Scetbon commented on CASSANDRA-5544:
--

But if there are many small splits, doesn't that mean we should have more 
mappers? I'm saying that because you proposed to [~shamim_ru] to decrease 
ConfigHelper.setInputSplitSize exactly for that reason, right?
I need one more day to test without vnodes.

 Hadoop jobs assigns only one mapper in task 
 

 Key: CASSANDRA-5544
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5544
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.1
 Environment: Red hat linux 5.4, Hadoop 1.0.3, pig 0.11.1
Reporter: Shamim Ahmed
Assignee: Alex Liu
 Attachments: Screen Shot 2013-05-26 at 4.49.48 PM.png





[jira] [Commented] (CASSANDRA-5544) Hadoop jobs assigns only one mapper in task

2013-05-27 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667984#comment-13667984
 ] 

Alex Liu commented on CASSANDRA-5544:
-

The current implementation matches exactly one mapper to each split. The 
existing code doesn't set InputSplitSize (which means we can't change it to a 
smaller number unless we change the code in the setLocation method to do it), so 
we need more than 64k rows to have more than one mapper per node.

For vnodes we need to support a virtual split which combines multiple small 
splits. 
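A rough, self-contained sketch of that idea (hypothetical `SplitCombiner` class, using plain per-split row counts instead of real token ranges) would greedily pack consecutive small vnode splits into virtual splits of roughly the target size:

```java
import java.util.*;

public class SplitCombiner {
    /** Greedily pack consecutive small splits (sizes in rows) into virtual
     *  splits of at least targetRows each. Illustrative sketch only; a real
     *  implementation would also have to keep the combined ranges
     *  co-located on the same replica. */
    static List<List<Long>> combine(List<Long> splitSizes, long targetRows) {
        List<List<Long>> virtualSplits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long accumulated = 0;
        for (long size : splitSizes) {
            current.add(size);
            accumulated += size;
            if (accumulated >= targetRows) {     // virtual split is big enough
                virtualSplits.add(current);
                current = new ArrayList<>();
                accumulated = 0;
            }
        }
        if (!current.isEmpty())
            virtualSplits.add(current);          // leftover tail
        return virtualSplits;
    }

    public static void main(String[] args) {
        // 256 vnode splits of ~1k rows each, packed toward the 64k default:
        List<Long> small = new ArrayList<>(Collections.nCopies(256, 1_000L));
        System.out.println(combine(small, 65_536L).size()); // 4 virtual splits
    }
}
```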

 Hadoop jobs assigns only one mapper in task 
 

 Key: CASSANDRA-5544
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5544
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.1
 Environment: Red hat linux 5.4, Hadoop 1.0.3, pig 0.11.1
Reporter: Shamim Ahmed
Assignee: Alex Liu
 Attachments: Screen Shot 2013-05-26 at 4.49.48 PM.png





[jira] [Commented] (CASSANDRA-5544) Hadoop jobs assigns only one mapper in task

2013-05-27 Thread Cyril Scetbon (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667986#comment-13667986
 ] 

Cyril Scetbon commented on CASSANDRA-5544:
--

Okay. I'll test without vnodes and give you feedback, unless [~shamim_ru] 
confirms that he didn't use vnodes, which I suppose is the case since he 
upgraded from C* 1.1.5 to 1.2.1.

 Hadoop jobs assigns only one mapper in task 
 

 Key: CASSANDRA-5544
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5544
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.1
 Environment: Red hat linux 5.4, Hadoop 1.0.3, pig 0.11.1
Reporter: Shamim Ahmed
Assignee: Alex Liu
 Attachments: Screen Shot 2013-05-26 at 4.49.48 PM.png





[jira] [Commented] (CASSANDRA-5455) Remove PBSPredictor

2013-05-27 Thread Peter Bailis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13668054#comment-13668054
 ] 

Peter Bailis commented on CASSANDRA-5455:
-

bq. Do we need any core changes at all, then? (Under the #3 for now plan.)

Nope; the predictor I linked uses the per-CF latency metrics. The downside is 
accuracy.


 Remove PBSPredictor
 ---

 Key: CASSANDRA-5455
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5455
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 2.0

 Attachments: 5455.txt


 It was a fun experiment, but it's unmaintained and the bar to understanding 
 what is going on is high.  Case in point: PBSTest has been failing 
 intermittently for some time now, possibly even since it was created.  Or 
 possibly not and it was a regression from a refactoring we did.  Who knows?
