[jira] [Commented] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name

2011-09-28 Thread Steeve Morin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116818#comment-13116818
 ] 

Steeve Morin commented on CASSANDRA-2810:
-

Fixed it for me on Pig 0.9 and Cassandra 0.8.6 (Brisk).

 RuntimeException in Pig when using dump command on column name
 

 Key: CASSANDRA-2810
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2810
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.1
 Environment: Ubuntu 10.10, 32 bits
 java version 1.6.0_24
 Brisk beta-2 installed from Debian packages
Reporter: Silvère Lestang
Assignee: Brandon Williams
 Attachments: 2810-v2.txt, 2810-v3.txt, 2810.txt


 This bug was previously reported on the [Brisk bug tracker|https://datastax.jira.com/browse/BRISK-232].
 In cassandra-cli:
 {code}
 [default@unknown] create keyspace Test
 with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
 and strategy_options = [{replication_factor:1}];
 [default@unknown] use Test;
 Authenticated to keyspace: Test
 [default@Test] create column family test;
 [default@Test] set test[ascii('row1')][long(1)]=integer(35);
 set test[ascii('row1')][long(2)]=integer(36);
 set test[ascii('row1')][long(3)]=integer(38);
 set test[ascii('row2')][long(1)]=integer(45);
 set test[ascii('row2')][long(2)]=integer(42);
 set test[ascii('row2')][long(3)]=integer(33);
 [default@Test] list test;
 Using default limit of 100
 ---
 RowKey: 726f7731
 => (column=0001, value=35, timestamp=1308744931122000)
 => (column=0002, value=36, timestamp=1308744931124000)
 => (column=0003, value=38, timestamp=1308744931125000)
 ---
 RowKey: 726f7732
 => (column=0001, value=45, timestamp=1308744931127000)
 => (column=0002, value=42, timestamp=1308744931128000)
 => (column=0003, value=33, timestamp=1308744932722000)
 2 Rows Returned.
 [default@Test] describe keyspace;
 Keyspace: Test:
   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
   Durable Writes: true
 Options: [replication_factor:1]
   Column Families:
 ColumnFamily: test
   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
   Default column value validator: org.apache.cassandra.db.marshal.BytesType
   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
   Row cache size / save period in seconds: 0.0/0
   Key cache size / save period in seconds: 20.0/14400
   Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
   GC grace seconds: 864000
   Compaction min/max thresholds: 4/32
   Read repair chance: 1.0
   Replicate on write: false
   Built indexes: []
 {code}
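 The RowKey values printed by list above are the hex encoding of the raw key bytes, which is expected because the key validation class is BytesType. As a minimal sketch, the following decodes them back to the ASCII keys that were inserted. RowKeyHex and decode are hypothetical names used only for illustration; they are not part of cassandra-cli.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of cassandra-cli): turns the hex string that
// `list` prints for a BytesType row key back into ASCII text.
final class RowKeyHex {
    static String decode(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            // Each pair of hex digits encodes one byte of the key.
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(decode("726f7731")); // row1
        System.out.println(decode("726f7732")); // row2
    }
}
```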
 In Pig command line:
 {code}
 grunt> test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS (rowkey:chararray, columns: bag {T: (name:long, value:int)});
 grunt> value_test = foreach test generate rowkey, columns.name, columns.value;
 grunt> dump value_test;
 {code}
 In /var/log/cassandra/system.log, this exception appears several times:
 {code}
 INFO [IPC Server handler 3 on 8012] 2011-06-22 15:03:28,533 TaskInProgress.java (line 551) Error from attempt_201106210955_0051_m_00_3: java.lang.RuntimeException: Unexpected data type -1 found in stream.
   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:478)
   at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
   at org.apache.pig.data.BinInterSedes.writeBag(BinInterSedes.java:522)
   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:361)
   at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:357)
   at org.apache.pig.impl.io.InterRecordWriter.write(InterRecordWriter.java:73)
   at org.apache.pig.impl.io.InterStorage.putNext(InterStorage.java:87)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
   at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:638)
   at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:239)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)

[jira] [Commented] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name

2011-09-28 Thread Jeremy Hanna (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116825#comment-13116825
 ] 

Jeremy Hanna commented on CASSANDRA-2810:
-

+1 - if we find any issues with it in production, we'll submit bug reports.


[jira] [Commented] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name

2011-09-28 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116856#comment-13116856
 ] 

Hudson commented on CASSANDRA-2810:
---

Integrated in Cassandra-0.8 #348 (See 
[https://builds.apache.org/job/Cassandra-0.8/348/])
Fix handling of integer types in pig.
Patch by brandonwilliams, reviewed by Jeremy Hanna for CASSANDRA-2810

brandonwilliams : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1177084
Files : 
* /cassandra/branches/cassandra-0.8/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
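
The commit message says only that the handling of integer types was fixed. As a hedged sketch of what such a fix could look like, the code below assumes that Cassandra's IntegerType decodes column bytes into a java.math.BigInteger and that the value must be narrowed to a plain Integer before Pig can serialize it. IntegerTypeToPig, toPigValue and decodeIntegerType are hypothetical names, and the committed patch may well differ.

```java
import java.math.BigInteger;

// Hypothetical sketch, not the committed patch: narrows BigInteger values
// (assumed to come from IntegerType columns) to Integer.
final class IntegerTypeToPig {
    // Returns an Integer for BigInteger input and passes other values through.
    // intValueExact throws ArithmeticException instead of silently truncating
    // values that do not fit in 32 bits.
    static Object toPigValue(Object value) {
        if (value instanceof BigInteger) {
            return ((BigInteger) value).intValueExact();
        }
        return value;
    }

    // IntegerType is assumed to store big-endian two's-complement bytes,
    // which is the format the BigInteger(byte[]) constructor reads.
    static Object decodeIntegerType(byte[] raw) {
        return toPigValue(new BigInteger(raw));
    }

    public static void main(String[] args) {
        System.out.println(decodeIntegerType(new byte[] {0x23})); // 35
    }
}
```

Rejecting out-of-range values is a design choice of this sketch; a loader could instead widen to long or fall back to a byte array.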


 Fix For: 0.8.7

[jira] [Commented] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name

2011-07-08 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062128#comment-13062128
 ] 

Brandon Williams commented on CASSANDRA-2810:
-

So is the conclusion that this patch by itself works fine, but there is a 
problem with CASSANDRA-2777?


[jira] [Commented] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name

2011-06-27 Thread Silvère Lestang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055521#comment-13055521
 ] 

Silvère Lestang commented on CASSANDRA-2810:


I tried again after applying [^2810.txt] and the patch from [CASSANDRA-2777], 
and the bug is still there.
With the patch, you need to replace
{code}
test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS (rowkey:chararray, columns: bag {T: (name:long, value:int)});
{code}
with
{code}
test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS ();
{code}
because CassandraStorage takes care of the schema.

I tried:
{code}
grunt> describe test;
test: {key: chararray,columns: {(name: long,value: int)}}
{code}
This shows that the patch from CASSANDRA-2777 works correctly (I also tested with 
different types for value).
But when I dump test, I still get the same exception.
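
To illustrate how a schema like the one printed by describe could be derived, here is a minimal sketch. It assumes the loader maps each Cassandra marshal class to a Pig type by class name and falls back to bytearray for anything it does not recognize. PigSchemaSketch and its mapping table are hypothetical and are not taken from CassandraStorage.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: maps Cassandra marshal class names to Pig type names.
final class PigSchemaSketch {
    private static final Map<String, String> PIG_TYPES = new HashMap<>();
    static {
        PIG_TYPES.put("AsciiType", "chararray");
        PIG_TYPES.put("UTF8Type", "chararray");
        PIG_TYPES.put("LongType", "long");
        PIG_TYPES.put("IntegerType", "int");
    }

    // Accepts either a simple or a fully qualified class name.
    static String pigType(String marshalClass) {
        String simple = marshalClass.substring(marshalClass.lastIndexOf('.') + 1);
        return PIG_TYPES.getOrDefault(simple, "bytearray");
    }

    // Builds a schema string in the same layout that describe prints.
    static String describe(String keyValidator, String comparator, String valueValidator) {
        return "{key: " + pigType(keyValidator)
                + ",columns: {(name: " + pigType(comparator)
                + ",value: " + pigType(valueValidator) + ")}}";
    }

    public static void main(String[] args) {
        System.out.println(describe("AsciiType", "LongType", "IntegerType"));
    }
}
```

With AsciiType keys, a LongType comparator and an IntegerType validator, this sketch produces the same schema string as the describe output above. A correct schema does not by itself guarantee that the values placed in each tuple are of a Java type Pig can serialize.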


[jira] [Commented] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name

2011-06-27 Thread Silvère Lestang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055585#comment-13055585
 ] 

Silvère Lestang commented on CASSANDRA-2810:


After more testing (with both patches), patch [^2810.txt] doesn't seem to solve 
the bug.
Here is a new test case:
Create a _Test_ keyspace and a _test_ column family with key_validation_class = 
'AsciiType', comparator = 'LongType' and default_validation_class = 
'IntegerType' (don't use the cli because of [#CASSANDRA-2831]).
Insert some data:
{code}
set test[ascii('row1')][long(1)]=integer(35);
set test[ascii('row1')][long(2)]=integer(36);
set test[ascii('row1')][long(3)]=integer(38);
set test[ascii('row2')][long(1)]=integer(45);
set test[ascii('row2')][long(2)]=integer(42);
set test[ascii('row2')][long(3)]=integer(33);
{code}

In Pig cli:
{code}
test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS ();
dump test;
{code}
The same exception as before is raised:
{code}
 INFO [IPC Server handler 4 on 8012] 2011-06-27 16:40:28,562 TaskInProgress.java (line 551) Error from attempt_201106271436_0012_m_00_1: java.lang.RuntimeException: Unexpected data type -1 found in stream.
    at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:478)
    at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
    at org.apache.pig.data.BinInterSedes.writeBag(BinInterSedes.java:522)
    at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:361)
    at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
    at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:357)
    at org.apache.pig.impl.io.InterRecordWriter.write(InterRecordWriter.java:73)
    at org.apache.pig.impl.io.InterStorage.putNext(InterStorage.java:87)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:638)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:224)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.Child.main(Child.java:253)

{code}
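
One plausible reading of "Unexpected data type -1" is that the serializer classifies each value by its Java class and uses -1 as the code for a class it does not recognize. The sketch below is a simplified model of that behaviour, not Pig's actual code. TypeCodeSketch, its numeric codes and the short list of supported classes are all assumptions made for illustration.

```java
import java.math.BigInteger;

// Hypothetical, simplified model of a serializer that dispatches on type codes.
final class TypeCodeSketch {
    static final byte ERROR = -1;      // no matching type (illustrative value)
    static final byte INTEGER = 10;    // illustrative value
    static final byte LONG = 15;       // illustrative value
    static final byte CHARARRAY = 55;  // illustrative value

    // Classifies a value by its Java class; unknown classes map to ERROR.
    static byte findType(Object o) {
        if (o instanceof Integer) return INTEGER;
        if (o instanceof Long) return LONG;
        if (o instanceof String) return CHARARRAY;
        return ERROR;
    }

    // Refuses to write values it cannot classify, mirroring the message in the log.
    static void writeDatum(Object o) {
        byte type = findType(o);
        if (type == ERROR) {
            throw new RuntimeException("Unexpected data type " + type + " found in stream.");
        }
        // A real serializer would write the type code and the value here.
    }

    public static void main(String[] args) {
        writeDatum(35); // accepted: Integer is a known type
        try {
            writeDatum(BigInteger.valueOf(35)); // rejected: BigInteger is not classified
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this model, a BigInteger produced for an IntegerType column value would trigger the exception even when the declared schema says int.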


[jira] [Commented] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name

2011-06-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054485#comment-13054485
 ] 

Jonathan Ellis commented on CASSANDRA-2810:
---

DataByteArray is some kind of Pig thing?


[jira] [Commented] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name

2011-06-24 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054494#comment-13054494
 ] 

Brandon Williams commented on CASSANDRA-2810:
-

Yes, basically a byte array, but it's the pig type.

 RuntimeException in Pig when using dump command on column name
 

 Key: CASSANDRA-2810
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2810
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.1
 Environment: Ubuntu 10.10, 32 bits
 java version 1.6.0_24
 Brisk beta-2 installed from Debian packages
Reporter: Silvère Lestang
Assignee: Brandon Williams
 Attachments: 2810.txt


 This bug was previously reported on the [Brisk bug 
 tracker|https://datastax.jira.com/browse/BRISK-232].
 In cassandra-cli:
 {code}
 [default@unknown] create keyspace Test
 with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
 and strategy_options = [{replication_factor:1}];
 [default@unknown] use Test;
 Authenticated to keyspace: Test
 [default@Test] create column family test;
 [default@Test] set test[ascii('row1')][long(1)]=integer(35);
 set test[ascii('row1')][long(2)]=integer(36);
 set test[ascii('row1')][long(3)]=integer(38);
 set test[ascii('row2')][long(1)]=integer(45);
 set test[ascii('row2')][long(2)]=integer(42);
 set test[ascii('row2')][long(3)]=integer(33);
 [default@Test] list test;
 Using default limit of 100
 ---
 RowKey: 726f7731
 => (column=0001, value=35, timestamp=1308744931122000)
 => (column=0002, value=36, timestamp=1308744931124000)
 => (column=0003, value=38, timestamp=1308744931125000)
 ---
 RowKey: 726f7732
 => (column=0001, value=45, timestamp=1308744931127000)
 => (column=0002, value=42, timestamp=1308744931128000)
 => (column=0003, value=33, timestamp=1308744932722000)
 2 Rows Returned.
 [default@Test] describe keyspace;
 Keyspace: Test:
   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
   Durable Writes: true
 Options: [replication_factor:1]
   Column Families:
 ColumnFamily: test
   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
   Default column value validator: org.apache.cassandra.db.marshal.BytesType
   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
   Row cache size / save period in seconds: 0.0/0
   Key cache size / save period in seconds: 20.0/14400
   Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
   GC grace seconds: 864000
   Compaction min/max thresholds: 4/32
   Read repair chance: 1.0
   Replicate on write: false
   Built indexes: []
 {code}
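 Editorial note: the RowKey values in the listing above are printed as hex because the key validation class is BytesType. The following is a minimal Python sketch, not part of the original report, showing that they decode back to the ASCII keys that were set; the helper name decode_row_key is hypothetical.

```python
def decode_row_key(hex_key: str) -> str:
    """Decode a hex-encoded row key, as printed by cassandra-cli, to ASCII text."""
    # bytes.fromhex turns "726f7731" into b"row1"; the keys were written with ascii(...)
    return bytes.fromhex(hex_key).decode("ascii")


if __name__ == "__main__":
    # The two RowKey values taken from the `list test` output above
    for key in ("726f7731", "726f7732"):
        print(key, "->", decode_row_key(key))
```

 The same raw-bytes representation is what a loader sees when no more specific validator is declared on the column family.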
 In Pig command line:
 {code}
 grunt> test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS (rowkey:chararray, columns: bag {T: (name:long, value:int)});
 grunt> value_test = foreach test generate rowkey, columns.name, columns.value;
 grunt> dump value_test;
 {code}
 In /var/log/cassandra/system.log, this exception appears several times:
 {code}
 INFO [IPC Server handler 3 on 8012] 2011-06-22 15:03:28,533 TaskInProgress.java (line 551) Error from attempt_201106210955_0051_m_00_3: java.lang.RuntimeException: Unexpected data type -1 found in stream.
   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:478)
   at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
   at org.apache.pig.data.BinInterSedes.writeBag(BinInterSedes.java:522)
   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:361)
   at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:357)
   at org.apache.pig.impl.io.InterRecordWriter.write(InterRecordWriter.java:73)
   at org.apache.pig.impl.io.InterStorage.putNext(InterStorage.java:87)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
   at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:638)
   at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:239)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
   at