[jira] [Commented] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name
[ https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116818#comment-13116818 ]

Steeve Morin commented on CASSANDRA-2810:

Fixed it for me on Pig 0.9 and Cassandra 0.8.6 (Brisk).

RuntimeException in Pig when using dump command on column name

Key: CASSANDRA-2810
URL: https://issues.apache.org/jira/browse/CASSANDRA-2810
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.8.1
Environment: Ubuntu 10.10, 32 bits
             java version 1.6.0_24
             Brisk beta-2 installed from Debian packages
Reporter: Silvère Lestang
Assignee: Brandon Williams
Attachments: 2810-v2.txt, 2810-v3.txt, 2810.txt

This bug was previously reported on [Brisk bug tracker|https://datastax.jira.com/browse/BRISK-232].

In cassandra-cli:
{code}
[default@unknown] create keyspace Test with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options = [{replication_factor:1}];
[default@unknown] use Test;
Authenticated to keyspace: Test
[default@Test] create column family test;
[default@Test] set test[ascii('row1')][long(1)]=integer(35);
set test[ascii('row1')][long(2)]=integer(36);
set test[ascii('row1')][long(3)]=integer(38);
set test[ascii('row2')][long(1)]=integer(45);
set test[ascii('row2')][long(2)]=integer(42);
set test[ascii('row2')][long(3)]=integer(33);
[default@Test] list test;
Using default limit of 100
---
RowKey: 726f7731
=> (column=0001, value=35, timestamp=1308744931122000)
=> (column=0002, value=36, timestamp=1308744931124000)
=> (column=0003, value=38, timestamp=1308744931125000)
---
RowKey: 726f7732
=> (column=0001, value=45, timestamp=1308744931127000)
=> (column=0002, value=42, timestamp=1308744931128000)
=> (column=0003, value=33, timestamp=1308744932722000)
2 Rows Returned.
[default@Test] describe keyspace;
Keyspace: Test:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
  Durable Writes: true
    Options: [replication_factor:1]
  Column Families:
    ColumnFamily: test
      Key Validation Class: org.apache.cassandra.db.marshal.BytesType
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Columns sorted by: org.apache.cassandra.db.marshal.BytesType
      Row cache size / save period in seconds: 0.0/0
      Key cache size / save period in seconds: 20.0/14400
      Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Replicate on write: false
      Built indexes: []
{code}

In Pig command line:
{code}
grunt> test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS (rowkey:chararray, columns: bag {T: (name:long, value:int)});
grunt> value_test = foreach test generate rowkey, columns.name, columns.value;
grunt> dump value_test;
{code}

In /var/log/cassandra/system.log, I see this exception several times:
{code}
INFO [IPC Server handler 3 on 8012] 2011-06-22 15:03:28,533 TaskInProgress.java (line 551) Error from attempt_201106210955_0051_m_00_3: java.lang.RuntimeException: Unexpected data type -1 found in stream.
    at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:478)
    at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
    at org.apache.pig.data.BinInterSedes.writeBag(BinInterSedes.java:522)
    at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:361)
    at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
    at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:357)
    at org.apache.pig.impl.io.InterRecordWriter.write(InterRecordWriter.java:73)
    at org.apache.pig.impl.io.InterStorage.putNext(InterStorage.java:87)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:638)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:239)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
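
A side note for readers of the `list test` output in the description: because the column family uses BytesType for its keys, the CLI prints row keys as hexadecimal rather than as the ASCII strings that were inserted. The sketch below is illustrative only and is not part of the report; the class and method names are hypothetical. It decodes the two RowKey values shown in the description back to text.

```java
import java.nio.charset.StandardCharsets;

public class RowKeyHex {
    /** Decodes a hex string such as "726f7731" into the ASCII text it encodes. */
    static String decodeAscii(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd-length hex string: " + hex);
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            // Each pair of hex digits is one byte of the key.
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(decodeAscii("726f7731")); // prints "row1"
        System.out.println(decodeAscii("726f7732")); // prints "row2"
    }
}
```

So the two rows listed are the `row1` and `row2` keys written by the `set` commands.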
[jira] [Commented] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name
[ https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116825#comment-13116825 ]

Jeremy Hanna commented on CASSANDRA-2810:

+1 - if we find any issues with it in production, we'll submit bug reports.
[jira] [Commented] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name
[ https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116856#comment-13116856 ]

Hudson commented on CASSANDRA-2810:

Integrated in Cassandra-0.8 #348 (See [https://builds.apache.org/job/Cassandra-0.8/348/])
    Fix handling of integer types in pig.
Patch by brandonwilliams, reviewed by Jeremy Hanna for CASSANDRA-2810

brandonwilliams : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1177084
Files :
* /cassandra/branches/cassandra-0.8/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java

Fix For: 0.8.7
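
The Hudson note above describes the change only as "Fix handling of integer types in pig" and does not show the patch. One plausible reading, stated here as an assumption rather than as the committed code, is that values of Cassandra's IntegerType arrive as `java.math.BigInteger`, which the Pig serializer in this setup does not recognize (hence "Unexpected data type -1"), so they need converting to a type Pig can write before being placed in a tuple. The sketch below illustrates that idea in plain Java; `PigValueSketch` and `toPigFriendly` are hypothetical names, and a `byte[]` stands in for Pig's DataByteArray so the example has no Pig dependency.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;

public class PigValueSketch {
    /**
     * Hypothetical helper: maps a decoded column name or value onto a Java
     * type that a Pig tuple is assumed to be able to serialize.
     */
    static Object toPigFriendly(Object value) {
        if (value instanceof BigInteger) {
            // Narrow to int; values outside the int range lose their high-order bits.
            return ((BigInteger) value).intValue();
        }
        if (value instanceof ByteBuffer) {
            // Copy the remaining bytes without moving the caller's buffer position.
            ByteBuffer buf = ((ByteBuffer) value).duplicate();
            byte[] copy = new byte[buf.remaining()];
            buf.get(copy);
            return copy; // stand-in for Pig's DataByteArray
        }
        return value; // assumed to be a type Pig already handles, e.g. Long or String
    }

    public static void main(String[] args) {
        System.out.println(toPigFriendly(new BigInteger("35")));           // prints 35
        System.out.println(toPigFriendly(new BigInteger("35")).getClass()); // prints class java.lang.Integer
    }
}
```

Narrowing to `int` matches the `value:int` schema used in the report, but it is a lossy choice for arbitrarily large integers; whether the actual patch makes the same trade-off is not stated in this thread.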
[jira] [Commented] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name
[ https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062128#comment-13062128 ]

Brandon Williams commented on CASSANDRA-2810:

So is the conclusion that this patch by itself works fine, but there is a problem with CASSANDRA-2777?
[jira] [Commented] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name
[ https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055521#comment-13055521 ]

Silvère Lestang commented on CASSANDRA-2810:

I tried again after applying [^2810.txt] and the patch from bug [CASSANDRA-2777], and the bug is still there.
With the patch, you need to replace
{code}
test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS (rowkey:chararray, columns: bag {T: (name:long, value:int)});
{code}
by
{code}
test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS ();
{code}
because CassandraStorage takes care of the schema.
I tried:
{code}
grunt> describe test;
test: {key: chararray,columns: {(name: long,value: int)}}
{code}
so we can see that the patch from bug 2777 works correctly (I also tested with different types for value).
But when I dump test, I still get the same exception.
[jira] [Commented] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name
[ https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055585#comment-13055585 ]

Silvère Lestang commented on CASSANDRA-2810:

After more tests (with both patches), patch [^2810.txt] doesn't seem to solve the bug.
Here is a new test case:
Create a _Test_ keyspace and a _test_ column family with key_validation_class = 'AsciiType' and comparator = 'LongType' and default_validation_class = 'IntegerType' (don't use the cli because of [#CASSANDRA-2831]).
Insert some data:
{code}
set test[ascii('row1')][long(1)]=integer(35);
set test[ascii('row1')][long(2)]=integer(36);
set test[ascii('row1')][long(3)]=integer(38);
set test[ascii('row2')][long(1)]=integer(45);
set test[ascii('row2')][long(2)]=integer(42);
set test[ascii('row2')][long(3)]=integer(33);
{code}
In Pig cli:
{code}
test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS ();
dump test;
{code}
The same exception as before is raised:
{code}
INFO [IPC Server handler 4 on 8012] 2011-06-27 16:40:28,562 TaskInProgress.java (line 551) Error from attempt_201106271436_0012_m_00_1: java.lang.RuntimeException: Unexpected data type -1 found in stream.
    at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:478)
    at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
    at org.apache.pig.data.BinInterSedes.writeBag(BinInterSedes.java:522)
    at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:361)
    at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
    at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:357)
    at org.apache.pig.impl.io.InterRecordWriter.write(InterRecordWriter.java:73)
    at org.apache.pig.impl.io.InterStorage.putNext(InterStorage.java:87)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:638)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:224)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.Child.main(Child.java:253)
{code}
[jira] [Commented] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name
[ https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054485#comment-13054485 ]

Jonathan Ellis commented on CASSANDRA-2810:

DataByteArray is some kind of Pig thing?
[jira] [Commented] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name
[ https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054494#comment-13054494 ]

Brandon Williams commented on CASSANDRA-2810:

Yes, basically a byte array, but it's the pig type.