[jira] [Updated] (CASSANDRA-8471) mapred/hive queries fail when there is just 1 node down RF is 1
[ https://issues.apache.org/jira/browse/CASSANDRA-8471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Piotr Kołaczkowski updated CASSANDRA-8471:
------------------------------------------
    Reviewer: Piotr Kołaczkowski

mapred/hive queries fail when there is just 1 node down RF is 1
----------------------------------------------------------------

                Key: CASSANDRA-8471
                URL: https://issues.apache.org/jira/browse/CASSANDRA-8471
            Project: Cassandra
         Issue Type: Bug
         Components: Hadoop
           Reporter: Artem Aliev
             Labels: easyfix, hadoop, patch
            Fix For: 2.0.12, 2.1.3
        Attachments: cassandra-2.0-8471.txt

Hive and MapReduce queries fail when just 1 node is down, even with RF=3 (in a 6-node cluster) and the default consistency levels for reads and writes.
The simplest way to reproduce it is to use the DataStax integrated Hadoop environment with Hive.
{quote}
alter keyspace HiveMetaStore WITH replication = {'class':'NetworkTopologyStrategy', 'DC1':3} ;
alter keyspace cfs WITH replication = {'class':'NetworkTopologyStrategy', 'DC1':3} ;
alter keyspace cfs_archive WITH replication = {'class':'NetworkTopologyStrategy', 'DC1':3} ;
CREATE KEYSPACE datamart WITH replication = { 'class': 'NetworkTopologyStrategy', 'DC1': '3' };
CREATE TABLE users1 ( id int, name text, PRIMARY KEY ((id)) )
{quote}
Insert data.
Shut down one Cassandra node.
Run a MapReduce task (Hive in this case):
{quote}
$ dse hive
hive> use datamart;
hive> select count(*) from users1;
{quote}
{quote}
...
...
2014-12-10 18:33:53,090 Stage-1 map = 75%, reduce = 25%, Cumulative CPU 6.39 sec
2014-12-10 18:33:54,093 Stage-1 map = 75%, reduce = 25%, Cumulative CPU 6.39 sec
2014-12-10 18:33:55,096 Stage-1 map = 75%, reduce = 25%, Cumulative CPU 6.39 sec
2014-12-10 18:33:56,099 Stage-1 map = 75%, reduce = 25%, Cumulative CPU 6.39 sec
2014-12-10 18:33:57,102 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 6.39 sec
MapReduce Total cumulative CPU time: 6 seconds 390 msec
Ended Job = job_201412100017_0006 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://i-9d0306706.c.eng-gce-support.internal:50030/jobdetails.jsp?jobid=job_201412100017_0006
Examining task ID: task_201412100017_0006_m_05 (and more) from job job_201412100017_0006
Task with the most failures(4):
-----
Task ID:
  task_201412100017_0006_m_01
URL:
  http://i-9d0306706.c.eng-gce-support.internal:50030/taskdetails.jsp?jobid=job_201412100017_0006&tipid=task_201412100017_0006_m_01
-----
Diagnostic Messages for this Task:
java.io.IOException: java.io.IOException: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: i-6ac985f7d.c.eng-gce-support.internal/10.240.124.16:9042 (com.datastax.driver.core.TransportException: [i-6ac985f7d.c.eng-gce-support.internal/10.240.124.16:9042] Cannot connect))
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:244)
	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:538)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:197)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
	at org.apache.hadoop.mapred.Child.main(Child.java:260)
Caused by: java.io.IOException: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: i-6ac985f7d.c.eng-gce-support.internal/10.240.124.16:9042 (com.datastax.driver.core.TransportException: [i-6ac985f7d.c.eng-gce-support.internal/10.240.124.16:9042] Cannot connect))
	at org.apache.hadoop.hive.cassandra.cql3.input.HiveCqlInputFormat.getRecordReader(HiveCqlInputFormat.java:206)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:241)
	... 9 more
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: i-6ac985f7d.c.eng-gce-support.internal/10.240.124.16:9042 (com.datastax.driver.core.TransportException: [i-6ac985f7d.c.eng-gce-support.internal/10.240.124.16:9042]
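
The "tried:" list in the exception above names a single host, the node that was shut down, although RF=3 means two other replicas should still hold the data. One way to picture the expected behaviour is a reader that walks the replica list and uses the first host that accepts a connection. The Python sketch below illustrates that idea only. It is not the attached patch and does not use the DataStax driver; `connect_to_first_available`, `NoReplicaAvailable`, `fake_connect`, and the 10.0.0.x addresses are all hypothetical names made up for this illustration.

```python
from typing import Callable, Dict, Iterable, TypeVar

T = TypeVar("T")


class NoReplicaAvailable(Exception):
    """Raised only when every candidate replica refused the connection."""

    def __init__(self, errors: Dict[str, Exception]):
        self.errors = errors
        super().__init__(f"all replicas failed: {sorted(errors)}")


def connect_to_first_available(replicas: Iterable[str],
                               connect: Callable[[str], T]) -> T:
    """Try each replica in order and return the first successful connection."""
    errors: Dict[str, Exception] = {}
    for host in replicas:
        try:
            return connect(host)
        except ConnectionError as exc:
            # Record why this host failed, then move on to the next replica.
            errors[host] = exc
    raise NoReplicaAvailable(errors)


def fake_connect(host: str) -> str:
    """Hypothetical stand-in for opening a session; 10.0.0.1 is the stopped node."""
    if host == "10.0.0.1":
        raise ConnectionError(f"{host}: Cannot connect")
    return f"session@{host}"


# With three replicas and one node down, the second replica serves the read.
session = connect_to_first_available(
    ["10.0.0.1", "10.0.0.2", "10.0.0.3"], fake_connect
)
print(session)
```

With this shape, a task fails only when every replica of its split is unreachable, which is the behaviour one would expect at the default consistency level with RF=3.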
[jira] [Updated] (CASSANDRA-8471) mapred/hive queries fail when there is just 1 node down RF is 1
[ https://issues.apache.org/jira/browse/CASSANDRA-8471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Artem Aliev updated CASSANDRA-8471:
-----------------------------------
    Attachment: cassandra-2.0-8471.txt