[jira] [Commented] (SPARK-6246) spark-ec2 can't handle clusters with 100 nodes
[ https://issues.apache.org/jira/browse/SPARK-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551486#comment-14551486 ]

Shivaram Venkataraman commented on SPARK-6246:
----------------------------------------------

[~srowen] Could you add [~alyaxey] to the developers group and assign this issue?


spark-ec2 can't handle clusters with 100 nodes
----------------------------------------------

                 Key: SPARK-6246
                 URL: https://issues.apache.org/jira/browse/SPARK-6246
             Project: Spark
          Issue Type: Bug
          Components: EC2
    Affects Versions: 1.3.0
            Reporter: Nicholas Chammas
            Priority: Minor
             Fix For: 1.5.0

This appears to be a new restriction, perhaps resulting from our upgrade of boto. Maybe it's a new restriction from EC2. Not sure yet. We didn't have this issue around the Spark 1.1.0 time frame, from what I can remember. I'll track down where the issue is and when it started.

Attempting to launch a cluster with 100 slaves yields the following:

{code}
Spark AMI: ami-35b1885c
Launching instances...
Launched 100 slaves in us-east-1c, regid = r-9c408776
Launched master in us-east-1c, regid = r-92408778
Waiting for AWS to propagate instance metadata...
Waiting for cluster to enter 'ssh-ready' state.
ERROR:boto:400 Bad Request
ERROR:boto:<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InvalidRequest</Code><Message>101 exceeds the maximum number of instance IDs that can be specificied (100). Please specify fewer than 100 instance IDs.</Message></Error></Errors><RequestID>217fd6ff-9afa-4e91-86bc-ab16fcc442d8</RequestID></Response>
Traceback (most recent call last):
  File "./ec2/spark_ec2.py", line 1338, in <module>
    main()
  File "./ec2/spark_ec2.py", line 1330, in main
    real_main()
  File "./ec2/spark_ec2.py", line 1170, in real_main
    cluster_state='ssh-ready'
  File "./ec2/spark_ec2.py", line 795, in wait_for_cluster_state
    statuses = conn.get_all_instance_status(instance_ids=[i.id for i in cluster_instances])
  File "/path/apache/spark/ec2/lib/boto-2.34.0/boto/ec2/connection.py", line 737, in get_all_instance_status
    InstanceStatusSet, verb='POST')
  File "/path/apache/spark/ec2/lib/boto-2.34.0/boto/connection.py", line 1204, in get_object
    raise self.ResponseError(response.status, response.reason, body)
boto.exception.EC2ResponseError: EC2ResponseError: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InvalidRequest</Code><Message>101 exceeds the maximum number of instance IDs that can be specificied (100). Please specify fewer than 100 instance IDs.</Message></Error></Errors><RequestID>217fd6ff-9afa-4e91-86bc-ab16fcc442d8</RequestID></Response>
{code}

This problem seems to be with {{get_all_instance_status()}}, though I am not sure if other methods are affected too.


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-6246) spark-ec2 can't handle clusters with 100 nodes
[ https://issues.apache.org/jira/browse/SPARK-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550637#comment-14550637 ]

Alex commented on SPARK-6246:
-----------------------------

This can be fixed by replacing the following line in ec2/spark_ec2.py:

{code}
statuses = conn.get_all_instance_status(instance_ids=[i.id for i in cluster_instances])
{code}

with the lines:

{code}
max_batch = 100
statuses = []
for j in range((len(cluster_instances) + max_batch - 1) // max_batch):
    statuses.extend(
        conn.get_all_instance_status(
            instance_ids=[i.id for i in cluster_instances[j * max_batch:(j + 1) * max_batch]]))
{code}
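The batching workaround above can be sketched as a self-contained function. This is only an illustrative sketch: `get_statuses_batched`, `FakeInstance`, and `FakeConnection` are hypothetical names, and the stub connection merely imitates the real EC2 API's 100-ID limit so the sketch can run without AWS credentials.

```python
def get_statuses_batched(conn, cluster_instances, max_batch=100):
    """Fetch instance statuses in batches of at most `max_batch` IDs,
    concatenating the per-batch results."""
    statuses = []
    # Ceiling division: number of batches needed to cover all instances.
    n_batches = (len(cluster_instances) + max_batch - 1) // max_batch
    for j in range(n_batches):
        batch = cluster_instances[j * max_batch:(j + 1) * max_batch]
        statuses.extend(conn.get_all_instance_status(
            instance_ids=[i.id for i in batch]))
    return statuses


class FakeInstance:
    """Hypothetical stand-in for a boto instance object (only `.id` is used)."""
    def __init__(self, instance_id):
        self.id = instance_id


class FakeConnection:
    """Hypothetical stand-in for the boto EC2 connection; rejects requests
    naming more than 100 instance IDs, like the real API."""
    def get_all_instance_status(self, instance_ids):
        if len(instance_ids) > 100:
            raise ValueError(
                "%d exceeds the maximum number of instance IDs (100)"
                % len(instance_ids))
        return [(iid, "ok") for iid in instance_ids]


# 101 instances (100 slaves + master) are fetched in two calls (100 + 1)
# without tripping the limit.
instances = [FakeInstance("i-%04x" % k) for k in range(101)]
statuses = get_statuses_batched(FakeConnection(), instances)
```

The `(len(...) + max_batch - 1) // max_batch` expression is the usual integer ceiling division, so a list of exactly 100 IDs still produces a single request.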
[jira] [Commented] (SPARK-6246) spark-ec2 can't handle clusters with 100 nodes
[ https://issues.apache.org/jira/browse/SPARK-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551157#comment-14551157 ]

Alex commented on SPARK-6246:
-----------------------------

[~shivaram] Done. This is my first PR. Do I have to do anything else to contribute to this ticket?
[jira] [Commented] (SPARK-6246) spark-ec2 can't handle clusters with 100 nodes
[ https://issues.apache.org/jira/browse/SPARK-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551150#comment-14551150 ]

Apache Spark commented on SPARK-6246:
-------------------------------------

User 'alyaxey' has created a pull request for this issue:
https://github.com/apache/spark/pull/6267
[jira] [Commented] (SPARK-6246) spark-ec2 can't handle clusters with 100 nodes
[ https://issues.apache.org/jira/browse/SPARK-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550652#comment-14550652 ]

Shivaram Venkataraman commented on SPARK-6246:
----------------------------------------------

Could you send a PR for this?
[jira] [Commented] (SPARK-6246) spark-ec2 can't handle clusters with 100 nodes
[ https://issues.apache.org/jira/browse/SPARK-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547501#comment-14547501 ]

Shivaram Venkataraman commented on SPARK-6246:
----------------------------------------------

I just ran into this problem as well. This definitely does not happen with some of the older versions of the script.
[jira] [Commented] (SPARK-6246) spark-ec2 can't handle clusters with 100 nodes
[ https://issues.apache.org/jira/browse/SPARK-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355403#comment-14355403 ]

Shivaram Venkataraman commented on SPARK-6246:
----------------------------------------------

Hmm, this seems like a bad problem. And it looks like an AWS-side change rather than a boto change, I guess. [~nchammas] Similar to the EC2Box issue above, can we also batch calls to `get_instances` 100 instances at a time?
[jira] [Commented] (SPARK-6246) spark-ec2 can't handle clusters with 100 nodes
[ https://issues.apache.org/jira/browse/SPARK-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354969#comment-14354969 ]

Nicholas Chammas commented on SPARK-6246:
-----------------------------------------

FYI [~shivaram].
[jira] [Commented] (SPARK-6246) spark-ec2 can't handle clusters with 100 nodes
[ https://issues.apache.org/jira/browse/SPARK-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355004#comment-14355004 ]

Sean Owen commented on SPARK-6246:
----------------------------------

The funny thing is, the typo in that error message (specificied) makes it easy to find some corroboration:
https://github.com/skavanagh/EC2Box/issues/8
https://github.com/worksap-ate/aws-sdk/issues/139
Looks like an AWS SDK limit?
[jira] [Commented] (SPARK-6246) spark-ec2 can't handle clusters with 100 nodes
[ https://issues.apache.org/jira/browse/SPARK-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355642#comment-14355642 ]

Nicholas Chammas commented on SPARK-6246:
-----------------------------------------

I dunno, I haven't looked into the problem yet (been out all day), but I'm surprised that everything else works with 100 nodes: creating nodes, destroying them, getting them. It's just the status check call.

If we have to, sure, I'll batch the calls. But I suspect there's a better way to do things. I'm surprised boto doesn't just abstract this problem away. Anyway, I'll look into it and report back.