[
https://issues.apache.org/jira/browse/CASSANDRA-11574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251361#comment-15251361
]
Stefania commented on CASSANDRA-11574:
--------------------------------------
bq. spark already took the available 1 core in that machine which Cassandra is
getting zero for that value. This is the main problem I guess. please let me
know if this is issue.
No, I don't think so. {{get_num_processes()}} never returns zero. It uses
{{get_num_cores()}}, which relies on {{mp.cpu_count()}}, documented
[here|https://docs.python.org/2/library/multiprocessing.html]. This returns the
number of cores on the system; I don't think it would know that Spark
has taken one. Besides, even if the system has only one core, it should still
work with 1 process. We have an environment variable that we set to simulate
1-core machines in our tests, and DataStax has tested the COPY code on
single-core VMs as well. Also, it is the cores of the machine that runs cqlsh
that matter, not those of the machine that runs the Cassandra server (in case
this wasn't clear before).
Are you still getting the exact same error with the two lines above? What
happens if you don't call {{get_num_processes}} at all and hard-code
{{num_processes}} to 1? Does that work?
Full code here:
{code}
@staticmethod
def get_num_processes(cap):
    """
    Pick a reasonable number of child processes. We need to leave at
    least one core for the parent or feeder process.
    """
    return max(1, min(cap, CopyTask.get_num_cores() - 1))

@staticmethod
def get_num_cores():
    """
    Return the number of cores if available. If the test environment variable
    is set, then return the number carried by this variable. This is to
    test single-core machine more easily.
    """
    try:
        num_cores_for_testing = os.environ.get('CQLSH_COPY_TEST_NUM_CORES', '')
        ret = int(num_cores_for_testing) if num_cores_for_testing else mp.cpu_count()
        printdebugmsg("Detected %d core(s)" % (ret,))
        return ret
    except NotImplementedError:
        printdebugmsg("Failed to detect number of cores, returning 1")
        return 1
{code}
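If it helps to check the clamping outside cqlsh, here is a minimal standalone sketch of the same logic. It is not the cqlsh code itself: the two helpers are copied as plain module-level functions without {{printdebugmsg}}, and the {{simulate}} helper and the cap value of 16 are hypothetical, chosen only for illustration. It uses the same {{CQLSH_COPY_TEST_NUM_CORES}} variable to pretend the machine has a given number of cores.

```python
import multiprocessing as mp
import os

# Name of the test variable read by get_num_cores() in the excerpt above.
TEST_ENV_VAR = 'CQLSH_COPY_TEST_NUM_CORES'


def get_num_cores():
    """Return the simulated core count if the test variable is set, else the real one."""
    try:
        num_cores_for_testing = os.environ.get(TEST_ENV_VAR, '')
        return int(num_cores_for_testing) if num_cores_for_testing else mp.cpu_count()
    except NotImplementedError:
        return 1


def get_num_processes(cap):
    """Leave one core for the parent process, but never go below one worker."""
    return max(1, min(cap, get_num_cores() - 1))


def simulate(num_cores, cap):
    """Hypothetical helper: run get_num_processes as if the machine had num_cores cores."""
    previous = os.environ.get(TEST_ENV_VAR)
    os.environ[TEST_ENV_VAR] = str(num_cores)
    try:
        return get_num_processes(cap)
    finally:
        # Restore the caller's environment so the simulation has no side effects.
        if previous is None:
            del os.environ[TEST_ENV_VAR]
        else:
            os.environ[TEST_ENV_VAR] = previous


if __name__ == '__main__':
    # cap=16 is an arbitrary illustrative value, not necessarily the cqlsh default.
    for cores in (1, 2, 4, 16):
        print("%d core(s) -> %d process(es)" % (cores, simulate(cores, cap=16)))
```

With one simulated core, {{min(cap, 0)}} is 0 and the outer {{max(1, ...)}} lifts it back to 1, so the result should never be zero regardless of what else is running on the machine.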
> COPY FROM command in cqlsh throws error
> ---------------------------------------
>
> Key: CASSANDRA-11574
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11574
> Project: Cassandra
> Issue Type: Bug
> Components: CQL
> Environment: Operating System: Ubuntu Server 14.04
> JDK: Oracle JDK 8 update 77
> Python: 2.7.6
> Reporter: Mahafuzur Rahman
> Assignee: Stefania
> Fix For: 3.0.6
>
>
> Any COPY FROM command in cqlsh is throwing the following error:
> "get_num_processes() takes no keyword arguments"
> Example command:
> COPY inboxdata
> (to_user_id,to_user_network,created_time,attachments,from_user_id,from_user_name,from_user_network,id,message,to_user_name,updated_time)
> FROM 'inbox.csv';
> Similar commands worked perfectly in previous versions such as 3.0.4
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)