[ 
https://issues.apache.org/jira/browse/CASSANDRA-11574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251361#comment-15251361
 ] 

Stefania commented on CASSANDRA-11574:
--------------------------------------

bq. spark already took the available 1 core in that machine which Cassandra is 
getting zero for that value. This is the main problem I guess. please let me 
know if this is issue.

No, I don't think so. {{get_num_processes()}} never returns zero, because it 
clamps its result to at least 1. It uses {{get_num_cores()}}, which relies on 
{{mp.cpu_count()}} (doc 
[here|https://docs.python.org/2/library/multiprocessing.html]). That call 
returns the number of CPUs in the system, so it would not know that Spark has 
taken one. Besides, even if the system has only one core, COPY should still 
work with 1 process. We have an environment variable 
({{CQLSH_COPY_TEST_NUM_CORES}}) that we set to simulate single-core machines in 
our tests, and DataStax tested the COPY code on single-core VMs as well. Also, 
in case this wasn't clear before, it is the cores of the machine that runs 
cqlsh that matter, not those of the machine that runs the Cassandra server.

Are you still getting the exact same error with the two lines above? What 
happens if you don't call {{get_num_processes}} at all and hard-code 
{{num_processes}} to 1: does that work?

Full code here:

{code}
    @staticmethod
    def get_num_processes(cap):
        """
        Pick a reasonable number of child processes. We need to leave at
        least one core for the parent or feeder process.
        """
        return max(1, min(cap, CopyTask.get_num_cores() - 1))

    @staticmethod
    def get_num_cores():
        """
        Return the number of cores if available. If the test environment
        variable is set, then return the number carried by this variable.
        This is to test single-core machine more easily.
        """
        try:
            num_cores_for_testing = os.environ.get('CQLSH_COPY_TEST_NUM_CORES', '')
            ret = int(num_cores_for_testing) if num_cores_for_testing else mp.cpu_count()
            printdebugmsg("Detected %d core(s)" % (ret,))
            return ret
        except NotImplementedError:
            printdebugmsg("Failed to detect number of cores, returning 1")
            return 1
{code}
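
To check the "never returns zero" claim without running cqlsh, here is a 
minimal standalone sketch of the same arithmetic. It is not the cqlsh code 
itself. The module-level functions, the {{env}} parameter and the explicit 
{{num_cores}} argument are assumptions made so the example runs on its own; 
only the {{max(1, min(cap, cores - 1))}} formula and the 
{{CQLSH_COPY_TEST_NUM_CORES}} variable come from the code above.

```python
import multiprocessing as mp
import os


def get_num_cores(env=os.environ):
    """Hypothetical standalone stand-in for CopyTask.get_num_cores().

    Uses the test override if it is set, otherwise asks the OS.
    """
    override = env.get('CQLSH_COPY_TEST_NUM_CORES', '')
    try:
        return int(override) if override else mp.cpu_count()
    except NotImplementedError:
        # cpu_count() may raise this on platforms where it cannot tell.
        return 1


def get_num_processes(cap, num_cores):
    """Same formula as above: leave one core for the parent, never go below 1."""
    return max(1, min(cap, num_cores - 1))


if __name__ == '__main__':
    # Even with a single core the result is clamped to 1, not 0.
    for cores in (1, 2, 4, 8):
        print("%d core(s) -> %d process(es)" % (cores, get_num_processes(16, cores)))
```

With a cap of 16 this gives 1, 1, 3 and 7 processes for 1, 2, 4 and 8 cores, 
so a single-core machine (or one where Spark is busy) still ends up with one 
child process.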

> COPY FROM command in cqlsh throws error
> ---------------------------------------
>
>                 Key: CASSANDRA-11574
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11574
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CQL
>         Environment: Operating System: Ubuntu Server 14.04
> JDK: Oracle JDK 8 update 77
> Python: 2.7.6
>            Reporter: Mahafuzur Rahman
>            Assignee: Stefania
>             Fix For: 3.0.6
>
>
> Any COPY FROM command in cqlsh is throwing the following error:
> "get_num_processes() takes no keyword arguments"
> Example command: 
> COPY inboxdata 
> (to_user_id,to_user_network,created_time,attachments,from_user_id,from_user_name,from_user_network,id,message,to_user_name,updated_time)
>  FROM 'inbox.csv';
> Similar commands worked perfectly in previous versions such as 3.0.4



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
