[GitHub] spark issue #8384: [SPARK-8510] [CORE] [PYSPARK] NumPy matrices as values in...

paberline Sat, 08 Oct 2016 11:41:10 -0700

Github user paberline commented on the issue:

    https://github.com/apache/spark/pull/8384
  
    Hi @holdenk - thanks for taking the time to look at this. I understand this
    PR might be too esoteric for Spark core. I will close it once @mengxr,
    @jkbradley and @davies confirm.
    
    However, your comments yesterday on the Dataset API JIRA issue are
    timely:
https://issues.apache.org/jira/browse/SPARK-12776?focusedCommentId=15557224&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15557224
    
    If the Dataset API were implemented in PySpark, then, as in the Scala API,
    DataFrame = Dataset[Row]. I think this would let me simply convert each
    matrix to a DataFrame and then save the Dataset as a Parquet or sequence
    file. Is my understanding correct?



