[ 
https://issues.apache.org/jira/browse/SPARK-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592364#comment-14592364
 ] 

Stephen Carman commented on SPARK-8449:
---------------------------------------

As per your suggestion in 
https://github.com/apache/spark/pull/1290#issuecomment-113262365, and after 
looking at this more, I agree with not making it a compilation step. So I 
suppose we're going to have to require that it be built beforehand. It's a 
shame, though. I found out a couple of things looking at this further:

1. The Java library is just a set of wrappers around the C library using JNI, 
so compiled versions will, it seems, have to be platform-specific.
2. These artifacts don't exist in the main Maven repo, so we're going to have 
to discuss either building them with Spark or some other method of making 
these libraries available. I'm unsure what the best path is here. I figured 
they would publish the artifacts, but that isn't the case. I'm hesitant to add 
a C-related build step to building Spark, as I think that'd be beaten and 
killed with fire by anyone reading the idea.
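
To make point 1 concrete, here is a minimal sketch of why a JNI-backed jar ends 
up platform-specific: the JVM maps one library name to a different native file 
on each OS, and loading fails if that file isn't on java.library.path. The 
library name "jhdf5" and the names NativeLibCheck and platformTag are 
assumptions for illustration only, not something taken from the HDF5 bindings 
or from Spark.

```java
import java.util.Locale;

class NativeLibCheck {
    // Assumed base name of the native library; purely illustrative.
    static final String LIB_NAME = "jhdf5";

    /** Builds a tag such as "linux-amd64" from os.name / os.arch style values. */
    static String platformTag(String osName, String osArch) {
        String os = osName.toLowerCase(Locale.ROOT);
        String family = os.startsWith("windows") ? "windows"
                      : os.startsWith("mac") ? "macos"
                      : os.startsWith("linux") ? "linux"
                      : "other";
        return family + "-" + osArch.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String tag = platformTag(System.getProperty("os.name"),
                                 System.getProperty("os.arch"));
        // mapLibraryName returns the file name the JVM looks for on this
        // platform, which differs between Linux, macOS and Windows.
        String fileName = System.mapLibraryName(LIB_NAME);
        System.out.println("platform: " + tag + ", native file: " + fileName);
        try {
            // Searches java.library.path for the platform-specific file.
            System.loadLibrary(LIB_NAME);
            System.out.println("native library loaded");
        } catch (UnsatisfiedLinkError e) {
            // Expected on any machine without a matching native build.
            System.out.println("native library not available: " + e.getMessage());
        }
    }
}
```

A tag like the one platformTag returns is one way a build could select which 
prebuilt native binary to ship, if we go the route of providing binaries 
rather than compiling C as part of the Spark build.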

What do you think, Alex? Any other ideas for proceeding with this? In the 
meantime, I'm going to research how I can better integrate the dependencies 
here.

> HDF5 read/write support for Spark MLlib
> ---------------------------------------
>
>                 Key: SPARK-8449
>                 URL: https://issues.apache.org/jira/browse/SPARK-8449
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.4.0
>            Reporter: Alexander Ulanov
>             Fix For: 1.4.1
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Add support for reading and writing the HDF5 file format to/from 
> LabeledPoint. HDFS and the local file system have to be supported. Other 
> Spark formats to be discussed. 
> Interface proposal:
> /* path - directory path in any Hadoop-supported file system URI */
> MLUtils.saveAsHDF5(sc: SparkContext, path: String, data: RDD[LabeledPoint]): Unit
> /* path - file or directory path in any Hadoop-supported file system URI */
> MLUtils.loadHDF5(sc: SparkContext, path: String): RDD[LabeledPoint]


