[ https://issues.apache.org/jira/browse/SPARK-8485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594122#comment-14594122 ]
Joseph K. Bradley commented on SPARK-8485:
------------------------------------------

This is something which is going to come up in MLlib now that we have a better interface for feature transformers. I suspect a lot of people will look to Pipelines for existing transformers, including in major application areas like NLP, vision, and audio.

I think some of these are clearly useful (SIFT & HOG are the ones I hear most about). For others, it would be good to look to other libraries and see what is most common.

My feeling is that it would be nice to have a few such transformers in MLlib itself, but a full-fledged image processing library would belong in an external package for now.

My main concerns are:
* Interest/need: We should hold off on implementing these to see if the community has sufficient interest.
* Data type: If we add image processing, we need to support actual images, including depth (data type) and multiple channels (e.g. RGB). Creating a UDT for images will be a significant commitment, but it would be important for laying the groundwork for further image processing work.

Let's leave the JIRAs open for discussion to gather interest, use cases with Spark, and feedback. But people should discuss here before sending PRs.

> Feature transformers for image processing
> -----------------------------------------
>
>                 Key: SPARK-8485
>                 URL: https://issues.apache.org/jira/browse/SPARK-8485
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>            Reporter: Feynman Liang
>
> Many transformers exist to convert from image representations into more
> compact descriptors amenable to standard ML techniques. We should implement
> these transformers in Spark to support machine learning on richer content
> types.
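
To make the data type concern in the comment above more concrete, here is a minimal sketch of what an image record carrying depth and channel information might look like. It is plain Python rather than a Spark UDT, and every name in it (`ImageRecord`, `BYTES_PER_SAMPLE`, the `pixel` helper, the row-major interleaved layout) is a hypothetical choice made for illustration, not anything proposed in the JIRA.

```python
from dataclasses import dataclass

# Hypothetical mapping from a depth label to bytes per sample.
BYTES_PER_SAMPLE = {"uint8": 1, "uint16": 2, "float32": 4}


@dataclass(frozen=True)
class ImageRecord:
    """Illustrative image container; not an actual Spark type."""
    height: int
    width: int
    channels: int  # e.g. 1 for grayscale, 3 for RGB
    depth: str     # sample data type, must be a key of BYTES_PER_SAMPLE
    data: bytes    # assumed layout: row-major, channels interleaved

    def __post_init__(self):
        # Reject unknown depths and buffers whose size does not match the shape.
        if self.depth not in BYTES_PER_SAMPLE:
            raise ValueError(f"unsupported depth: {self.depth}")
        expected = (self.height * self.width * self.channels
                    * BYTES_PER_SAMPLE[self.depth])
        if len(self.data) != expected:
            raise ValueError(f"expected {expected} bytes, got {len(self.data)}")

    def pixel(self, row, col):
        """Return the raw bytes of all channels at (row, col)."""
        size = self.channels * BYTES_PER_SAMPLE[self.depth]
        start = (row * self.width + col) * size
        return self.data[start:start + size]


# A 2x2 RGB image with 8-bit samples: 2 * 2 * 3 = 12 bytes.
rgb = ImageRecord(height=2, width=2, channels=3, depth="uint8",
                  data=bytes(range(12)))
```

Even this toy version has to fix a sample type, a channel count, and a memory layout up front, which hints at why a real image UDT would be a significant commitment.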