[ https://issues.apache.org/jira/browse/SPARK-4036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148099#comment-15148099 ]
Qian Huang commented on SPARK-4036:
-----------------------------------

Hi, I have created a Spark package, http://spark-packages.org/package/hqzizania/crf-spark, which I am developing together with hujiayin. [~josephkb]

The package is essentially a Spark-based re-implementation of CRF++. It shares CRF++'s limitations: it uses the same feature generator design and is intended only for "segmenting/labeling sequential data". Even so, it largely meets the requirements of NLP tasks and can run in parallel on big data.

You are welcome to try it. If you encounter bugs, feel free to submit an issue or a pull request.

> Add Conditional Random Fields (CRF) algorithm to Spark MLlib
> ------------------------------------------------------------
>
>                 Key: SPARK-4036
>                 URL: https://issues.apache.org/jira/browse/SPARK-4036
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Guoqiang Li
>            Assignee: Kai Sasaki
>         Attachments: CRF_design.1.pdf, crf-spark.zip, dig-hair-eye-train.model, features.hair-eye, sample-input, sample-output
>
>
> Conditional random fields (CRFs) are a class of statistical modelling methods
> often applied in pattern recognition and machine learning, where they are
> used for structured prediction.
> The paper:
> http://www.seas.upenn.edu/~strctlrn/bib/PDF/crf.pdf