GitHub user feynmanliang opened a pull request:
https://github.com/apache/spark/pull/7783
[SPARK-8998][MLlib] Distribute PrefixSpan computation for large projected databases
Continuation of work by @zhangjiajin
Closes #7412
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/feynmanliang/spark SPARK-8998-improve-distributed
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/7783.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #7783
----
commit 91fd7e66d0c363e68bc9ebe2bf3e03c26ef348d2
Author: zhangjiajin <[email protected]>
Date: 2015-07-07T07:30:10Z
Add new algorithm PrefixSpan and test file.
commit 575995f69dadad825d97f2248599eb62c1743fe7
Author: zhangjiajin <[email protected]>
Date: 2015-07-08T09:07:37Z
Modified the code according to the review comments.
commit 951fd424ff189f9bf5619a84f3f19e942f592396
Author: zhang jiajin <[email protected]>
Date: 2015-07-08T10:22:16Z
Delete Prefixspan.scala
Use PrefixSpan.scala instead of Prefixspan.scala. Delete Prefixspan.scala
commit a2eb14c7fb6abb70eaa046baf78da205c7a4ca7d
Author: zhang jiajin <[email protected]>
Date: 2015-07-08T10:23:31Z
Delete PrefixspanSuite.scala
Use PrefixSpanSuite.scala instead of PrefixspanSuite.scala, Delete
PrefixspanSuite.scala.
commit 89bc368f76c40ad0090a928cec49cd9d28ce666e
Author: zhangjiajin <[email protected]>
Date: 2015-07-08T10:50:38Z
Fixed a Scala style error.
commit 1dd33ad82499b9ad1b446b96f2f88519ffbe9a1b
Author: zhangjiajin <[email protected]>
Date: 2015-07-09T14:40:29Z
Modified the code according to the review comments.
commit 4c60fb36148206abd67fe51cea667ee3d63e490e
Author: zhangjiajin <[email protected]>
Date: 2015-07-09T15:01:45Z
Fix some Scala style errors.
commit ba5df346543e9aee119bd781b257860b65bbe7df
Author: zhangjiajin <[email protected]>
Date: 2015-07-09T15:10:25Z
Fix a Scala style error.
commit 574e56ccfb271d0ed86c3eba95d1a11a8688495d
Author: zhangjiajin <[email protected]>
Date: 2015-07-10T11:49:06Z
Add new object LocalPrefixSpan, and do some optimization.
commit ca9c4c8fa84202d8d533c51c277138461ba096a7
Author: zhangjiajin <[email protected]>
Date: 2015-07-11T02:40:24Z
Modified the code according to the review comments.
commit 22b0ef463beb0e0fe9cc696989245da79722a3a6
Author: zhangjiajin <[email protected]>
Date: 2015-07-14T02:21:04Z
Add feature: Collect enough frequent prefixes before projection in
PrefixSpan.
commit 078d4101f56c68c6f191de57f9e542a80f2c89b5
Author: zhangjiajin <[email protected]>
Date: 2015-07-14T02:46:05Z
fix a scala style error.
commit 4dd1c8a2393b91dc1841c3b01dad7163371dd434
Author: zhangjiajin <[email protected]>
Date: 2015-07-15T02:57:41Z
initialize file before rebase.
commit a8fde870aae9f5fe31ac04a50da20ec906626826
Author: zhangjiajin <[email protected]>
Date: 2015-07-15T03:25:34Z
Merge branch 'master' of https://github.com/apache/spark
Initialize local master branch.
commit 6560c6916edeff900e54c6b5ee5b7c44cac87724
Author: zhangjiajin <[email protected]>
Date: 2015-07-15T03:44:42Z
Add feature: Collect enough frequent prefixes before projection in
PrefixSpan
commit baa2885681f19897cc5158f4fc9338543a55487a
Author: zhangjiajin <[email protected]>
Date: 2015-07-15T08:48:59Z
Modified the code according to the review comments.
commit 095aa3a390446205a4d22227b7ed1fbce46f2c93
Author: zhangjiajin <[email protected]>
Date: 2015-07-16T03:26:26Z
Modified the code according to the review comments.
commit b07e20c973775ee545249657416a821e90829392
Author: zhangjiajin <[email protected]>
Date: 2015-07-16T06:52:26Z
Merge branch 'master' of https://github.com/apache/spark into
CollectEnoughPrefixes
commit d2250b7871035c8096d377805ca9f9a9cf90fdd3
Author: zhangjiajin <[email protected]>
Date: 2015-07-18T10:03:37Z
remove minPatternsBeforeLocalProcessing, add
maxSuffixesBeforeLocalProcessing.
commit 64271b3f802ce1f5870209b73ff7dd0e73442076
Author: zhangjiajin <[email protected]>
Date: 2015-07-27T04:28:42Z
Modified code according to comments.
commit 6e149fa3bd88a2347e635f03ab9ae5913e03beee
Author: Feynman Liang <[email protected]>
Date: 2015-07-28T21:36:36Z
Fix splitPrefixSuffixPairs
commit 01c9ae9aa3f09aa4ec058db024f6a2cb482570bb
Author: Feynman Liang <[email protected]>
Date: 2015-07-28T21:39:42Z
Add getters
commit cb2a4fc71d4874f3bb3cf0d8b0331e5b41f7cf45
Author: Feynman Liang <[email protected]>
Date: 2015-07-28T21:50:31Z
Inline code for readability
commit da0091b3d4d9e9d7f058b645272e70e3256c1ac7
Author: Feynman Liang <[email protected]>
Date: 2015-07-28T22:21:06Z
Use lists for prefixes to reuse data
commit 1235cfcc9367b546bcf564972a33b769f62da520
Author: Feynman Liang <[email protected]>
Date: 2015-07-28T22:30:29Z
Use Iterable[Array[_]] over Array[Array[_]] for database
commit c2caa5cb19e5c9e54dda288c9c1e7befb21feb64
Author: Feynman Liang <[email protected]>
Date: 2015-07-28T22:54:30Z
Readability improvements and comments
commit 87fa021afaa184dfbf7eafcae0beb494697c40e2
Author: Feynman Liang <[email protected]>
Date: 2015-07-28T23:19:48Z
Improve extend prefix readability
commit ad23aa9f10aaa515b5d0804b4dbbef5d75005e0f
Author: zhang jiajin <[email protected]>
Date: 2015-07-29T02:30:57Z
Merge pull request #1 from feynmanliang/SPARK-8998-collectBeforeLocal
[Spark-8998]Collect Enough Prefixes Improvements
commit 4ddf479ad95714950bde1bfd8d0dbdfa91d955c0
Author: Feynman Liang <[email protected]>
Date: 2015-07-30T05:44:51Z
Parallelize freqItemCounts
----