[
https://issues.apache.org/jira/browse/MAHOUT-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14582678#comment-14582678
]
ASF GitHub Bot commented on MAHOUT-1660:
----------------------------------------
Github user dlyubimov commented on a diff in the pull request:
https://github.com/apache/mahout/pull/135#discussion_r32278958
--- Diff: math-scala/src/main/scala/org/apache/mahout/math/drm/package.scala ---
@@ -115,6 +121,46 @@ package object drm {
}
}
+ /**
+ * Convert arbitrarily-keyed matrix to int-keyed matrix. Some algebra will accept only int-numbered
+ * row matrices. So this method is to help.
+ *
+ * @param drmX input to be transcoded
+ * @param computeMap collect `old key -> int key` map to front-end?
+ * @tparam K key type
+ * @return Sequentially keyed matrix + (optionally) map from non-int key to [[Int]] key. If the
+ * key type is actually Int, then we just return the argument with None for the map,
+ * regardless of computeMap parameter.
+ */
+ def drm2IntKeyed[K: ClassTag](drmX: DrmLike[K], computeMap: Boolean = false): (DrmLike[Int], Option[DrmLike[K]]) =
+ drmX.context.engine.drm2IntKeyed(drmX, computeMap)
+
+ /**
+ * (Optional) Sampling operation. Consistent with Spark semantics of the same.
+ * @param drmX
+ * @param fraction
+ * @param replacement
+ * @tparam K
+ * @return samples
+ */
+ def drmSampleRows[K: ClassTag](drmX: DrmLike[K], fraction: Double, replacement: Boolean = false): DrmLike[K] =
--- End diff --
Any public API that is not a matrix method is a package-level function.
This follows the R convention, where most functions are not obviously
connected to any object. That is, we just write something like
import o.a.m.math.drm._
....
val sample = drmSampleRows(...)
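
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the same idea: a free function that only delegates to the engine attached to its argument, much as `drm2IntKeyed` does in the diff above. Everything in it (`ToyDrm`, `Engine`, `LocalEngine`, the `toydrm` object, the `seed` parameter) is a hypothetical stand-in and not Mahout's actual API; a plain `object` plays the role of the package object so the snippet compiles on its own. The sampling shown is simple Bernoulli sampling without replacement, which is only an assumption about what a backend might do.

```scala
import scala.util.Random

// Hypothetical stand-ins for illustration only; these are NOT Mahout's real types.
trait Engine {
  def sampleRows[K](m: ToyDrm[K], fraction: Double, seed: Long): ToyDrm[K]
}

// A toy "distributed row matrix": keyed rows plus the engine that owns them.
final case class ToyDrm[K](rows: Seq[(K, Vector[Double])], engine: Engine)

object LocalEngine extends Engine {
  // Bernoulli sampling without replacement: keep each row with probability `fraction`.
  def sampleRows[K](m: ToyDrm[K], fraction: Double, seed: Long): ToyDrm[K] = {
    val rnd = new Random(seed)
    m.copy(rows = m.rows.filter(_ => rnd.nextDouble() < fraction))
  }
}

// Plays the role of the package object: free functions that are not methods
// of the matrix and simply delegate to the matrix's engine.
object toydrm {
  def drmSampleRows[K](m: ToyDrm[K], fraction: Double, seed: Long = 42L): ToyDrm[K] =
    m.engine.sampleRows(m, fraction, seed)
}

object Demo {
  def main(args: Array[String]): Unit = {
    import toydrm._ // one wildcard import brings all the free functions into scope

    val drmX = ToyDrm(
      Seq(1 -> Vector(1.0, 2.0), 2 -> Vector(3.0, 4.0), 3 -> Vector(5.0, 6.0)),
      LocalEngine)

    val sample = drmSampleRows(drmX, fraction = 0.5)
    println(s"kept ${sample.rows.size} of ${drmX.rows.size} rows")
  }
}
```

The call site reads like an R function call, while the backend-specific work stays behind the engine.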
> Hadoop1HDFSUtil.readDRMHEader should be taking Hadoop conf
> ----------------------------------------------------------
>
> Key: MAHOUT-1660
> URL: https://issues.apache.org/jira/browse/MAHOUT-1660
> Project: Mahout
> Issue Type: Bug
> Components: spark
> Affects Versions: 0.10.0
> Reporter: Suneel Marthi
> Assignee: Dmitriy Lyubimov
> Priority: Minor
> Fix For: 0.10.2
>
>
> Hadoop1HDFSUtil.readDRMHEader should take the Hadoop configuration from
> the Context rather than ignoring it