[ 
https://issues.apache.org/jira/browse/FLINK-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15306916#comment-15306916
 ] 

ASF GitHub Bot commented on FLINK-3919:
---------------------------------------

Github user chiwanpark commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1996#discussion_r65102444
  
    --- Diff: flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/math/distributed/DistributedRowMatrix.scala ---
    @@ -0,0 +1,167 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.ml.math.distributed
    +
    +import java.lang
    +
    +import breeze.linalg.{CSCMatrix => BreezeSparseMatrix, Matrix => BreezeMatrix, Vector => BreezeVector}
    +import org.apache.flink.api.common.functions.RichGroupReduceFunction
    +import org.apache.flink.api.common.typeinfo.TypeInformation
    +import org.apache.flink.api.scala._
    +import org.apache.flink.ml.math.{Matrix => FlinkMatrix, _}
    +import org.apache.flink.util.Collector
    +import org.apache.flink.ml.math.Breeze._
    +import scala.collection.JavaConversions._
    +
    +/**
    +  *
    +  * @param numRowsOpt If None, will be calculated from the DataSet.
    +  * @param numColsOpt If None, will be calculated from the DataSet.
    +  */
    +class DistributedRowMatrix(data: DataSet[IndexedRow],
    +                           numRowsOpt: Option[Int] = None,
    +                           numColsOpt: Option[Int] = None)
    +    extends DistributedMatrix {
    +
    +  lazy val getNumRows: Int = numRowsOpt match {
    +    case Some(rows) => rows
    +    case None => data.count().toInt
    +  }
    +
    +  lazy val getNumCols: Int = numColsOpt match {
    +    case Some(cols) => cols
    +    case None => calcCols
    +  }
    --- End diff --
    
    Ah, okay. Let's use `Int` for indices for now.
    
    For counting, we can reuse the `DataSetUtils.countElementsPerPartition` method
    (https://github.com/apache/flink/blob/master/flink-scala/src/main/scala/org/apache/flink/api/scala/utils/package.scala#L56)
    and sum the per-partition counts to get the number of rows:
    
    ```scala
    import org.apache.flink.api.scala.utils._
    
    ...
    
    lazy val numRowsInDataSet = numRowsOpt match {
      case Some(value) => data.getExecutionEnvironment.fromElements(value)
      case None => data.countElementsPerPartition.map(_._2).reduce(_ + _).map(_.toInt)
    }
    
    ...
    ```


> Distributed Linear Algebra: row-based matrix
> --------------------------------------------
>
>                 Key: FLINK-3919
>                 URL: https://issues.apache.org/jira/browse/FLINK-3919
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Simone Robutti
>            Assignee: Simone Robutti
>
> Distributed matrix implementation as a DataSet of IndexedRow and related 
> operations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
